Building a Continuous Deployment Engine

A couple of weeks ago, I mentioned that we (the Red Hat Inception Team) are building “a thing.”  Given our own internal interest in the topic, you may be wondering why we chose a custom Release Engine over pre-baked tooling. There are many different reasons why we went in this direction; I’m going to cover four.

Why #1: There are many existing FOSS tools to automate portions of a release process, but there seems to be a tooling gap when it comes to easily tying them together.

For CI, there’s Jenkins/Hudson, buildbot, Travis, etc.; for configuration management, Chef or Puppet; for repository management, Pulp or Satellite; for provisioning, Foreman, Satellite, or Cobbler; for host scripting, Ansible as well as our own Taboot. In Red Hat IT, we’re using at least one of these tools in each category. But there seems to be a lack of easy-to-use tools to automate orchestration across these systems. Yes, Jenkins can tie them all together, but it’s not easy to set up and even harder to maintain. The Release Engine (RE) is our stab at solving this problem.

Why #2: RE must be light-weight and fit to purpose.

This doesn’t mean that our Release Engine efforts will be restricted to our use cases only; we are trying to make it generic enough to be reusable by others outside of Red Hat.

Additionally, we already have a few general purpose workflow engines at our fingertips, but we chose not to use them. To be successful, we believe that RE must be easy for our development and operations team members to extend and modify. Each workflow tool comes with its own difficulties in making it accessible to the entire department. For example, requiring a sysadmin to set up an Eclipse IDE just to add a new release step would render RE dead on arrival.

In the words of Anderson Silva, the manager of our NA & EMEA Platform Operations team, “the key to the success of this engine is not how many things it can do when it gets released, but how easily people can add functionality to it as demand grows.”
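As a rough illustration of what “easy to add functionality” could look like in Python, here is a minimal sketch of a step registry. Everything in it (the STEPS registry, the step decorator, the restart_service step, and the context keys) is hypothetical and made up for this example; it is not RE’s actual API.

```python
from typing import Callable, Dict, List

# Hypothetical registry of release steps, keyed by name.
# These names are illustrative only and are not RE's real API.
STEPS: Dict[str, Callable[[dict], str]] = {}


def step(name: str):
    """Register the decorated function as a named release step."""
    def decorator(func: Callable[[dict], str]) -> Callable[[dict], str]:
        STEPS[name] = func
        return func
    return decorator


@step("restart_service")
def restart_service(context: dict) -> str:
    # A real step would call out to a host-scripting tool;
    # this one only reports what it would do.
    return f"would restart {context['service']} on {context['host']}"


def run_release(step_names: List[str], context: dict) -> List[str]:
    """Run the named steps in order and collect their results."""
    return [STEPS[name](context) for name in step_names]
```

With a layout like this, adding a release step means writing one decorated function in a text editor, with no IDE or compile step involved.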

Why #3: RE must be written in a commonly known language. See Why #2.

We chose Python. It’s a popular interpreted language, easy to read and modify, and one in which both our development and operations folks frequently have experience. A system that requires Java, C, C++, Mono, etc. to extend would severely limit its accessibility to team members outside the development side of the house, as well as to those developers who don’t have time to spend writing traditional strongly typed/compiled code to get their functionality implemented.

Why #4: We need to support a model of decentralized CI, so RE will do the work necessary to maintain a reliable, repeatable and auditable release.  

We will go into the details behind this in the future, but we have already tried centralized CI. As we grew, it became apparent it wasn’t going to continue working for everyone in our department. Developers wanted more control over their CI, and having it centralized didn’t do much to support that desire. So, we are trying to unwind our centralized CI tool in favor of developer-supported CI environments. No matter what CI solution a team chooses, it should easily hook into RE.

For practical purposes we need consistency in releases, and since we won’t be driving that through a central CI system, the release engine will have that job instead. It was unclear that any one tool would do what we needed while also answering the following:

  1. Who is doing the release and are they allowed to modify that environment/code/thing? We are all about empowerment, but we’re only going to let people push their team’s code… and definitely not to all environments without fulfilling some prerequisites.
  2. When did the release occur and is it allowed to occur at that time? Sometimes, you need to coordinate code releases so they don’t completely blow up critical business processes. You may say “that’s not CD” – I say that’s the reality of being in an enterprise IT shop. We’ll deal with it if and when we need to.
  3. What and how are you actually deploying? Got a special little thing you do over here in this environment because of “reasons?” Not a thing anymore! We want deployments to be repeatable by everyone, including me… The Product Owner. 🙂
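The “who” and “when” questions above can be pictured as checks that run before any release starts. The following is a minimal sketch under stated assumptions: the ReleaseRequest fields, the policy tables, and the check_release function are all hypothetical names invented for illustration, not part of RE.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Dict, List, Set, Tuple


@dataclass
class ReleaseRequest:
    user: str
    team: str
    project: str
    environment: str
    requested_at: datetime


# Hypothetical policy data; a real system would load this from configuration.
TEAM_PROJECTS: Dict[str, Set[str]] = {"platform": {"billing-api"}}
TEAM_ENVIRONMENTS: Dict[str, Set[str]] = {"platform": {"dev", "qa"}}
BLACKOUT_WINDOWS: Dict[str, Tuple[time, time]] = {"prod": (time(9, 0), time(17, 0))}


def check_release(req: ReleaseRequest) -> List[str]:
    """Return the reasons a release is refused; an empty list means it may proceed."""
    problems: List[str] = []
    # Who: teams may only push their own code...
    if req.project not in TEAM_PROJECTS.get(req.team, set()):
        problems.append("team does not own this project")
    # ...and only to environments they have been cleared for.
    if req.environment not in TEAM_ENVIRONMENTS.get(req.team, set()):
        problems.append("team may not deploy to this environment")
    # When: refuse releases that fall inside a blackout window.
    window = BLACKOUT_WINDOWS.get(req.environment)
    if window and window[0] <= req.requested_at.time() < window[1]:
        problems.append("environment is in a blackout window")
    return problems
```

Returning every reason at once, rather than stopping at the first failure, also gives an audit log something concrete to record for each refused release.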

Interested in some of the tools we reviewed?
Here are four we looked at with some thoughts from the team. Kudos to the creators and maintainers of the code; they helped guide some of our design decisions along the way.

Deployinator
What we like:

  • Language/Framework is Ruby/Rails which many people know.
  • Reusable classes for releases are a good idea.

Why we didn’t choose it:

  • Not a lot of current code updates.
  • README was noted as out of date.
  • Seems to be designed more around checking out code and putting it somewhere, which wouldn’t work for us.
  • The authentication seems to be tied to an in-house system.

Dreadnot
What we like:

  • Language/Framework is Javascript/Node.js. Almost every engineer knows Javascript.
  • Feature enhancements added within the last few months showing it’s still active.
  • Runs asynchronously.

Why we didn’t choose it:

  • There seem to be regions called out in configs (https://github.com/racker/dreadnot/blob/master/example/local_settings.js#L13) – this may be tied to specific infrastructure.
  • Stacks look like they must be defined and stored on the server side. Another system would need to be in place to support developers’ modification of deployments.

Strider
What we like:

  • Language/Framework is Javascript/Node.js. Almost every engineer knows Javascript.
  • Under active development.
  • Strider likes to take common deployment scenarios and make them into reusable recipes.
  • Having reusable stuff is much better than everyone having custom stuff that ends up being 95% the same.

Why we didn’t choose it:

  • The actual server and runners are all on one machine.
  • It doesn’t seem to delegate out unless you install and configure extensions.
  • Seems to be CI with CD added to it.

ThoughtWorks – Go
What we like:

  • Good Flow layout and visualizations
  • Out of the box LDAP integration

Why we didn’t choose it:

  • Language is Java, JRuby. While JRuby would be easy for developers, the need to add Java + JRuby libraries seems like overkill.
  • Source code was unavailable. (The product was only recently open sourced and is now available.)
  • This appears to be a full-on replacement for Jenkins and not a system that would run alongside it.
  • Not pluggable as far as we could tell, though we never tried to set up a server as the code was unavailable at that time.
  • The LDAP setup is a bunch of XML and it is unclear whether there is an easy way to use the GUI to update users.


    1. In the interest of transparency and visibility: Rundeck was not one of the tools that we came across during our initial search. When it came up, we discussed it as a possibility… but it was well after we made the decision and were committed to the path to build.
      If we had come across it first, it would have made the decision much harder. Rundeck does look like it supports many of the high level use cases we were thinking about.

      However, it also failed one of our main criteria, which was using a programming language that would be accessible to both developers and system administrators. As I mentioned in my original post, having to use an IDE/Eclipse environment to contribute back to the code base would be a bit of a hard sell to many of our operations folks as well as some of our developers.

      1. Hi, thanks for the note about Rundeck. I understand Java/groovy is not everybody’s cup of tea (or island).

        I want to clarify that while the Rundeck server is written in Java, Rundeck steps and workflows are usually made up of shell scripts or other scripting languages. We intentionally designed Rundeck to run *other people’s automation scripts and tools*. It doesn’t require any Java experience of the deployment, ops or development people who use or operate it in order to define repeatable orchestration workflows.

        The built-in steps support shell commands or shell scripts (via any interpreter on the remote system). Also, steps in Rundeck workflows can be supplied via custom plugins, and the plugins themselves do not need to be written in Java. They can be defined as scripts and bundled into a simple .zip for installation in the Rundeck server, without restarting it.

        The workflow definitions (sequences of steps) in Rundeck also do not require an IDE, they can be simple YAML or XML files, or can be created directly in the GUI.

        Some of the problems you describe for which we don’t have a direct solution yet are on our roadmap, including steps requiring third-party approval (e.g. via a Jira ticket), and tracking of deployment artifacts.

        We do currently have: robust authorization via ACLs, authn/authz integration with other systems, auditable and repeatable workflow execution, external node/host data, integration with Jenkins, our own API, etc.

        A primary goal of Rundeck is to be tool-agnostic, and toolchain friendly. Our purpose is to connect the dots between dev, ops, and security, and so we try to make it easy to integrate between build/CI, deployment/orchestration, and auth/audit systems and tools.

        Rundeck is under continued, active development, and is Apache 2 licensed. We welcome doc fixes, patches, or pull requests!

        cheers,
        Greg Schueler
        Lead developer – Rundeck

        http://rundeck.org
        http://github.com/rundeck/rundeck
        http://twitter.com/rundeck

      2. In support of Greg’s reply, we use Rundeck with Jenkins and Chef-Solo to orchestrate our development and production systems and are very happy with it. It is very good at providing a multi-node UI/CLI to anything which means we can integrate the right tool for the right problem.

        Never once touched a line of Java. Use plenty of python/ruby/bash.

    1. Hey Mark –
      I believe you took the comment out of context. I’m pretty sure the engineer in question has set up a server before. 😉 However, we specifically did not set up Thoughtworks Go because the code was not open source at the time we were looking.

  1. Leroy is a freeware software deployment engine that can be integrated into any build system.
    More information and documentation about Leroy is available at http://www.leroydeploy.com.
    The purpose of the Leroy-Jenkins plugin is to help integrate Leroy’s deployment functionality
    and configuration management with Jenkins’s presentation and access control abilities. This
    allows one to create a web based application deployment dashboard that lets software, system
    and devops engineers work together to bring automation and consistency to deployments using
    a simple xml format stored in SCM.

    … more: continuous deployment tools
