This is a brief story about DevOps at Red Hat IT in the near future. It's fiction--all the individual and team names are fabricated--but it's grounded in very real and typical circumstances. The goal is to describe some of the business capabilities one can expect after making some moderate investments in DevOps, and then debrief on the investments that made this success story possible. It aims to be relatable to anyone working in corporate IT, technical or otherwise.
Note: some of the capabilities described here will be enabled by Red Hat IT's work in progress on Release Engine (github link). When we refer to "Release Engine" in the story, this is what we're talking about; it's the one name that's not fictional.
~~~
“The Deal Closer”
Background: It’s late summer 2014. Elin is Product Owner (PO) on a team in Red Hat IT that develops line-of-business apps for Red Hat’s Global Awesomeness department. Elin’s team has been gradually implementing CI and CD practices for about 3 months, biting off one small chunk each sprint. They’ve leveraged the best practices and tooling published by the Inception team, which are now officially supported by Production Operations.
Today, Elin’s team has over 60% unit test coverage on their web apps (Java on JBoss), a handful of integration tests, a working CI server (Jenkins) that runs their builds and test automation suites on every git check-in, and they use the Release Engine to orchestrate deployment of their apps. For simpler releases that don’t require any new infrastructure or database changes, her team’s releases are fully automated.
Friday, 9:07am - Elin receives an urgent email from Greg, Director of Global Awesomeness Operations. A huge partnership opportunity is on the table, but the partner needs to understand more about how Red Hat’s customers are using its products today. This would require Elin’s team to start passing a new piece of information to their marketing analytics platform every time a user activates a subscription.
9:14am - The business need, urgency, and requirements are crystal clear. So, as the PO, Elin decides to disrupt her team’s sprint commitments and jump on this opportunity immediately. She drafts a clear, concise user story that one of her senior developers, Laura, picks up right away.
9:58am - Krish, her team’s senior QE, checks in a unit test for the new feature. Then, Laura writes the code to pass the test and pushes her git commit. This is a simple new feature and only required a few lines of code to test and write.
10:13am - Laura is pinged in IRC by the Jenkins system--the 10 lines of code she checked in 15 minutes ago broke an integration test with the CRM! Laura quickly realizes her simple mistake and pushes the one-line fix.
10:32am - Elin sees the build is once again “green” on their team’s information radiator, following Laura’s quick mistake and subsequent fix. She checks that the feature is working as expected in her team’s integrated Dev environment, to which the Release Engine has already automatically deployed. Looks good! So, she gives the Release Engine the command to push to QA. The Release Engine opens a change record (CHG) in the ITIL change management system with all the details, and the Change Manager and Production Operations team are notified of a change going into a managed environment.
10:51am - Elin gets “all green” from her QA push, so she tells the Release Engine to push to Stage.
11:16am - Elin gets an “all green” response from the Stage push. But, her team is still relatively new to CD, and their automation tools haven’t yet stood the test of time. So, she asks Krish, her QE lead, to poke around and make sure things look good before pushing to Prod.
11:29am - Krish tells Elin everything looks fine in Stage. So, Elin commands the Release Engine to push to Prod. The Release Engine fires off IRC and email notifications to the teams and individuals subscribed to the project. The Prod push goes ahead: the Release Engine rolls nodes, runs tests, and finally closes the CHG record as “Successful.”
11:50am - Elin and Krish spot-check the new feature in Prod. Looks great! Laura had dropped in a line of code to instrument the response time of the new feature in Graphite. She checks the graphs and it’s running as expected in Production.
11:57am - Elin calls Greg in Global Awesomeness to tell him the subscription activation information is now flowing into the marketing analytics platform. Greg can’t believe this is already working--new features are never this quick to implement!
12:03pm - Greg sends Elin’s team and her manager a thank-you note for their lightning-fast turnaround. This new feature will arm the Global Awesomeness team with extremely valuable data to help them close a critical partnership deal next week.
By investing a few hours a week for the past three months in their QA, CI, and CD practices, Elin’s team was able to wow their business partner, and deliver at a speed that helped close a major deal. Without these practices in place, Elin’s team would have taken days or weeks to get this 10-line feature in with their next big release; who knows how the partnership deal would have gone?
~~~
Debrief
What elements were in place that allowed this quick, smooth turnaround and, ultimately, helped the partnership deal close?
- Automated testing. First, Laura was notified within minutes when her code check-in broke the CRM integration. Second, the team's decent unit test coverage gave Krish and the team confidence in the quality of the release, even without a manual testing cycle.
- Automated deployments. Automating the release process did two things: 1) it reduced the time required to execute the release, and, more importantly, 2) it increased confidence in the release process by driving out human error.
- Trust. The rigor and automation employed in testing and deployment support higher trust. The Production Support team, who until recently did all these deployments by hand, now trusts the dev team to push their own code to Prod. So does the Change Manager, who, in ITIL parlance, has made this team's releases a Standard Change, bypassing Change Board approvals.
- Feedback loops. All throughout this story, there were tight feedback loops. First, when Laura broke the build, Jenkins pinged her immediately. Next, when the QA and Stage pushes passed deployment tests, Elin was notified. Finally, as soon as the release hit Prod, Laura checked Graphite for the monitoring line she dropped in to verify response times were as expected. These feedback loops raised everyone's confidence throughout the process.
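To make the last feedback loop concrete, here is a minimal sketch of what a "monitoring line" like Laura's might look like. It is an illustration only: the story's apps are Java on JBoss, while this sketch uses Python for brevity, and the host name, metric path, and helper names (`format_metric`, `send_metric`, `timed`) are all hypothetical. The only thing taken from Graphite itself is its plaintext protocol, which accepts lines of the form `<metric path> <value> <timestamp>`, by default on TCP port 2003.

```python
import socket
import time
from contextlib import contextmanager

# Hypothetical values for illustration; none of these appear in the story.
GRAPHITE_HOST = "graphite.example.com"
GRAPHITE_PORT = 2003  # default port of Graphite's plaintext listener


def format_metric(path, value, timestamp):
    """Build one plaintext-protocol line: '<path> <value> <timestamp>\\n'."""
    return f"{path} {value:.3f} {int(timestamp)}\n"


def send_metric(line, host=GRAPHITE_HOST, port=GRAPHITE_PORT, timeout=1.0):
    """Send one metric line over TCP, dropping it silently on network errors
    so that a monitoring outage never breaks the feature being measured."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(line.encode("ascii"))
    except OSError:
        pass  # losing one data point is acceptable; failing the request is not


@contextmanager
def timed(path, sender=send_metric):
    """Time the wrapped block and report the elapsed milliseconds under `path`."""
    start = time.monotonic()
    try:
        yield
    finally:
        elapsed_ms = (time.monotonic() - start) * 1000.0
        sender(format_metric(path, elapsed_ms, time.time()))
```

With something like this in place, wrapping the new call in `with timed("apps.subscriptions.activation_ms"):` (again, a made-up metric path) would be enough for a response-time graph to appear shortly after the release reaches Prod. Passing the sender in as a parameter is a design choice that keeps the timing logic testable without a live Graphite server.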
Is this DevOps nirvana? No. Automated test coverage could be more comprehensive, more complex releases could be automated, and these practices could be applied across more teams in the organization. And, the story said nothing to demonstrate a culture of experimentation and learning, or how things are handled when they go sideways (learning vs. blaming). But, this fictional team is well on their way to being better able to achieve their goals with their partners on the business side of things.
Today, DevOps is a journey to an ideal. It is not clearly defined as a set of practices, procedures, and tools. It may get there, it may not--we're too early in the hype cycle to tell. Either way, embarking on the journey towards DevOps will at least get IT teams talking to one another, and that has to leave your IT org in a better place.
What are your next steps? If you aspire to get your IT teams to a place where this story reflects your reality, here are a few things you can do:
- Invest in QA automation & Test-Driven Development (TDD) practices. None of this works without a basic foundation of automated testing.
- Find a local meetup on QA automation. Here in Raleigh, NC, USA, we have TISQA.
- If you have a CI server like Jenkins already, make sure you’re using it, and your builds work. Once they do, hold your teammates accountable for breaking the build. Nobody leaves a broken build!
- Work with your ops teams to increase the automation and repeatability of the software release process. They will appreciate you for this!
- Read the book Continuous Delivery by Humble and Farley, which does establish clear patterns for CI and CD--practices that are hugely supportive of DevOps. Read the intro, then skip ahead to Chapter 15 and consider the maturity model set forth there.
- Follow the Release Engine re-* projects on GitHub (very much a work in progress at the moment).
- If you've already achieved the high-confidence flow of working software described here, good on you--share your story! We know you're out there, and the IT industry needs to hear how you did it.