Gene Kim and Red Hat IT Part 4: DevOps Successes and Failures

Part four of a four-part series on DevOps with Gene Kim and Red Hat IT.

Part 1: Getting DevOps Off the Ground
Part 2: The Importance of Partnership in DevOps
Part 3: A DevOps Implementation Strategy
Part 4: DevOps Successes and Failures

Panelists:

Bill Montgomery: Manager, Red Hat IT
Steve Milner: Engineer, Red Hat IT
Jen Krieger: Product Owner, Red Hat IT
Tim Bielawa: Engineer, Red Hat IT
Chris Murphy: Engineer, Red Hat IT

Introduction: Gene Kim, award-winning CTO and co-author of “The Phoenix Project,” recently sat down with Red Hat IT’s Inception team to discuss their DevOps mission. Here are the highlights from the conversation.

On early successes and failures for Red Hat IT’s DevOps Enablement team:

Gene: What are you framing as your first couple wins?

Jen: While we were gathering information on what our long-term focus should be, the team also wanted to work on something that would give us a few early wins. We wanted to build confidence and put something out there that people could use as soon as possible.

Tim: I'm the former release engineering guy here. In my old job, every single day, somebody would ask, "Hey man, you got a minute? Can you tell me how much memory is running on this machine?" Hold on, let me look. "Can you tell me what the secondary IP is on this host?" Give me a minute. "Can you tell me what the value of this Puppet fact is?" Et cetera, and so on, and so on. It was a huge time sink for developers and ops folks.

So, we wrote a modular tool, jsonstats. It's pluggable: you throw in whatever you want. As long as your plugin can return a hash data structure, there's a REST endpoint that's listening and returns that data. We deployed it across our environments. Steve put together a web front-end, talook. You just type in the hostname, and boom. There's the information you need. It saves people tons of time now.
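To make the plugin idea concrete, here is a minimal sketch in Python of a service shaped the way Tim describes: plugins that each return a hash (a dict), exposed over an HTTP endpoint as JSON. It is an illustration only, not the actual jsonstats code. The registry, the plugin names, the sample values, the URL layout, and port 8008 are all assumptions made for the example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical registry: plugin name -> callable returning a dict.
PLUGINS = {}


def plugin(name):
    """Decorator that registers a function as a named plugin."""
    def register(func):
        PLUGINS[name] = func
        return func
    return register


@plugin("memory")
def memory_facts():
    # Placeholder values; a real plugin would query the host.
    return {"total_mb": 16384, "free_mb": 4096}


@plugin("network")
def network_facts():
    # Placeholder addresses from the documentation range (RFC 5737).
    return {"primary_ip": "192.0.2.10", "secondary_ip": "192.0.2.11"}


def render(path):
    """Map a request path such as '/memory' to (status, JSON body)."""
    name = path.strip("/")
    if name not in PLUGINS:
        body = {"error": "unknown plugin", "available": sorted(PLUGINS)}
        return 404, json.dumps(body)
    return 200, json.dumps(PLUGINS[name](), sort_keys=True)


class StatsHandler(BaseHTTPRequestHandler):
    """Serves each registered plugin's dict as JSON on GET /<plugin name>."""

    def do_GET(self):
        status, body = render(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(host="127.0.0.1", port=8008):
    """Start the endpoint; host and port are arbitrary example values."""
    HTTPServer((host, port), StatsHandler).serve_forever()
```

Calling serve() and then requesting /memory would return the memory plugin's dict as JSON. Adding a new kind of fact means writing one more decorated function, which is the property that lets a front-end like talook stay generic: it only needs a hostname and a plugin name.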

Gene: Can you tell me a little bit more about the value you've created? Some more concrete examples of ways you've saved people time?

Tim: With jsonstats and talook? Reduced context switching. In any given day, you’re saving a few hours, at least, across the release engineering and sysadmin teams by not having to constantly refocus. The time it takes to look up a fact is trivial. But enabling people to find the data themselves through a web interface, yes, that’s saving hours per week.

Gene: I'm guessing that this has also led to increased consistency of environments as well, just the fact that you can actually answer those questions.

Chris: It's shining the light on where and how the environments are not consistent because we suddenly have hundreds more people able to access the information. By giving the developers this access, they can now very quickly and easily find the disparities between environments and collaborate with our system administrators to bring the environments in line with each other. So while it isn't an overnight fix, it's starting that movement.

Gene: Awesome. What has been people’s reaction when suddenly they are getting answers to questions that they've always been asking?

Steve: It's been positive. We even had somebody in a dev team basically ask us to hurry up and put out some enhancements to jsonstats and talook. That was cool, to have somebody on our tail going, "Hey, guys. Just take the time to do this."

Bill: The biggest risk for this team is irrelevancy and lack of demand. We don't have any authority to push changes top-down. All we can do is use a carrot approach for people to buy what we're selling. And today, we've got people knocking on the door, asking for more. That's a great indication.

Gene: What has your biggest challenge or setback been to date?

Jen: People in general? [laughs] Seriously though, rapid prototyping and building things that might get tossed out in a matter of weeks/months can be uncomfortable for everyone involved. For example, we quickly built jsonstats & talook to expose production server configurations to developers with the knowledge that our systems engineering team was going to implement MCollective in the not-too-distant future. That did create conflict inside and outside our team. The idea of prioritizing “done” over “perfect” is a challenging new idea for us as a team and a department.

We are also a new agile team, which comes with all of the challenges that agile teams have while forming. It took us a while to become comfortable speaking up about technical concerns; it’s a constant balance between needing to speak about technical refactoring and how many meetings people can tolerate. I believe we have found the right balance, and I also believe the team has figured out that clearing up defects, paying down technical debt, and focusing on refactoring are as important to me as a Product Owner as feature development.

Gene: What’s your next big win and what's the time frame for that?

Jen: That would be Release Engine. We are targeting automated code deployment for our SOA & ESB team to our QA environment next week. That date is specifically to assist with two significant projects running in IT right now. We expect Release Engine to start decreasing change lead time and identifying our environmental inconsistencies that bite us over and over again.

Bill: After July, we expect to expand adoption of Release Engine to more teams over the following 2-3 quarters. Once we get it right, other dev teams will be chomping at the bit, and we’ll actually be relieving pressure on our operations folks. That’s when we’ll start to see a department-wide impact and broader use of CI/CD practices.

Gene: Thank you so much, guys. I will catch up with you soon.

This concludes this four-part interview series with Gene Kim and the Red Hat Inception team. Stay tuned to Red Hat Developer Blog for more updates on DevOps in Red Hat IT.

Gene Kim is hosting the DevOps Enterprise Summit on October 21-23, where more stories will be told about DevOps transformations in large, complex organizations. Learn more about the summit and submit your own talk here!

Last updated: June 20, 2023
