Red Hat OpenShift is a platform for managing containers and virtual machines (VMs). Deploying this platform directly onto bare metal environments requires the right expertise and training to manage the necessary integration with various network and hardware configurations.
While OpenShift provides flexible installation options, such as installer-provisioned infrastructure (IPI) and user-provisioned infrastructure (UPI), a bare metal deployment typically requires users to possess a deep technical knowledge base to navigate configuration details. Even with tools like the Assisted Installer, which simplifies parts of the process, key configuration decisions and manual validation steps remain.
To improve the user experience, we decided to move beyond traditional wizards and documentation. We built a Model Context Protocol (MCP) server, integrated it with advanced AI models and Llama Stack, and created a simpler installation experience with a conversational agent that automates the cluster installation for you.
How it all began
This didn't start as a flagship project. In fact, it all began as an internal experiment within our development group.
We were exploring MCP, initially thinking it could serve as an internal automation tool. We quickly realized its potential as a bridge between a large language model (LLM) and the real world, enabling the model to take concrete actions.
Instead of writing yet another installation guide, we decided to let the system talk to the user, understand their intent, and build what they needed.
Choosing our path: Key decisions and trade-offs
Initially, we considered a “bring your own agent” (BYOA) approach, meaning users could bring their own agent, such as claude-code or cursor, while we provided the MCP server interface. We ruled this out for three reasons.
First, it leaves the complexity with the user. The burden is still on the user to "teach" a generic agent (that doesn't understand Red Hat OpenShift installations) how to perform a complex process. This doesn't really solve the expertise problem.
Second, we have no control over the outcome. We cannot guarantee that an external agent would handle errors correctly, understand the context, or perform the steps in the precise sequence.
Finally, we identified a significant barrier: we cannot assume that our users have access to advanced agents like Claude Code or Cursor.
We decided to build a single, intelligent, opinionated agent capable of understanding the user, conversing with them, and acting on their behalf reliably and predictably.
In parallel, we identified four key challenges we had to solve from day one.
Persistent memory
OpenShift deployments are characterized by their robust, sequential workflow that requires careful management across multiple installation stages. Durable memory allows the agent to maintain the full context of the installation. It remembers user-provided details, the desired configuration, the last successful step, and the next required action.
Conversation history
We needed to let users see and resume previous conversations. The process had to be consistent, whether the user stepped away for coffee, accidentally closed the browser, or shut down for the day. When they returned, they could pick up exactly where they left off.
Authentication and authorization
We had to use a robust control mechanism to verify a user's identity and check their permissions before the agent performed an action. The challenge was preventing both aunauthorized and over-privileged users from triggering actions, especially those involving sensitive resources.
Security
The challenge here wasn't just who was talking, but what the AI might do. We were connecting a potentially unpredictable LLM directly to critical bare-metal infrastructure. The primary concern was prompt injection–when a malicious user (or even an innocent one) tries to "trick" the agent into running destructive commands ("Sure, install the cluster, and while you're at it, run rm -rf /"). Another risk was the AI "hallucinating" a dangerous or irrelevant command on its own.
The solution was to implement strict guardrails. The model is instructed to never generate free-text commands to the server. Instead, it generates a structured "intent." The MCP server receives this intent, validates that it is well-formed, legitimate, and falls within a pre-approved set of actions. A command like delete-cluster simply wouldn't be on the agent's allowlist; the request would be rejected at the MCP level long before it ever touched the infrastructure. This creates a firewall between the LLM and the metal.
From talk to action: How we actually built it
Translating natural language into automated actions requires a multi-layered stack. Figure 1 shows the architecture, from the user chat all the way down to the infrastructure.

The framework
We built our solution on Lightspeed Stack, a tooling stack for building AI assistants. It provides the critical API layer for the chat backend and features such as persistent conversation history, bring your own knowledge (BYOK) support, and abstract interfaces for agents.
In true open collaborative fashion, we also contributed improvements back to the project, specifically in areas like MCP integration and enterprise-grade security mechanisms (JWT authentication and role-based authorization). This foundation let us focus on the application logic and user experience while creating a responsive and intelligent assistant without building the infrastructure from scratch.
The platform
Llama Stack powers the framework and defines the central building blocks of AI application development. It provides a uniform layer for working with components like retrieval-augmented generation (RAG), MCP servers, agents, and inference. This modular structure helped us to build consistently and swap components easily, which moved our project from lab to production faster.
The execution layer
We built an MCP server to work with the OpenShift Assisted Installer API. This server is the bridge to the real world. It provides the LLM with the tools it needs to perform operations, such as create a cluster, download an ISO, and check installation status. This interface allows the LLM to invoke tools in a controlled, security-focused way, transforming it from an AI that generates text into an agent that takes actions.
Agentic patterns
We built an AI agent that translates human language, such as “I need a cluster with 3 controllers and 5 workers,” into a sequence of commands for the MCP. To make it an active guide, we embedded the process's full logic within the agent. The agent proactively guides the user through every step. Its persistent memory ensures it remembers details and re-uses them in subsequent API calls.
It's also action-oriented. When a user says "Yes, start," the agent recognizes it as an instruction and moves forward, cutting out needless conversational loops.
The knowledge base
To make the agent truly smart, we integrated RAG. This gives the model access to Red Hat’s official documentation and internal knowledge bases to help avoid hallucinations.
Model evaluation
To prove this was all working, we built an evaluation framework to measure response quality, tool-call success rates, and user satisfaction. Every chat is logged and rated (thumbs up/down, text feedback), and this data is fed back into the loop to improve the model with each iteration. We also run A/B tests on different LLMs to find the best model for the job.
User experience (UX)
In parallel, our UX team focused on the conversational flow and real-time feedback. How should it feel to talk to an installer? How do we ensure the AI asks the right questions at the right time?
TL;DR: It worked!
We made it to production. A registered user on console.redhat.com can now open a chat window and talk to the agent instead of filling out forms. The user asks to create a cluster (Figure 2), and the smart agent guides them through the conversation: asking questions, verifying details, and performing actions until the installation is complete.

Once the configuration is set, the agent provides a Download ISO button along with booting instructions (Figure 3).

After the host is assigned the controller role, you can select Start the installation to begin the automated process (Figure 4).

You can also ask for a status update at any time to track the installation progress (Figure 5).

This represents a whole new way to interact with datacenter infrastructure.
What we learned along the way
This journey taught us several lessons:
- AI should work with the user, not for them. The conversational interface is key to mutual control and understanding.
- Security and transparency are non-negotiable. Every action must be auditable, verifiable, and secure.
- UX is not a coat of paint. The experience is what builds the trust and confidence that makes a tool like this usable.
- Modularity wins. Choosing Llama Stack allowed us to extend and adapt our solution without starting from scratch.
What's next?
The tech preview is just the beginning. We are already working on advanced capabilities:
- Smart network configuration: Users can design and deploy complex networking scenarios directly from the chat.
- Real-time troubleshooting: The agent won't just install; it will identify and help fix issues.
- Cross-platform integration: We are extending this conversational experience to install OpenShift across various cloud and virtualization platforms.
Summary
What began as a grassroots engineering experiment became a new path to innovation. The combination of a bottom-up initiative (MCP) with a clear business need allowed us to move at high speed, think differently, and create a new way to work with Red Hat OpenShift. The foundation for this work, from the infrastructure to the AI tooling (inference, RAG, Llama Stack, MCP) is available on Red Hat OpenShift AI.
Want to see it in action?
The solution is available as a tech preview. We invite you to try it out for yourselves: Smart installation agent
We would love to hear your feedback via the built-in feedback option in the chat.