AI agents and assistants share operational needs that typical web services do not have. LangGraph agents, CrewAI agent crews, custom assistants, and OpenClaw all hold API keys, maintain session state, call tools, execute code, and make decisions on behalf of users. They communicate with large language models (LLMs) that incur per-token costs. They might run safety checks against every message. They need identity, not just authentication.
Red Hat AI addresses these problems at the platform level. It handles model serving, safety guardrails, inference routing, agent identity, and supply chain security before you write your first agent config.
We deployed OpenClaw to put this to the test. OpenClaw is an open source personal AI assistant that runs on your infrastructure, connects to model providers, integrates with messaging platforms, and provides a web interface to interact with your agent. We chose it because it showcases how to get the most out of the Red Hat AI stack for reliable agent deployment: model inference, safety guardrails, agent identity, and persistent state. The patterns here apply to any agent workload you bring to the platform.
This article explains what the Red Hat AI platform provides and how we put it to work.
Model connectivity: Three paths to inference
Agents need LLM inference. You can call a hosted API, but that means sending every prompt off-cluster, paying per token, and trusting a third party with your data. For regulated environments or cost-sensitive workloads, you want options.
Red Hat AI gives you three: vLLM, Llama Stack, and Models-as-a-Service (MaaS).
vLLM
Now generally available as part of Red Hat AI, vLLM is the direct way to serve models on your cluster: deploy a model with KServe and point your agent at the /v1/chat/completions endpoint. KServe handles GPU scheduling and scaling. This path gives you full control over a single, self-hosted model.
We deployed Llama 3.2 3B Instruct on an A10G GPU this way, and OpenClaw was talking to it within minutes.
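Connecting an agent to that endpoint is plain OpenAI-style HTTP. The sketch below builds a /v1/chat/completions request body; the service URL and model name are placeholders for whatever your own InferenceService exposes, not values from our deployment.

```python
import json

# Hypothetical in-cluster KServe URL -- substitute your own InferenceService
# address (for example, from `oc get inferenceservice -n <namespace>`).
BASE_URL = "http://llama-32-3b-instruct.models.svc.cluster.local/v1"

def build_chat_request(prompt: str, model: str = "llama-3-2-3b-instruct") -> dict:
    """Build an OpenAI-compatible /v1/chat/completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }

# POST the JSON body to f"{BASE_URL}/chat/completions" with any HTTP client;
# the network call itself is omitted here because it needs a live cluster.
body = json.dumps(build_chat_request("Summarize today's alerts."))
```

Any OpenAI-compatible client library works the same way; only the base URL changes.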
Llama Stack
Available in technology preview as part of Red Hat AI, Llama Stack provides a unified API that simplifies inference routing and multiturn conversations. It provides chat completions across multiple backends (vLLM, OpenAI, Anthropic), retrieval-augmented generation (RAG) APIs (file_search, vector stores), and an implementation of the stateful OpenAI Responses API for multiturn agent conversations.
Deploy it using the Red Hat OpenShift AI operator to get inference, retrieval, and state management through a single endpoint. You can swap between self-hosted and remote inference by changing the model parameter:
```yaml
# Self-hosted (vLLM via Llama Stack)
model: vllm-local/llama3-2-8b

# Remote (OpenAI via Llama Stack)
model: openai-hosted/gpt-4o-mini
```

Because the endpoint and API remain the same, the agent does not know which backend handles the request.
Models-as-a-Service (MaaS)
Available in technology preview in Red Hat AI, Models-as-a-Service (MaaS) is a managed model-serving platform. It includes built-in API key management, rate limiting, and policy enforcement through Gateway API and Kuadrant. Models are served through KServe with an API gateway in front, providing OpenAI-compatible endpoints with production controls (such as authentication, quotas, traffic routing) out of the box.
All three options expose standard OpenAI-compatible APIs. Your agent connects in the same way regardless of the path you choose.
Agent identity and zero trust
Agents call other services, such as LLMs, tools, databases, and other agents. Most of these calls use long-lived API keys with broad permissions. There is no standard way to declare that a Deployment is an agent, scope its access, or verify its identity when it calls a downstream service.
Kagenti addresses this with two layers: an operator for agent lifecycle visibility and AuthBridge for zero trust service-to-service authentication. Kagenti is planned as part of Red Hat AI in the second half of 2026, with a preview coming soon.
AgentRuntime
The AgentRuntime custom resource definition (CRD) binds an agent's operational configuration to its workload. You declare "this Deployment is an agent," and the controller takes over. The controller resolves the target workload (such as a Deployment or StatefulSet) and computes a config hash from a three-layer merge of cluster defaults, namespace defaults, and CR overrides.
It applies that hash to the pod template to trigger rolling updates when the configuration changes. The controller also tracks the runtime phase (such as Pending, Active, or Error) using structured conditions. It also watches for changes to the target workload and to cluster/namespace ConfigMaps, so config changes at any level automatically reconcile the agent's pods.
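The merge-then-hash idea can be sketched in a few lines. This is an illustration of the mechanism, not the controller's actual code: later layers win key by key, and the hash of the merged result is what gets stamped onto the pod template.

```python
import hashlib
import json

def merge_layers(cluster: dict, namespace: dict, cr: dict) -> dict:
    """Three-layer merge: CR overrides beat namespace defaults beat cluster defaults."""
    merged = dict(cluster)
    merged.update(namespace)
    merged.update(cr)
    return merged

def config_hash(config: dict) -> str:
    """Stable hash of the merged config; a change at any layer changes the hash."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

# Illustrative keys only -- not the controller's real config schema.
cluster_defaults = {"trace.sampling.rate": 0.1, "trace.protocol": "http"}
namespace_defaults = {"trace.sampling.rate": 0.5}
cr_overrides = {"trace.sampling.rate": 1.0}

merged = merge_layers(cluster_defaults, namespace_defaults, cr_overrides)
# Applying config_hash(merged) to the pod template is what triggers a
# rolling update whenever any layer changes.
```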
The CR includes per-agent overrides for tracing (OpenTelemetry endpoint, protocol, and sampling rate) and identity (SPIFFE trust domain):
```yaml
apiVersion: agent.kagenti.dev/v1alpha1
kind: AgentRuntime
metadata:
  name: openclaw
  namespace: openclaw
spec:
  type: agent
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: openclaw
  trace:
    endpoint: "mlflow-service.mlflow.svc.cluster.local:4318"
    protocol: http
    sampling:
      rate: 1.0
  identity:
    spiffe:
      trustDomain: "example.com"
```

AgentCard
The AgentCard provides the agent's metadata, including its capabilities, endpoints, and protocols. Together, AgentRuntime and AgentCard give the platform visibility into what agents are running, where, what they can do, and how they are configured.
AuthBridge
AuthBridge is the identity layer. AuthBridge provides transparent token management for agent workloads through sidecars injected by the Kagenti webhook:
- client-registration: Automatically registers the agent as a Keycloak client using its Secure Production Identity Framework for Everyone (SPIFFE) ID. No manual client configuration or static credentials.
- AuthProxy: An Envoy-based proxy and external processor that intercepts inbound and outbound traffic. It validates JSON Web Tokens (JWTs) for inbound traffic and exchanges the caller's token for one scoped to the target service for outbound traffic.
- SPIFFE Helper: Provides the agent's workload identity (SVID) from SPIRE.
When Agent A calls Agent B, the token is automatically exchanged for one scoped to Agent B's audience. The application code does not change. The sidecar handles validation, exchange, and credential rotation transparently. Static API keys are replaced with short-lived, audience-scoped JWTs. Each agent gets its own identity and can only call services it has been authorized to reach.
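This kind of exchange is standard OAuth 2.0 token exchange (RFC 8693), which Keycloak supports. The sketch below builds the form an AuthProxy-style component would POST to the token endpoint; the client ID, audience, and token values are placeholders, not Kagenti's actual configuration.

```python
# Illustrative RFC 8693 token-exchange request, as a proxy like AuthProxy
# might issue against Keycloak's token endpoint. All names are placeholders.

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"

def build_token_exchange(subject_token: str, audience: str, client_id: str) -> dict:
    """Form fields for exchanging the caller's JWT for one scoped to `audience`."""
    return {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,
        "client_id": client_id,
    }

form = build_token_exchange(
    subject_token="<caller-jwt>", audience="agent-b", client_id="agent-a"
)
# POST `form` (urlencoded) to the realm's token endpoint, e.g.
#   https://<keycloak>/realms/<realm>/protocol/openid-connect/token
```

The returned access token carries Agent B's audience, which is what lets Agent B's sidecar validate the call without any shared static key.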
The AgentRuntime controller is being developed further to integrate natively with AuthBridge (currently a Kagenti extension) and other secure identity and sandboxing solutions, so that declaring an agent workload automatically provisions its identity, scopes its access, and enforces its security boundaries. More to come on this in future posts and platform releases.
Platform security: What OpenShift enforces by default
On vanilla Kubernetes, containers can run as root, hold ambient credentials, and accept unauthenticated traffic. OpenShift prevents all three by default:
- Security Context Constraints (SCCs): Every container runs as a random non-root UID with all capabilities dropped. You do not need a custom SCC for agent workloads.
- Built-in OAuth: An `oauth-proxy` sidecar authenticates users against the OpenShift OAuth server without requiring an external identity provider. If you can run `oc login`, you can access your agent.
- Automatic TLS: Routes terminate TLS by using the cluster wildcard certificate. WebSocket upgrades work natively.
Deploy OpenClaw
To deploy OpenClaw, you must first ensure your environment meets the following requirements.
Prerequisites
- An OpenShift cluster where you can create a namespace; cluster-admin privileges are not required.
- The `oc` CLI, authenticated (`oc login`).
- An API key or endpoint URL for a model provider.
A note on storage: OpenClaw uses SQLite for its agent memory index, which requires POSIX file locking via fcntl(). Block storage classes (such as gp3-csi on AWS, managed-csi on Azure, or thin-csi on vSphere) work correctly. Avoid NFS-backed storage classes.
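You can check whether a mounted volume supports the locking SQLite needs with a few lines of Python run inside the pod. This is a quick diagnostic sketch; the mount path is a placeholder for wherever your PVC is attached.

```python
import fcntl
import tempfile

def supports_posix_locks(directory: str) -> bool:
    """Return True if fcntl() byte-range locks work in `directory`.

    SQLite's default locking relies on fcntl(); NFS-backed volumes
    often mishandle it, which is why block storage is recommended.
    """
    try:
        with tempfile.NamedTemporaryFile(dir=directory) as f:
            fcntl.lockf(f, fcntl.LOCK_EX)
            fcntl.lockf(f, fcntl.LOCK_UN)
        return True
    except OSError:
        return False

# Inside the pod, point this at the PVC mount, e.g.:
#   supports_posix_locks("/data")
```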
Deploy with the openclaw-installer
The openclaw-installer is a community-supported utility that automates deployment. It generates standard Kubernetes manifests, detects OpenShift, and automatically adds OAuth proxy integration:
```shell
git clone https://github.com/sallyom/openclaw-installer.git
cd openclaw-installer
npm install && npm run build && npm run dev
```
Open http://localhost:3000, fill in the deploy form (agent name, image, API key), and click Deploy. The installation takes about two minutes, primarily for the container image pull. When the installation is complete, the installer prints the Route URL with a preloaded gateway token.
What gets deployed
The installer creates a dedicated namespace that includes the following resources:
| Resource | Purpose |
|---|---|
| Namespace | An isolated namespace labeled for installer discovery. |
| ServiceAccount | A service account for the oauth-proxy that includes an OAuth redirect annotation. |
| Secrets | Secrets that store the OAuth configuration, gateway token, and model provider API keys. |
| ConfigMaps | Configuration maps for the agent configuration file (openclaw.json) and workspace files, such as AGENTS.md and SOUL.md. |
| PVC (10Gi) | All persistent state, including session transcripts, agent memory, and configuration. |
| Deployment | A pod that includes an init container, an oauth-proxy sidecar, and the OpenClaw gateway. |
| Service + Route | A TLS-terminated route that targets the oauth-proxy. |
The Deployment runs a single pod with three containers: an init container that configures the gateway, an oauth-proxy sidecar that handles authentication, and the OpenClaw gateway. All three run under the default restricted-v2 SCC without requiring modifications.
For the full YAML of each resource, see the example manifests.
Access your instance
After deployment, open the Route URL printed by the installer. You will be redirected to the OpenShift login page. After authenticating, the gateway token is included in the URL, so no manual copy-paste is needed.
To customize your agent, edit the workspace files locally (AGENTS.md for instructions, SOUL.md for personality, IDENTITY.md for who the agent is) and click Re-deploy in the installer's Instances tab.
What is supported vs. what is community tooling
The following table summarizes the support status for the components used in this deployment, distinguishing between enterprise-ready Red Hat products and community-supported projects.
| Component | Source | Status |
|---|---|---|
| Red Hat OpenShift | Red Hat | Supported product |
| Red Hat OpenShift AI (vLLM, KServe, TrustyAI, model serving) | Red Hat | Supported product |
| Llama Stack (via the Red Hat OpenShift AI operator) | Red Hat / upstream | Supported in Red Hat OpenShift AI |
| Kagenti operator | kagenti.dev | Open source, upstream. Planned for Red Hat AI 2H 2026 (preview soon). |
| OpenClaw | openclaw | Open source, upstream |
| openclaw-installer | sallyom/openclaw-installer | Community utility |
Take the next step with OpenClaw and Red Hat AI
Moving operational needs like identity, guardrails, observability, and hybrid inference to the platform level lets you focus on building your agent's logic rather than its infrastructure. Start by experimenting with the Red Hat AI platform and stay tuned for more on deploying and managing agentic AI.
Learn more: