Kubernetes

Article

Combining KServe and llm-d for optimized generative AI inference

Ran Pollak +1

Learn how to combine KServe and llm-d to optimize generative AI inference, improve performance, and reduce infrastructure costs. This article demonstrates the integration architecture and provides practical guidance for AI platform teams.

Article

Red Hat build of Kueue 1.3: Enhanced batch workload management on Kubernetes

Kevin Hannon

Explore new features in Red Hat build of Kueue 1.3, including integration with JobSet for efficient batch job scheduling, support for LeaderWorkerSet for distributed ML workloads, and the introduction of v1beta2 APIs. Learn how to get started with the updated Kueue operator.
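In practice, Kueue manages batch workloads by admitting suspended Jobs against a queue. As a minimal sketch (the `team-a-queue` LocalQueue name is hypothetical), a Job opts in to Kueue scheduling with a queue-name label and starts suspended:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: sample-batch-job
  labels:
    # Hypothetical LocalQueue name; Kueue uses this label to
    # associate the Job with a queue for admission.
    kueue.x-k8s.io/queue-name: team-a-queue
spec:
  suspend: true  # Kueue admits the Job by unsuspending it once quota is available
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: registry.example.com/batch-worker:latest  # placeholder image
          resources:
            requests:
              cpu: "2"
              memory: 4Gi
```

The same queue-name label pattern extends to the JobSet and LeaderWorkerSet integrations the article covers.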

Video

Deploying open source AI agents on OpenShift using OpenClaw

Grace Ableidinger +1

Learn how to run OpenClaw on Red Hat OpenShift with production-grade security and observability. We cover default-deny network policies for blast radius containment, container-level sandboxing with OpenShift, Kubernetes Secrets for credential management, and end-to-end OpenTelemetry tracing with MLflow, so every decision your AI agent makes is isolated, auditable, and safe by default. Whether you're a developer exploring AI agents for the first time or a platform engineer thinking about running agentic workloads at scale, this is the infrastructure story that makes it production-ready.
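The blast-radius containment described above starts from a default-deny posture. A minimal sketch of the idea (the `openclaw` namespace name is an assumption) is a NetworkPolicy that selects every pod and allows no traffic, so each permitted flow must then be added explicitly:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: openclaw  # hypothetical namespace for the agent workload
spec:
  podSelector: {}      # empty selector: applies to all pods in the namespace
  policyTypes:
    - Ingress          # deny all inbound traffic by default
    - Egress           # deny all outbound traffic by default
```

Specific allow policies (for example, egress to the model-serving endpoint only) are then layered on top, keeping the agent's reachable surface as small as possible.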

Article

Blast radius validation: Large and small Red Hat OpenShift nodes

Chris Janiszewski +1

This article evaluates the impact of deploying larger, higher-density "monster" servers on blast radius and failure recovery time, compared to smaller nodes, in Red Hat OpenShift and Kubernetes platforms. The testing focuses on validating real-world architectural concerns: whether higher core density increases operational risk, whether evacuation and recovery times are worse with larger, higher core-count nodes, and whether blast radius is driven by node size or by an imbalance of compute, storage, and networking performance.

Article

Run Model-as-a-Service for multiple LLMs on OpenShift

Vladimir Belousov

Learn how to deploy multiple large language models (LLMs) behind a single OpenAI-compatible endpoint on OpenShift using a Model-as-a-Service (MaaS) approach. This guide demonstrates how to build an intelligent routing infrastructure that dynamically inspects the request payload and directs traffic based on the specified model field, reducing GPU waste and simplifying application logic.
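The core of the routing idea can be sketched in a few lines: read the OpenAI-style `model` field from the request body and map it to a backend. This is a minimal illustration, not the article's implementation; the model names and in-cluster service URLs below are hypothetical, and a real deployment would do this in a gateway or proxy layer rather than application code.

```python
# Hypothetical model-to-backend mapping; in a real MaaS setup these would
# be in-cluster service endpoints for each deployed model server.
MODEL_BACKENDS = {
    "granite-3-8b": "http://granite-predictor.models.svc:8080",
    "mistral-7b": "http://mistral-predictor.models.svc:8080",
}

def route(request_body: dict) -> str:
    """Pick a backend URL based on the OpenAI-compatible 'model' field."""
    model = request_body.get("model")
    backend = MODEL_BACKENDS.get(model)
    if backend is None:
        raise ValueError(f"unknown model: {model!r}")
    return backend
```

Because clients already send the `model` field in every OpenAI-compatible request, no application-side changes are needed to add or swap backends; the router's mapping is the single place that changes.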

Article

Integrate Red Hat Advanced Cluster Management with Argo CD

Francisco De Melo Junior

Learn how to integrate Red Hat Advanced Cluster Management with Argo CD for efficient multicluster application control. Discover how to use both the push and pull deployment models, and how to configure Argo CD to watch Policy resources.