Red Hat OpenShift AI

Featured image for Red Hat OpenShift AI.
Article

OpenShift AI observability summarizer: Transform metrics into meaning

Twinkll Sisodia +4

Learn how the Red Hat OpenShift AI observability summarizer transforms raw time-series data from Prometheus into actionable, human-readable insights for platform teams. Discover the five-layer pipeline architecture and how it reduces noise and increases signal to deliver focused answers.
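The core pattern the summarizer builds on is straightforward: pull raw time series from the Prometheus HTTP API, then condense them into the headline numbers a human would ask about. A minimal sketch of that idea follows; the Prometheus URL, time window, and metric query are placeholders for illustration, not the summarizer's actual pipeline.

```python
import requests

# Pull a range query from Prometheus, then condense it into a summary.
# PROM_URL and the query are illustrative assumptions.
PROM_URL = "http://prometheus.example.com"

resp = requests.get(
    f"{PROM_URL}/api/v1/query_range",
    params={
        "query": 'sum(rate(container_cpu_usage_seconds_total[5m]))',
        "start": "2024-01-01T00:00:00Z",
        "end": "2024-01-01T01:00:00Z",
        "step": "60s",
    },
    timeout=30,
)
resp.raise_for_status()
series = resp.json()["data"]["result"]

# Reduce noise: report only the numbers a human would ask about.
for s in series:
    values = [float(v) for _, v in s["values"]]
    print(f"CPU usage: min={min(values):.2f}, max={max(values):.2f}, "
          f"avg={sum(values) / len(values):.2f} cores")
```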

Featured image for agentic AI
Article

3 lessons for building reliable ServiceNow AI integrations

Tomer Golan

Learn about critical lessons from building an MCP-powered AI agent for ServiceNow, including how to structure testing environments, best practices for implementing safeguards, and a phased approach to deploying enterprise AI integrations.
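One of those safeguard lessons translates into a simple pattern: gate every tool call through an allowlist and a dry-run mode before the agent can touch a live ServiceNow instance. The sketch below is illustrative only; all tool and function names in it are hypothetical, not the article's code.

```python
# Hypothetical safeguard wrapper for an MCP-backed agent: read-only
# tools pass through, write operations are logged instead of executed
# until the integration is validated in a sub-production instance.
READ_ONLY_TOOLS = {"get_incident", "list_incidents", "get_user"}

def guarded_call(tool_name: str, args: dict, *, dry_run: bool = True):
    if tool_name not in READ_ONLY_TOOLS and dry_run:
        print(f"[dry-run] would call {tool_name} with {args}")
        return None
    return execute_tool(tool_name, args)  # hypothetical dispatcher

def execute_tool(tool_name: str, args: dict):
    ...  # forward the call to the MCP server
```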

Featured image for agentic AI
Article

Deploying agents with Red Hat AI: The curious case of OpenClaw

Nati Fridman +2

Explore how Red Hat AI simplifies agent deployment with OpenClaw, showcasing model inference, safety guardrails, agent identity, and persistent state. Learn about vLLM, Llama Stack, and Models-as-a-Service (MaaS) options, and discover the benefits of agent identity and zero trust with Kagenti and AuthBridge.
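Whichever serving option the agent sits on, the inference call tends to look the same, because vLLM, Llama Stack, and most MaaS gateways all expose an OpenAI-compatible API. A minimal sketch with the standard openai client; the base URL, token, and model name are placeholders.

```python
from openai import OpenAI

# An agent's inference call against an OpenAI-compatible endpoint.
# base_url, api_key, and model are placeholders for your deployment.
client = OpenAI(
    base_url="https://maas.example.com/v1",
    api_key="YOUR_TOKEN",
)

reply = client.chat.completions.create(
    model="llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Summarize today's alerts."}],
)
print(reply.choices[0].message.content)
```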

Featured image for agentic AI
Article

Distributed tracing for agentic workflows with OpenTelemetry

Fabio Massimo Ercoli

Learn how to set up distributed tracing for an agentic workflow based on lessons learned while developing the it-self-service-agent AI quickstart. This post covers configuring OpenTelemetry to track requests end-to-end across application workloads, MCP servers, and Llama Stack.
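The key to end-to-end visibility is that every workload ships spans to the same OTLP collector, so one trace follows a request across the agent, its MCP servers, and Llama Stack. A minimal OpenTelemetry setup in Python, assuming an in-cluster collector; the endpoint and service name are placeholders.

```python
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Ship spans to a shared OTLP collector so spans from every workload
# stitch into one end-to-end trace. Endpoint/name are placeholders.
provider = TracerProvider(
    resource=Resource.create({"service.name": "self-service-agent"})
)
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://otel-collector:4317"))
)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("handle-user-request"):
    ...  # calls to MCP servers and Llama Stack nest as child spans
```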

ai-ml
Article

Vibes, specs, skills, and agents: The four pillars of AI coding

Rich Naszcyniec

Explore the four pillars of AI coding: vibes, specs, skills, and agents, and learn how they can improve coding quality and reduce the encoding/decoding gap. Discover the benefits of a spec-driven approach and the importance of modular specs and skills in making the four pillars work in harmony.

Featured image for vLLM inference article.
Article

Integrate Claude Code with Red Hat AI Inference Server on OpenShift

Alexander Barbosa Ayala

Learn how to integrate Anthropic's Claude Code, an agentic coding tool, with Red Hat AI Inference Server on OpenShift. Keep the inference process private on your own infrastructure while retaining the full Claude Code workflow.
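Red Hat AI Inference Server is built on vLLM, which serves an OpenAI-compatible API, so before pointing Claude Code at the private endpoint you can sanity-check that the model is actually being served. A small sketch; the route URL below is a placeholder.

```python
import requests

# Confirm the private inference endpoint is up and serving a model
# before wiring Claude Code to it. BASE_URL is a placeholder.
BASE_URL = "https://inference.apps.example.com/v1"

resp = requests.get(f"{BASE_URL}/models", timeout=10)
resp.raise_for_status()
for model in resp.json()["data"]:
    print(model["id"])
```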

Featured image for Red Hat OpenShift AI.
Article

Run Model-as-a-Service for multiple LLMs on OpenShift

Vladimir Belousov

Learn how to deploy multiple large language models (LLMs) behind a single OpenAI-compatible endpoint on OpenShift using a Model-as-a-Service (MaaS) approach. This guide demonstrates how to build intelligent routing infrastructure that inspects each request payload and directs traffic based on the specified model field, reducing GPU waste and simplifying application logic.
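The routing idea fits in a few lines: read the "model" field from each OpenAI-style request and proxy it to whichever backend serves that model. A minimal sketch using FastAPI and httpx, assuming two in-cluster vLLM services; the backend URLs and model names are placeholders.

```python
from fastapi import FastAPI, Request
import httpx

# Route each request to the backend serving the requested model.
# Backend service URLs and model names are placeholders.
MODEL_BACKENDS = {
    "llama-3.1-8b-instruct": "http://llama-8b.models.svc:8000",
    "mistral-7b-instruct": "http://mistral-7b.models.svc:8000",
}

app = FastAPI()

@app.post("/v1/chat/completions")
async def route(request: Request):
    payload = await request.json()
    backend = MODEL_BACKENDS[payload["model"]]  # route on the model field
    async with httpx.AsyncClient(timeout=120) as client:
        resp = await client.post(f"{backend}/v1/chat/completions", json=payload)
    return resp.json()
```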

Featured image for Red Hat OpenShift AI.
Article

Hybrid loan-decisioning with OpenShift AI and Vertex AI

Harshil Sabhnani

Discover a practical solution pattern for building a modern financial application that makes loan decisions using multiple machine learning systems deployed across hybrid environments.
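At its simplest, the pattern means calling one model served on OpenShift AI and another on Vertex AI, then applying a decision policy over both scores. The sketch below is hypothetical end to end: the endpoints, payload shapes, and thresholds are illustrative assumptions, not the article's solution.

```python
import requests

# Combine a credit-risk score (OpenShift AI) with a fraud score
# (Vertex AI) under one policy. Everything here is hypothetical.
risk = requests.post(
    "https://credit-model.apps.example.com/v1/models/credit:predict",
    json={"instances": [{"income": 85000, "debt": 12000}]},
    timeout=30,
).json()["predictions"][0]

fraud = requests.post(
    "https://fraud-model.example-vertex-endpoint.com/predict",
    json={"instances": [{"applicant_id": "A-1234"}]},
    timeout=30,
).json()["predictions"][0]

approved = risk < 0.2 and fraud < 0.1
print("approved" if approved else "refer to underwriter")
```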

Red Hat AI
Article

Configure NVIDIA Blackwell GPUs for Red Hat AI workloads

Erwan Gallen +4

Learn how to enable the NVIDIA RTX PRO 4500 Blackwell Server Edition on Red Hat AI for compact, power-efficient AI deployments. This hardware delivers strong inference performance without adding unnecessary operational complexity for Red Hat AI users.
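Once the drivers and GPU Operator are in place, a quick in-workload check confirms the Blackwell card is visible with the expected memory. A minimal PyTorch sketch:

```python
import torch

# Verify the GPU is visible to the workload after enablement.
assert torch.cuda.is_available(), "No CUDA device visible"
props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1024**3:.0f} GiB, "
      f"compute capability {props.major}.{props.minor}")
```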

Jupyter Notebooks on Red Hat OpenShift AI share/feature image
Article

Accelerated expert-parallel distributed tuning in Red Hat OpenShift AI

Karel Suta +4

Discover how to optimize training of mixture of experts (MoE) models with fms-hf-tuning, an open source tuning library for PyTorch FSDP and Hugging Face libraries. Learn about preprocessing data, throughput and memory efficiency features, distributed training, and expert parallelism. Improve how your AI and agentic applications perform on domain-specific enterprise tasks.
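To see what expert parallelism actually distributes, it helps to recall the MoE forward pass: a gate routes each token to its top-k experts, so the experts, rather than the whole model, can be sharded across GPUs. A conceptual toy sketch in plain PyTorch; this is not fms-hf-tuning code.

```python
import torch
import torch.nn as nn

# Toy MoE layer: a gate picks top-k experts per token. In expert
# parallelism, the experts below would live on different GPUs.
class TinyMoE(nn.Module):
    def __init__(self, dim=64, num_experts=8, top_k=2):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, dim)
        weights = self.gate(x).softmax(dim=-1)
        topw, topi = weights.topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = topi[:, k] == e  # tokens routed to expert e
                if mask.any():
                    out[mask] += topw[mask, k, None] * expert(x[mask])
        return out

print(TinyMoE()(torch.randn(4, 64)).shape)  # torch.Size([4, 64])
```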

Featured image for agentic AI
Article

Automate AI agents with the Responses API in Llama Stack

Michael Dawson

Learn how the Responses API in Llama Stack automates complex tool calling while maintaining granular control over conversation flow for AI agents. Discover the benefits and implementation details.
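Because Llama Stack serves an OpenAI-compatible API, the Responses API can be exercised with the standard openai client. A minimal sketch; the base_url path, model identifier, and the built-in tool name are assumptions to check against your Llama Stack deployment.

```python
from openai import OpenAI

# Call Llama Stack's Responses API through the openai client.
# base_url, model, and the tool entry are deployment-specific
# assumptions.
client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

response = client.responses.create(
    model="llama-3.1-8b-instruct",
    input="What is the weather in Boston?",
    tools=[{"type": "web_search"}],  # tool calling resolved server-side
)
print(response.output_text)
```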

Red Hat AI
Article

Estimate GPU memory for LLM fine-tuning with Red Hat AI

Mohib Azam

Learn how to estimate memory requirements for your LLM fine-tuning experiments using Red Hat Training Hub's memory_estimator.py API. This guide covers the memory components involved, how to adjust training setups for specific GPU specifications, and how to use the memory estimator in your code. Streamline your model fine-tuning process with runtime estimates and automated hyperparameter suggestions.
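The arithmetic behind any such estimator is worth keeping in your head. For full fine-tuning with Adam in mixed precision, the dominant terms are weights, gradients, and optimizer states; the back-of-the-envelope sketch below is a generic rule of thumb, not Training Hub's memory_estimator.py itself, and it omits activations and framework overhead.

```python
# Rule-of-thumb GPU memory for full fine-tuning with Adam in mixed
# precision (before activations and overhead). Generic, illustrative.
def estimate_gpu_memory_gib(num_params_b: float) -> float:
    p = num_params_b * 1e9
    weights = 2 * p        # bf16 weights
    gradients = 2 * p      # bf16 gradients
    optimizer = 12 * p     # fp32 master weights + Adam m and v states
    return (weights + gradients + optimizer) / 1024**3

print(f"{estimate_gpu_memory_gib(8):.0f} GiB")  # an 8B model: ~119 GiB
```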

Featured image for Red Hat OpenShift AI.
Article

Serve and benchmark Prithvi models with vLLM on OpenShift

Michele Gazzetti +3

Learn how to deploy and test an Earth and space model inference service on Red Hat AI Inference Server and Red Hat OpenShift AI. This article includes two self-contained activities: one deploys Prithvi using a traditional Deployment object, and the other serves the model with KServe while observing Knative scaling.
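Knative's autoscaling can be observed from code as well as from the console: list the predictor pods for the InferenceService and watch the count rise under load and fall back to zero when idle. A small sketch with the Kubernetes Python client; the namespace and InferenceService name are assumptions.

```python
from kubernetes import client, config

# Count predictor pods for a KServe InferenceService to observe
# Knative scaling. Namespace and service name are placeholders.
config.load_kube_config()
pods = client.CoreV1Api().list_namespaced_pod(
    namespace="prithvi-demo",
    label_selector="serving.kserve.io/inferenceservice=prithvi",
)
print(f"{len(pods.items)} predictor pod(s) running")
```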

Featured image for vLLM inference article.
Article

Practical strategies for vLLM performance tuning

Trevor Royer

Optimize vLLM performance with practical tuning tips. Learn how to use GuideLLM for benchmarking, adjust GPU ratios, and maximize KV cache to improve throughput.
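Two of the highest-leverage knobs show up right in the engine constructor: gpu_memory_utilization controls how much VRAM vLLM claims for weights plus KV cache, and max_model_len caps sequence length so more KV cache blocks fit. A minimal offline sketch; the model name and values are illustrative starting points, not tuned recommendations.

```python
from vllm import LLM, SamplingParams

# gpu_memory_utilization: fraction of VRAM vLLM claims (weights + KV
# cache). max_model_len: caps context so more KV blocks fit.
llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",
    gpu_memory_utilization=0.95,  # leave a little headroom for the runtime
    max_model_len=4096,
)
outputs = llm.generate(
    ["Explain KV cache in one sentence."],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```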

Featured image for Red Hat OpenShift AI.
Article

Fine-tune AI pipelines in Red Hat OpenShift AI 3.3

Ana Biazetti +2

Learn how to fine-tune AI pipelines in Red Hat OpenShift AI 3.3. Use Kubeflow Trainer and modular components for reproducible, production-grade model tuning.
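On the cluster, a Kubeflow Trainer run ultimately lands as a TrainJob custom resource. The sketch below submits one with the Kubernetes Python client; the API group/version, runtime name, and spec fields are assumptions based on Kubeflow Trainer v2, so check them against the CRDs installed with OpenShift AI 3.3.

```python
from kubernetes import client, config

# Submit a Kubeflow Trainer TrainJob as a custom resource. CRD fields
# below are assumptions; verify against your cluster's installed CRDs.
config.load_kube_config()
client.CustomObjectsApi().create_namespaced_custom_object(
    group="trainer.kubeflow.org",
    version="v1alpha1",
    namespace="fine-tuning",
    plural="trainjobs",
    body={
        "apiVersion": "trainer.kubeflow.org/v1alpha1",
        "kind": "TrainJob",
        "metadata": {"name": "llama-finetune"},
        "spec": {
            "runtimeRef": {"name": "torch-distributed"},  # assumed runtime
            "trainer": {
                "image": "quay.io/example/finetune:latest",  # placeholder
                "numNodes": 2,
            },
        },
    },
)
```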