Artificial intelligence

Article

How speculative decoding delivers faster LLM inference

Sawyer Bowerman

Learn how speculative decoding can improve the performance of large language models (LLMs) in production by using a small, fast model to generate tokens speculatively and a large model to verify them.

Article

Bring your own evaluation framework to EvalHub

William Caban Babilonia +2

Learn how to onboard a custom evaluation framework into EvalHub using one class, one method, and a container image. This guide covers the contract, data structures, and a complete minimal adapter.

Article

Integrate OpenShift AI and PG Airman MCP Server

Peter Samouelian +2

Learn about the Data Governance Copilot, designed to make PostgreSQL databases accessible to non-technical users via agentic, natural language interaction while maintaining data compliance.

Article

Build a local voice agent with Red Hat OpenShift AI

Mike Hepburn

Learn how to create a functional Red Hat pizza shop voice agent using Red Hat OpenShift AI, focusing on practical architecture choices and implementation lessons learned along the way.

Article

Gang autoscaling on OpenShift with Kueue and ProvisionRequest

Kevin Hannon +1

Learn how to implement true gang autoscaling on OpenShift using the Red Hat build of Kueue and the ProvisionRequest API. This approach ensures efficient and reliable scheduling of high-performance workloads such as AI/ML training, HPC simulations, and large-scale data processing.

Article

Understanding evaluation collections in EvalHub

William Caban Babilonia +2

Learn how to read an existing system collection, understand its threshold logic, and build your own collection that encodes your actual measurement strategy with thresholds that mean something.

Article

Speculators v0.5.0: DFlash support and online training

Helen Zhao +2

Speculators v0.5.0 introduces DFlash support, enabling single-pass draft token generation with block diffusion for more efficient speculative decoding workflows. The release also adds unified online and offline training through vLLM’s native hidden states extraction system, improving training flexibility, version stability, and production readiness.

Article

Evaluation-driven development with EvalHub

William Caban Babilonia +1

Learn how evaluation-driven development (EDD) turns AI optimization from an art into an engineering discipline with EvalHub.

Article

Running AI inference on Rebellions ATOM NPU with Red Hat AI

Erwan Gallen +2

Learn how to deploy and serve large language models (LLMs) on Rebellions ATOM NPUs using Red Hat OpenShift AI and a certified vLLM container image on the Red Hat AI Inference Server. This post walks through setting up the joint solution between Red Hat and Rebellions: installing the Node Feature Discovery operator and the Rebellions NPU operator, creating the ATOM hardware profile in OpenShift AI, and creating the vLLM RBLN ServingRuntime.

Article

Build an enterprise RAG system with OGX

Abdelhamid Soliman

Learn how to transform a simple chatbot into an enterprise RAG application by applying metadata filtering, hybrid search, and neural reranking using the OGX framework in Red Hat OpenShift AI.