Add automated AI evaluations to your CI/CD pipeline
Learn how to use the EvalHub CLI to automate AI evaluations in your CI/CD pipelines. Install the SDK, configure profiles, and set up a production gate.
Learn how llm-d routes each inference request to the GPU that already has the relevant data cached, cutting time-to-first-token and doubling throughput without changing hardware. Discover how Red Hat's stack packages this neatly into a single Kubernetes resource.
Learn how to onboard a custom evaluation framework into EvalHub using one class, one method, and a container image. This guide covers the contract, data structures, and a complete minimal adapter.
Headed to WeAreDevelopers World Congress Europe 2026? Visit the Red Hat Developer booth on-site to speak to our expert technologists.
Learn how to read an existing system collection, understand its threshold logic, and build your own collection that encodes your actual measurement strategy with thresholds that mean something.
Speculators v0.5.0 introduces DFlash support, enabling single-pass draft token generation with block diffusion for more efficient speculative decoding workflows. The release also adds unified online and offline training through vLLM’s native hidden states extraction system, improving training flexibility, version stability, and production readiness.
Red Hat and DeepLearning.AI have released a free hands-on course on the full LLM
Learn how to use Red Hat OpenShift AI's reusable components to build modular AI pipelines, speed up development, and focus on what differentiates your applications.
Learn how evaluation-driven development (EDD) turns AI optimization from an art into an engineering discipline with EvalHub.
Learn about LogAn, an open source tool designed to overcome the limitations of using LLMs to analyze massive volumes of production logs.
A Llama Stack-dependent backend, or any rapidly evolving upstream project, faces a version-drift problem. Explore our no-cost solution that provides early warnings.
Learn how an expert red-teamed an infrastructure using Red Hat AI, OpenClaw, and abliterated models on Red Hat OpenShift on IBM Cloud.
Learn how to transform a simple chatbot into an enterprise RAG application by applying metadata filtering, hybrid search, and neural reranking using the OGX framework in Red Hat OpenShift AI.
Learn how to prevent GPU waste and financial loss by implementing just-in-time (JIT) checkpointing with Kubeflow Training SDK on OpenShift AI.
Learn about the five primary structural challenges in enterprise AI evaluation and how EvalHub addresses them with a unified foundation for AI evaluation.
Learn how our team implemented CI/CD pipelines for the it-self-service-agent AI quickstart and the benefits of using CI/CD for agentic systems.
Learn how Red Hat AI can help address the security challenges of AI agents in production, from semantic malware to container escapes.
Scale agentic AI with Red Hat’s trusted software factory. Use Policy as Code and SBOMs to strengthen your development pipeline and manage software provenance.
Learn how Red Hat AI 3.4 uses EvalHub to orchestrate AI evaluations on Kubernetes. Scale frameworks like Garak and LightEval with built-in MLflow tracking.
Learn how Kagenti ADK, an open source toolkit, handles the complexities of managing production AI agents. It aligns with the Linux Foundation's Agent2Agent (A2A) protocol and provides a set of runtime services for easier deployment and operation.
Learn about our team's experience implementing a defense-in-depth safety architecture for AI agents using Llama Stack shields.
This article discusses the benefits of diffusion LLMs, a revolutionary approach to language models that offers a dynamic tradeoff between accuracy and performance. The article covers the architecture, evolution, and real-world statistics of this technology, including examples of open source models like LLaDA 2.X and Mercury 2.
Learn how to use the OpenViking context database instead of traditional flat vector storage to provide AI agents with persistent, structured memory.
This article describes how to onboard a project to two tools, CodeCov and CodeRabbit, and the results of using them.
Learn how to combine KServe and llm-d to optimize generative AI inference, improve performance, and reduce infrastructure costs. This article demonstrates the integration architecture and provides practical guidance for AI platform teams.