Eval-driven development: Build and evaluate reliable AI agents
Learn how to build reliable AI agents with our 8-stage evaluation framework. We explore DeepEval, multi-turn testing, and CI/CD integration for Red Hat AI.
Discover a practical solution pattern for building a modern financial application that makes loan decisions using multiple machine learning systems deployed across hybrid environments.
Headed to Devoxx UK 2026? Visit the Red Hat Developer booth on-site to speak to our expert technologists.
Headed to Devoxx France 2026? Visit the Red Hat Developer booth on-site to speak to our expert technologists.
LLM Compressor v0.10 introduces Distributed Data Parallel (DDP) for faster compression, along with memory management and advanced quantization formats. Make model compression workflows more efficient for large language models.
Learn how to enable the NVIDIA RTX PRO 4500 Blackwell Server Edition on Red Hat AI for compact, power-efficient AI deployments. The hardware delivers inference performance without unnecessary operational complexity.
Learn how to streamline Red Hat OpenShift AI dependency management using Kustomize and GitOps. Use the odh-gitops repo to automate setup via CLI or Argo CD.
This video demonstrates how to deploy Red Hat OpenShift AI dependencies using the odh-gitops repository with Kustomize and the command line.
This video demonstrates how to deploy Red Hat OpenShift AI dependencies using Argo CD and the odh-gitops repository.
Learn how to build agentic AI workflows using cicaddy and MCP servers directly in your existing CI pipeline.
Discover how to optimize training of MoE models with fms-hf-tuning, an open source tuning library for PyTorch FSDP and Hugging Face libraries. Learn about preprocessing data, throughput and memory efficiency features, distributed training, and expert parallelism. Improve your AI and agentic applications on domain-specific enterprise tasks.
Learn how PatchPatrol, an AI-powered code review tool, helps enterprise development teams maintain high quality and security standards on Red Hat OpenShift.
Learn how to manage the security threats and access controls associated with adopting the new Agent Skills functionality.
Learn how to run high-performance computing workloads managed by Slurm within a containerized OpenShift environment using the Slinky operator.
Learn how to improve the performance of your vLLM deployments with a diagnostic workflow that isolates latency issues and server saturation. Discover the key metrics to monitor and techniques to alleviate memory pressure.
Learn how the Responses API in Llama Stack automates complex tool calling while maintaining granular control over conversation flow for AI agents. Discover the benefits and implementation details.
Learn the many ways you can interact with GPU-hosted large language models (LLMs
Learn how to run OpenAI's Whisper model through vLLM on Apple Silicon, giving you an OpenAI-compatible endpoint on localhost. Then, discover how to take this architecture into production using Red Hat AI Inference Server.
Learn how to create a baseline RAG system using Apache Camel, PostgreSQL, and pgvector. This implementation demonstrates a 'boring' approach to RAG, making it easy to understand and debug. Discover the anatomy of a RAG pipeline, including indexing, retrieval, and answering.
Discover how I used an AI assistant to develop a production-grade Ansible Playbook to audit RHEL versions across a fleet of servers, generating a clean report.
Learn how to estimate memory requirements for your LLM fine-tuning experiments using Red Hat Training Hub's memory_estimator.py API. This guide covers the memory components, adjusting training setups for specific GPU specifications, and using the memory estimator in your code. Streamline your model fine-tuning process with runtime estimates and automated hyperparameter suggestions.
Learn how to deploy and test an Earth and space model inference service on Red Hat AI Inference Server and Red Hat OpenShift AI. This article includes two self-contained activities, one deploying Prithvi using a traditional Deployment object and another serving the model using KServe and observing Knative scaling.
Understand the PyTorch autograd engine internals to debug gradient flows. Learn about computational graphs, saved tensors, and performance optimization techniques.
Optimize vLLM performance with practical tuning tips. Learn how to use GuideLLM for benchmarking, adjust GPU ratios, and maximize KV cache to improve throughput.
Learn how to fine-tune AI pipelines in Red Hat OpenShift AI 3.3. Use Kubeflow Trainer and modular components for reproducible, production-grade model tuning.