Artificial intelligence

Article

Hybrid loan-decisioning with OpenShift AI and Vertex AI

Harshil Sabhnani

Discover a practical solution pattern for building a modern financial application that makes loan decisions using multiple machine learning systems deployed across hybrid environments.

Event

Red Hat at Devoxx UK 2026

Headed to Devoxx UK 2026? Visit the Red Hat Developer booth on-site to speak to our expert technologists.

Event

Red Hat at Devoxx France 2026

Headed to Devoxx France 2026? Visit the Red Hat Developer booth on-site to speak to our expert technologists.

Article

LLM Compressor v0.10: Faster compression with distributed GPTQ

Kyle Sayers +2

LLM Compressor v0.10 introduces Distributed Data Parallel (DDP) for faster compression, memory management, and advanced quantization formats. Make model compression workflows more efficient for large language models.

Article

Configure NVIDIA Blackwell GPUs for Red Hat AI workloads

Erwan Gallen +4

Learn how to enable the NVIDIA RTX PRO 4500 Blackwell Server Edition on Red Hat AI for compact, power-efficient AI deployments. This hardware offers inference performance without adding unnecessary operational complexity for Red Hat AI users.

Article

Accelerated expert-parallel distributed tuning in Red Hat OpenShift AI

Karel Suta +4

Discover how to optimize training of MoE models with fms-hf-tuning, an open source tuning library for PyTorch FSDP and Hugging Face libraries. Learn about preprocessing data, throughput and memory efficiency features, distributed training, and expert parallelism. Improve your AI and agentic applications on domain-specific enterprise tasks.

Article

5 steps to triage vLLM performance

David Whyte-Gray +3

Learn how to improve the performance of your vLLM deployments with a diagnostic workflow that isolates latency issues and server saturation. Discover the key metrics to monitor and techniques to alleviate memory pressure.

Article

Automate AI agents with the Responses API in Llama Stack

Michael Dawson

Learn how the Responses API in Llama Stack automates complex tool calling while maintaining granular control over conversation flow for AI agents. Discover the benefits and implementation details.

Article

Boring RAG: When similarity is just a SQL query

Ivo Bek

Learn how to create a baseline RAG system using Apache Camel, PostgreSQL, and pgvector. This implementation demonstrates a 'boring' approach to RAG, making it easy to understand and debug. Discover the anatomy of a RAG pipeline, including indexing, retrieval, and answering.

Article

Estimate GPU memory for LLM fine-tuning with Red Hat AI

Mohib Azam

Learn how to estimate memory requirements for your LLM fine-tuning experiments using Red Hat Training Hub's memory_estimator.py API. This guide covers the memory components, adjusting training setups for specific GPU specifications, and using the memory estimator in your code. Streamline your model fine-tuning process with runtime estimates and automated hyperparameter suggestions.

Article

Serve and benchmark Prithvi models with vLLM on OpenShift

Michele Gazzetti +3

Learn how to deploy and test an Earth and space model inference service on Red Hat AI Inference Server and Red Hat OpenShift AI. This article includes two self-contained activities, one deploying Prithvi using a traditional Deployment object and another serving the model using KServe and observing Knative scaling.

Article

Optimize PyTorch training with the autograd engine

Vishal Goyal

Understand the PyTorch autograd engine internals to debug gradient flows. Learn about computational graphs, saved tensors, and performance optimization techniques.

Article

Practical strategies for vLLM performance tuning

Trevor Royer

Optimize vLLM performance with practical tuning tips. Learn how to use GuideLLM for benchmarking, adjust GPU ratios, and maximize KV cache to improve throughput.

Article

Fine-tune AI pipelines in Red Hat OpenShift AI 3.3

Ana Biazetti +2

Learn how to fine-tune AI pipelines in Red Hat OpenShift AI 3.3. Use Kubeflow Trainer and modular components for reproducible, production-grade model tuning.