Artificial intelligence

Learn how to implement identity-based tool filtering, OAuth2 Token Exchange, and HashiCorp Vault integration for the MCP Gateway.

Get a step-by-step guide to integrating a custom AI service with Red Hat Ansible Lightspeed.

Automate amazon.ai workflows with Ansible: Deploy Bedrock agents, generate personalized content, and monitor resources with DevOps Guru for auditability.

Learn how to migrate from Llama Stack’s deprecated Agent APIs to the modern, OpenAI-compatible Responses API without rebuilding from scratch.

Most log lines are noise. Learn how semantic anomaly detection filters out repetitive patterns—even repetitive errors—to surface the genuinely unusual events.

Integrating AutoRound into LLM Compressor delivers higher accuracy for low bit-width quantization, lightweight tuning, and compressed-tensor compatibility.

Optimize AI scheduling. Discover 3 workflows to automate RayCluster lifecycles using KubeRay and Kueue on Red Hat OpenShift AI 3.

Run the latest Mistral Large 3 and Ministral 3 models on vLLM with Red Hat AI, providing day 0 access for immediate experimentation and deployment.

Learn how to optimize AI inference costs with AWS Inferentia and Trainium chips on Red Hat OpenShift using the AWS Neuron Operator.

Use SDG Hub to generate high-quality synthetic data for your AI models. This guide provides a full, copy-pasteable Jupyter Notebook for practitioners.

A guide to setting up and using Red Hat Lightspeed Model Context Protocol (MCP) to enable natural language interaction.

How to enable NVIDIA GPU acceleration in OpenShift Local

Alexander Barbosa Ayala +1

November 27, 2025

Learn how to share an NVIDIA GPU with an OpenShift Local instance to run containerized workloads that require GPU acceleration without a dedicated server.

This performance analysis compares KServe's SLO-driven KEDA autoscaling approach against Knative's concurrency-based autoscaling for vLLM inference.

Learn how to deploy and manage Models-as-a-Service (MaaS) in Red Hat OpenShift AI, including rate limiting for resource protection.

Use the open source SDG Hub to quickly create custom synthetic data pipelines. Train and evaluate your models faster and more efficiently.

Learn how we built a simple, rules-based algorithm to detect oversaturation in LLM performance benchmarks, reducing costs by more than a factor of 2.

Introduction to distributed inference with llm-d

Christopher Nuland +1

November 21, 2025

Learn how the llm-d project is revolutionizing LLM inference by enabling distributed, efficient, and scalable model serving across Kubernetes clusters.

This in-depth guide helps you integrate generative AI, LLMs, and machine learning into your existing Java enterprise ecosystem. Download the e-book at no cost.

Learn how we built an algorithm to detect oversaturation in large language model (LLM) benchmarking, saving GPU minutes and reducing costs.

Simplify LLM post-training with the Training Hub library, which provides a common, pythonic interface for running language model post-training algorithms.

Speculators standardizes speculative decoding for large language models, with a unified Hugging Face format, vLLM integration, and more.

Learn why prompt engineering is the most critical and accessible method for customizing large language models.

Oversaturation in LLM benchmarking can lead to wasted machine time and skewed performance metrics. Find out how one Red Hat team tackled the challenge.

Learn how to automatically transfer AI model metadata managed by OpenShift AI into Red Hat Developer Hub’s Software Catalog.

Learn how to install and use new MCP plug-ins for Red Hat Developer Hub that provide tools for MCP clients to interact with it.
