Red Hat AI

Gain detailed insights into vLLM deployments on OpenShift AI. Learn to build dashboards with Dynatrace and OpenTelemetry to enable reliable LLM performance.

Introduction to OpenShift AI

Alex Krikos +2

May 21, 2025

Learn how to use Red Hat OpenShift AI to quickly develop, train, and deploy

Explore the complete machine learning operations (MLOps) pipeline utilizing Red

llm-d delivers Kubernetes-native distributed inference with advanced optimizations, reducing latency and maximizing throughput.

LLM Semantic Router uses semantic understanding and caching to boost performance, cut costs, and enable efficient inference with llm-d.

Optimize model inference and reduce costs with model compression techniques like quantization and pruning with LLM Compressor on Red Hat OpenShift AI.

Getting reasoning models enterprise-ready

Abhishek Bhandwaldar +2

May 20, 2025

Learn how to use synthetic data generation (SDG) and fine-tuning in Red Hat AI to customize reasoning models for your enterprise workflows.

Learn how to deploy a trained model with Red Hat OpenShift AI and use its

Explore how to use large language models (LLMs) with Node.js by observing Ollama

More Essential AI tutorials for Node.js Developers

How to run a fraud detection AI model on RHEL CVMs

Emanuele Giuseppe Esposito +2

May 15, 2025

Learn how to run a fraud detection AI model using confidential virtual machines on RHEL running in the Microsoft Azure public cloud.

Configure your Red Hat Enterprise Linux AI machine, download, serve, and

vLLM empowers macOS and iOS developers to build powerful AI-driven applications by providing a robust and optimized engine for running large language models.

PowerUP 2025 is the week of May 19th. It's held in Anaheim, California this year

Learn how to use pipelines in OpenShift AI to automate the full AI/ML lifecycle on a single-node OpenShift instance.

Jupyter Notebook works with OpenShift AI to interactively classify images. In

LLM Compressor bridges the gap between model training and efficient deployment via quantization and sparsity, enabling cost-effective, low-latency inference.

Accelerate the development and deployment of enterprise AI solutions across the

Learn how the dynamic accelerator slicer operator improves GPU resource management in OpenShift by dynamically adjusting allocation based on workload needs.

This tutorial shows you how to use the Llama Stack API to implement retrieval-augmented generation for an AI application built with Node.js.

Learn about the Red Hat OpenShift AI model fine-tuning stack and how to run performance and scale validation.

Learn how NVIDIA GPUDirect RDMA over Ethernet enhances distributed model training performance and reduces communication bottlenecks in Red Hat OpenShift AI.

Learn how the DeepSeek training process used reinforcement learning algorithms to generate human-like text and improve overall performance.

Explore performance and usability improvements in vLLM 0.8.1 on OpenShift, including crucial architectural overhauls and multimodal inference optimizations.

Discover a new combinatorial approach to decoding AI’s hidden logic, exploring how neural networks truly compute and reason.
