Red Hat Enterprise Linux AI

Develop, deploy, and run large language models (LLMs) in individual server environments.

The solution includes Red Hat AI Inference, an inference server that provides fast, consistent, and cost-effective inference at scale. As the engine behind agentic AI and Models-as-a-Service work patterns, it gives you the operational control to run any model on any accelerator across the hybrid cloud.

Download RHEL AI


Introducing an accessible open source AI platform

RHEL AI simplifies deployment across hybrid infrastructure environments by integrating enterprise-ready versions of Granite language models with a bootable image of Red Hat Enterprise Linux. The result is a foundation model platform for bringing open source-licensed gen AI models into the enterprise, lowering costs and removing barriers to testing and experimentation. The platform supports tools such as vLLM, DeepSpeed, and PyTorch, uses containerization with Podman and image mode for RHEL, and scales deployments with Kubernetes and OpenShift.


What’s included in RHEL AI

Open source Granite language and code models

Fully supported and indemnified by Red Hat for enterprise use.

View on Red Hat Catalog

Learn more

Red Hat AI Inference

Powered by vLLM, the end-to-end inference stack optimizes token economics and hardware capacity for faster response times. The open source technology increases efficiency without sacrificing performance.
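For illustration, here is a minimal offline-inference sketch using vLLM's Python API. The Granite model ID is a placeholder for whichever supported model you deploy, and the example assumes vLLM is installed with a compatible accelerator available:

```python
# Minimal sketch: offline inference with vLLM.
# Assumes vLLM is installed and a supported GPU is available; the Granite
# model ID below is illustrative, not a statement of what RHEL AI ships.
from vllm import LLM, SamplingParams

llm = LLM(model="ibm-granite/granite-3.1-8b-instruct")
params = SamplingParams(temperature=0.7, max_tokens=128)

# Generate a completion for a single prompt and print the text.
outputs = llm.generate(["What is Red Hat Enterprise Linux AI?"], params)
for output in outputs:
    print(output.outputs[0].text)
```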

Learn more

Cloud-native scalability

With Image Mode for RHEL, RHEL AI lets you manage your AI platform as a container image, streamlining your approach to scaling.

Acceleration and AI tooling

Leverage pre-built, bootable RHEL images with PyTorch and hardware acceleration libraries for NVIDIA, Intel, and AMD. Integrate your existing workflows with support for vLLM, DeepSpeed, and PyTorch.
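As a quick sanity check inside one of these images, you can ask PyTorch which accelerator backends it sees. A minimal sketch (the torch.xpu check assumes a recent PyTorch build with Intel GPU support; AMD ROCm builds report through torch.cuda as well):

```python
# Minimal sketch: report which accelerator backends this PyTorch build can see.
# NVIDIA CUDA and AMD ROCm builds both answer via torch.cuda; torch.xpu covers
# Intel GPUs on recent PyTorch releases (assumption: PyTorch 2.4 or newer).
import torch

print(f"PyTorch {torch.__version__}")
print(f"CUDA/ROCm available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"Device: {torch.cuda.get_device_name(0)}")
if hasattr(torch, "xpu"):
    print(f"Intel XPU available: {torch.xpu.is_available()}")
```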

Red Hat Enterprise Linux AI is portable across hybrid cloud environments and provides an on-ramp to AI at scale with Red Hat OpenShift AI and to additional capabilities with IBM watsonx.ai.

Introducing Red Hat AI Inference

Deploy your preferred models faster and more cost-effectively across the hybrid cloud with Red Hat AI Inference. Its vLLM runtime maximizes inference throughput and minimizes latency. A pre-optimized model repository enables rapid model serving, while LLM Compressor reduces compute costs without sacrificing accuracy. Experience fast, accurate inference for a wide range of applications.
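For illustration, vLLM exposes an OpenAI-compatible HTTP API once a model is served, so a standard openai client can query it. The base URL and model name below are placeholders for your deployment:

```python
# Minimal sketch: query a vLLM OpenAI-compatible endpoint.
# Assumes a server is already running (e.g. started with `vllm serve <model>`
# on port 8000); the base_url and model name are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="ibm-granite/granite-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Summarize vLLM in one sentence."}],
)
print(response.choices[0].message.content)
```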

Red Hat AI Inference is included in Red Hat OpenShift AI and Red Hat Enterprise Linux AI and supported on Red Hat OpenShift and Red Hat Enterprise Linux.

Try Red Hat AI Inference

Explore related resources

Red Hat OpenShift AI

A cloud service that gives data scientists and developers a powerful AI/ML...

Image mode for Red Hat Enterprise Linux

Use container technologies to build, deploy and manage your operating system...

Red Hat Enterprise Linux

A stable, proven foundation that’s versatile enough for rolling out new...