Red Hat Developer Blog

Article

Apr 05, 2025

Llama 4 herd is here with Day 0 inference support in vLLM

vLLM team at Red Hat

Discover the new Llama 4 Scout and Llama 4 Maverick models from Meta, with...

Article

Apr 05, 2025

Async-GRPO: Open, fast, and performant

Aldo Pareja +1

Discover Async-GRPO, a new library for reinforcement learning tasks that...

Article

Apr 04, 2025

Sculpting subspaces: How we solved continual learning in LLMs

Nikhil Shivakumar Nayak +10

Discover how the adaptive SVD approach enables LLMs to continually learn and...

Article

Apr 03, 2025

How to navigate LLM model names

Trevor Royer

Learning the naming conventions of large language models (LLMs) helps users...

Article

Apr 03, 2025

Simplify AI data integration with RamaLama and RAG

Daniel Walsh

Explore how RamaLama makes it easier to share data with AI models using...

Article

Apr 02, 2025

A practical guide to Llama Stack for Node.js developers

Michael Dawson

Explore how to run tools with Node.js using Llama Stack's completions API,...

Article

Apr 01, 2025

Enable 3.5 times faster vision language models with quantization

Shubhra Pandit +4

Learn how quantized vision-language models enable faster inference, lower...

Article

Featured image for Red Hat OpenShift AI.

Mar 26, 2025

From tuning to serving: How open source powers the LLM life cycle

Junpei Ishikawa

This article demonstrates how to fine-tune LLMs in a distributed environment...

Article

Featured blog image with the following text: vLLM and DeepSeek

Mar 19, 2025

How we optimized vLLM for DeepSeek-R1

Michael Goin +4

Explore inference performance improvements that help vLLM serve DeepSeek AI...

Report a website issue

Red Hat Developer Sandbox

Programming languages & frameworks

System design & architecture

Developer experience

Automated data processing

Platform engineering

Secure development & architectures

E-books

Cheat sheets

Documentation

View all blogs & articles

Llama 4 herd is here with Day 0 inference support in vLLM

Async-GRPO: Open, fast, and performant

Sculpting subspaces: How we solved continual learning in LLMs

How to navigate LLM model names

Simplify AI data integration with RamaLama and RAG

A practical guide to Llama Stack for Node.js developers

Enable 3.5 times faster vision language models with quantization

From tuning to serving: How open source powers the LLM life cycle

How we optimized vLLM for DeepSeek-R1

Featured Authors

Cedric Clyburn

Michael Dawson

Don Schenck

Andrew Azores

Platforms

Build

Quicklinks

Communicate

RED HAT DEVELOPER

Red Hat legal and privacy links

Red Hat legal and privacy links

Report a website issue