Addie Stevens's contributions
Article
Building domain-specific LLMs with synthetic data and SDG Hub
Shivchander Sudalairaj +2
Use the open source SDG Hub to quickly create custom synthetic data pipelines. Train and evaluate your models faster and more efficiently.
Article
Introduction to distributed inference with llm-d
Christopher Nuland +1
Learn how the llm-d project enables distributed, efficient, and scalable LLM model serving across Kubernetes clusters.
Article
Post-training methods for language models
Mustafa Eyceoz +1
Dive into LLM post-training methods, from supervised fine-tuning and continual learning to parameter-efficient and reinforcement learning approaches.
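The article surveys several approaches; as one concrete illustration of the parameter-efficient family, here is a minimal LoRA setup sketched with Hugging Face's peft library. The model choice and hyperparameters below are illustrative, not taken from the article.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load a small base model (illustrative choice, not from the article).
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# LoRA freezes the base weights and injects small trainable low-rank
# adapters into selected projection layers.
config = LoraConfig(
    r=8,                                  # rank of the adapter matrices
    lora_alpha=16,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # prints the small trainable fraction
```

From here the wrapped model trains with any standard Hugging Face training loop; only the adapter weights receive gradients.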
Article
vLLM with torch.compile: Efficient LLM inference on PyTorch
Luka Govedič +5
Learn how to optimize PyTorch code with minimal effort using torch.compile, a just-in-time compiler that generates optimized kernels automatically.
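As a minimal illustration of the pattern the article covers, wrapping a model in torch.compile is a one-line change; the toy model and shapes below are placeholders.

```python
import torch
import torch.nn as nn

# Toy model; torch.compile accepts any nn.Module or plain Python function.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))

# Wrap the model; optimized kernels are generated just in time on the
# first call and reused on later calls with compatible input shapes.
compiled_model = torch.compile(model)

x = torch.randn(32, 128)
out = compiled_model(x)  # first call triggers tracing and compilation
```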
Article
Ollama or vLLM? How to choose the right LLM serving tool for your use case
Addie Stevens +2
Ollama makes it easy for developers to get started with local model experimentation, while vLLM provides a path to reliable, efficient, and scalable deployment.
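One practical consequence of this split: both tools expose an OpenAI-compatible HTTP API, so client code can stay the same while you swap the serving backend. A hedged sketch, assuming default local ports (11434 for Ollama, 8000 for vLLM) and the openai Python client; the model name is an example.

```python
from openai import OpenAI

# Both servers speak the OpenAI API, so only the base_url (and the model
# name registered with the server) differs between backends.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
# For a vLLM server, you would instead point at, e.g.:
# client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

resp = client.chat.completions.create(
    model="llama3.2",  # example model tag as registered with the server
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)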
Article
GuideLLM: Evaluate LLM deployments for real-world inference
Jenny Yi +2
Learn how to evaluate the performance of your LLM deployments with the open source GuideLLM toolkit to optimize cost, reliability, and user experience.
Article
Structured outputs in vLLM: Guiding AI responses
Michael Goin +2
Learn how to control the outputs of models served with vLLM using structured outputs. Discover how to define choice lists, JSON schemas, regex patterns, and more.
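As a sketch of the simplest case, a choice list: vLLM's OpenAI-compatible server accepts guided-decoding fields via the client's extra_body. The endpoint and model name below are assumptions for a local deployment, and the exact field names can vary across vLLM versions.

```python
from openai import OpenAI

# Point the OpenAI client at a running vLLM server (assumed local endpoint).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # whichever model the server loaded
    messages=[{"role": "user", "content": "Classify the sentiment: 'Great docs!'"}],
    # vLLM-specific extension: constrain decoding to one of the listed choices.
    extra_body={"guided_choice": ["positive", "negative", "neutral"]},
)
print(resp.choices[0].message.content)  # one of the three choices
```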
Article
LLM Compressor: Optimize LLMs for low-latency deployments
Kyle Sayers +3
LLM Compressor bridges the gap between model training and efficient deployment via quantization and sparsity, enabling cost-effective, low-latency inference.
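A minimal sketch following the pattern in llm-compressor's published quick-start: apply one-shot GPTQ quantization to a small model. Module paths and argument names may differ across library versions, and the model and calibration dataset choices are illustrative.

```python
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.transformers import oneshot

# Quantize all Linear layers to 4-bit weights (W4A16), leaving lm_head intact.
recipe = GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"])

oneshot(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # illustrative small model
    dataset="open_platypus",                     # calibration data
    recipe=recipe,
    output_dir="TinyLlama-1.1B-Chat-v1.0-W4A16",
    max_seq_length=2048,
    num_calibration_samples=512,
)
```

The saved checkpoint can then be loaded by vLLM for low-latency serving.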