Dipika Sikka
Dipika Sikka's contributions
Article
Accelerating large language models with NVFP4 quantization
Shubhra Pandit and 3 others
Learn about NVFP4, a 4-bit floating-point format for high-performance inference on modern GPUs that can deliver near-baseline accuracy at large scale.
Article
LLM Compressor 0.9.0: Attention quantization, MXFP4 support, and more
Kyle Sayers and 3 others
Explore the latest release of LLM Compressor, featuring attention quantization, MXFP4 support, AutoRound quantization modifier, and more.
Article
Run Mistral Large 3 & Ministral 3 on vLLM with Red Hat AI on Day 0: A step-by-step guide
Saša Zelenović and 6 others
Run the latest Mistral Large 3 and Ministral 3 models on vLLM with Red Hat AI, providing day 0 access for immediate experimentation and deployment.
Article
Speculators: Standardized, production-ready speculative decoding
Alexandre Marques and 7 others
Speculators standardizes speculative decoding for large language models, with a unified Hugging Face format, vLLM integration, and more.
Article
LLM Compressor 0.8.0: Extended support for Qwen3 and more
Dipika Sikka and 2 others
The LLM Compressor 0.8.0 release introduces quantization workflow enhancements, extended support for Qwen3 models, and improved accuracy recovery.
Article
LLM Compressor 0.7.0 release recap
Dipika Sikka and 3 others
LLM Compressor 0.7.0 brings Hadamard transforms for better accuracy, mixed-precision FP4/FP8, and calibration-free block quantization for efficient compression.
Article
Axolotl meets LLM Compressor: Fast, sparse, open
Rahul Tuli and 3 others
Discover how to deploy compressed, fine-tuned models for efficient inference with the new Axolotl and LLM Compressor integration.
Article
Optimize LLMs with LLM Compressor in Red Hat OpenShift AI
Brian Dellabetta and 1 other
Optimize model inference and reduce costs with model compression techniques like quantization and pruning, using LLM Compressor on Red Hat OpenShift AI.