Rahul Tuli
Rahul Tuli's contributions
Article
Speculators: Standardized, production-ready speculative decoding
Alexandre Marques
+7
Speculators standardizes speculative decoding for large language models, with a unified Hugging Face format, vLLM integration, and more.
Article
Axolotl meets LLM Compressor: Fast, sparse, open
Rahul Tuli
+3
Discover how to deploy compressed, fine-tuned models for efficient inference with the new Axolotl and LLM Compressor integration.
Article
LLM Compressor: Optimize LLMs for low-latency deployments
Kyle Sayers
+3
LLM Compressor bridges the gap between model training and efficient deployment via quantization and sparsity, enabling cost-effective, low-latency inference.