Addie Stevens's contributions
Article
LLM Compressor: Optimize LLMs for low-latency deployments
Kyle Sayers +3
LLM Compressor bridges the gap between model training and efficient deployment via quantization and sparsity, enabling cost-effective, low-latency inference.
Article
vLLM V1: Accelerating multimodal inference for large language models
Michael Goin +3
Explore how vLLM's new multimodal AI inference capabilities enhance performance, scalability, and flexibility across diverse hardware platforms.