Kyle Sayers
Kyle Sayers's contributions

Article: LLM Compressor 0.7.0 release recap
By Dipika Sikka and 3 others
LLM Compressor 0.7.0 brings Hadamard transforms for better accuracy, mixed-precision FP4/FP8, and calibration-free block quantization for efficient compression.

Article: LLM Compressor: Optimize LLMs for low-latency deployments
By Kyle Sayers and 3 others
LLM Compressor bridges the gap between model training and efficient deployment via quantization and sparsity, enabling cost-effective, low-latency inference.

Article: Multimodal model quantization support through LLM Compressor
By Kyle Sayers and 3 others
Explore multimodal model quantization in LLM Compressor, a unified library for optimizing models for deployment with vLLM.