Robert Shaw
Robert Shaw's contributions

Article
How we optimized vLLM for DeepSeek-R1
Michael Goin and 4 others
This technical deep dive explores inference performance improvements that help vLLM serve DeepSeek AI models more efficiently.

Article
LLM Compressor is here: Faster inference with vLLM
Robert Shaw and 3 others
Discover LLM Compressor, a unified library for creating accurate compressed models for cheaper and faster inference with vLLM.

Article
Sparse fine-tuning for accelerating large language models with DeepSparse
Robert Shaw and 1 other
Sparse fine-tuning, combined with sparsity-aware inference software such as DeepSparse, unlocks ubiquitous CPU hardware as a deployment target for LLM inference.

Article
SparseGPT: Remove 100 billion parameters for free
Robert Shaw and 1 other
Compress large language models (LLMs) with SparseGPT to make your machine learning inference fast and efficient. Prune in one shot with minimal accuracy loss.