Michael Goin

Michael Goin's contributions

Article

5 steps to triage vLLM performance

David Whyte-Gray +3

Learn how to improve the performance of your vLLM deployments with a diagnostic workflow that isolates latency issues and server saturation. Discover the key metrics to monitor and techniques to alleviate memory pressure.

Article

How we optimized vLLM for DeepSeek-R1

Michael Goin +4

Explore inference performance improvements that help vLLM serve DeepSeek AI models more efficiently in this technical deep dive.

Article

Distributed inference with vLLM

Michael Goin

Explore how distributed inference works within vLLM in this recap of Neural Magic's vLLM Office Hours with Michael Goin and Murali Andoorveedu, a vLLM committer from CentML.