Saša Zelenović's contributions
Article
DeepSeek-V3.2-Exp on vLLM, Day 0: Sparse Attention for long-context inference, ready for experimentation today with Red Hat AI
By Saša Zelenović and 3 co-authors
DeepSeek-V3.2-Exp brings major long-context efficiency to vLLM on Day 0 and deploys easily on the latest leading hardware and Red Hat AI platforms.
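For readers who want to experiment right away, the sketch below shows the general shape of serving the model through vLLM's offline Python API. It is a minimal sketch, not the article's exact recipe: it assumes a recent vLLM build with Day-0 DeepSeek-V3.2-Exp support, the public deepseek-ai/DeepSeek-V3.2-Exp model ID, and a multi-GPU node sized for the model.

```python
# Minimal sketch: offline inference with vLLM's Python API.
# Assumes a vLLM build with DeepSeek-V3.2-Exp support and a multi-GPU
# node; tensor_parallel_size=8 is an illustrative value, not a requirement.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-V3.2-Exp",
    tensor_parallel_size=8,   # shard the model across 8 GPUs
    trust_remote_code=True,   # DeepSeek models ship custom modeling code
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Summarize the key ideas of sparse attention."], params)
print(outputs[0].outputs[0].text)
```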
Article
vLLM with torch.compile: Efficient LLM inference on PyTorch
By Luka Govedič and 5 co-authors
Learn how to optimize PyTorch code with minimal effort using torch.compile, a just-in-time compiler that generates optimized kernels automatically.
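As a taste of the article's topic, here is a minimal sketch of the torch.compile workflow: wrapping a plain function (or an nn.Module) hands it to the just-in-time compiler, which traces it on first call and emits fused kernels. The function and shapes below are illustrative, not taken from the article.

```python
# Minimal sketch of torch.compile; the function below is illustrative.
import torch

def gelu_mul(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # Two pointwise ops that the compiler can fuse into a single kernel.
    return torch.nn.functional.gelu(x) * y

compiled = torch.compile(gelu_mul)  # compilation is lazy

x = torch.randn(1024, 1024)
y = torch.randn(1024, 1024)
out = compiled(x, y)  # first call triggers tracing and kernel generation
print(out.shape)
```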
Article
Ollama or vLLM? How to choose the right LLM serving tool for your use case
By Addie Stevens and 2 co-authors
Ollama makes it easy for developers to get started with local model experimentation, while vLLM provides a path to reliable, efficient, and scalable deployment.
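One practical consequence worth noting: both tools can expose an OpenAI-compatible HTTP API, so client code written against one can usually move to the other unchanged. The sketch below assumes the common default local ports (11434 for Ollama, 8000 for vLLM's OpenAI-compatible server) and uses illustrative model names; adjust both for your setup.

```python
# Minimal sketch: the same OpenAI-style client works against Ollama and
# vLLM. Ports and model names are common defaults/assumptions, not fixed.
from openai import OpenAI

ollama = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
vllm = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

for client, model in [
    (ollama, "llama3.2"),                        # an Ollama model tag
    (vllm, "meta-llama/Llama-3.2-3B-Instruct"),  # a Hugging Face model ID
]:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
    )
    print(resp.choices[0].message.content)
```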
