Carlos Condado's contributions
Article
From local prototype to enterprise production: Private speech transcription with Whisper and Red Hat AI
Carlos Condado
Learn how to run OpenAI's Whisper model through vLLM on Apple Silicon, giving you an OpenAI-compatible endpoint on localhost. Then, discover how to take this architecture into production using Red Hat AI Inference Server.
Article
Ollama or vLLM? How to choose the right LLM serving tool for your use case
Addie Stevens
Ollama makes it easy for developers to get started with local model experimentation, while vLLM provides a path to reliable, efficient, and scalable deployment.