In my opinion, one of the best use cases for AI is audio transcription. As a wordsmith by nature, I'm frequently disappointed by generative AI, but I find AI inference extremely useful. I consider it the missing component between the input you provide and the input you actually mean to provide. This is especially useful for speech recordings, where background noise, microphone dynamics, or poor compression can distort words. AI inference can work out the most probable meaning of what is otherwise difficult to hear.
However, audio transcription as a service presents a potential privacy risk because it requires sending your audio file to an external server. I will demonstrate how, with just a few Python and Git commands, you can easily run a local audio transcription application, powered by an open source model from Red Hat AI. Once installed, you can use it without an Internet connection because it's entirely local, and your audio never leaves your computer.
How to set up and run the application
Open a terminal, and follow along.
First, install uv. The uv application is a Python package manager, similar to the Python pip module, but with many additional features.
Download the install script as follows:
$ curl -LsSf https://astral.sh/uv/install.sh -O
Review the script and then run it:
$ bash ./install.sh
Create a virtual environment
Create a Python virtual environment for your work.
$ uv venv --seed whisper-example
Next, install the Whisper application:
$ uv pip install openai-whisper
Install the HuggingFace repository tool to make it easy to obtain new AI models:
$ uv tool install hf
Download the model
Red Hat has tested and validated the RedHatAI/whisper-large-v3-turbo-FP8-dynamic model for performance and accuracy. This is one of the models you can run on the Red Hat AI Inference Server, which provides a supported open source solution that allows you to deploy your AI models on a variety of hardware and AI accelerators to match your specific infrastructure needs.
You can download the model from its HuggingFace repository using the hf tool:
$ hf download RedHatAI/whisper-large-v3-turbo-FP8-dynamic
The model size is about 1 GB. When the download is complete, the tool prints the location of the model:
Download complete: : 0.00B [00:00, ?B/s]
/home/tux/.cache/huggingface/hub/models--RedHatAI--whisper-large-v3-turbo-FP8-dynamic/snapshots/e72a6dca29d039a5c9ea13e622e496ca61e85c34
Take note of the model location for the next step.
Transcribe the audio
Now that you have installed Whisper and a Red Hat AI model, you can transcribe an audio file. Keep in mind that you must activate your Python virtual environment before using this install of Whisper. If the environment isn't active in your current terminal session, activate it first:
$ source whisper-example/bin/activate
Assuming you have an audio recording called example.flac, you can transcribe it by providing Whisper with the base path to the Red Hat model and the path to the audio file:
$ whisper --model_dir ~/.cache/huggingface/hub/models--RedHatAI--whisper-large-v3-turbo-FP8-dynamic ~/example.flac
Example output:
Detecting language using up to the first 30 seconds. Use `--language` to specify the language
Detected language: English
[00:00.000 --> 00:10.200] This is a test of the Whisper and Red Hat AI combination running on my Red Hat Enterprise Linux laptop.
Next steps
Open source AI allows you to keep your data and computing local. Using familiar tools on Red Hat Enterprise Linux and Fedora Linux, you can implement your own in-house audio transcription service. If you're a Python programmer, you can even use the Red Hat models with your applications.
Visit Red Hat AI on HuggingFace for more information about the available models. Check out the Red Hat AI Inference Server page to learn how you can deploy AI-powered applications.