Build more secure, optimized AI supply chains with Fromager

Editor's note

This article was adapted from a post originally published on Medium and is republished here with the author's permission.

In December 2022, a malicious package called torchtriton appeared on PyPI. It used dependency confusion, a technique in which a public package impersonates a private one, to silently exfiltrate data from thousands of PyTorch users.

In 2024, the official ultralytics package was compromised, turning a trusted computer vision library into a cryptominer.

For teams operating datacenters with thousands of GPUs running AI workloads, incidents like these are not an acceptable risk. A single compromised dependency can expose an entire fleet.

How pip install creates security and compatibility risks

Most Python users install prebuilt binary wheels from PyPI without a second thought. It's fast and convenient, and it usually works. However, these binaries are opaque artifacts. You can't easily verify what was compiled into them, which compiler flags were used, or whether the source was tampered with before the build.

AI and machine learning (ML) stacks face more challenges with this model. Libraries like PyTorch, vLLM, and TensorFlow have deep dependency trees with native extensions compiled against specific CUDA, ROCm, or CPU instruction sets. Installing the wrong combination leads to subtle application binary interface (ABI) mismatches. These mismatches might not lead to a crash, but they can produce silent numerical errors, performance issues, or sporadic segfaults in production.
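To make the compatibility problem concrete, the following minimal sketch splits a wheel file name into the tags that installers use to decide whether a binary fits the current environment. The file name is illustrative, and `parse_wheel_filename` is a hypothetical helper written for this article, not part of pip or Fromager. Note what the tags cover (Python version, Python ABI, and platform) and what they leave out: there is no tag for the CUDA or ROCm toolkit a native extension was compiled against.

```python
from typing import NamedTuple, Optional


class WheelTags(NamedTuple):
    name: str
    version: str
    build: Optional[str]
    python: str
    abi: str
    platform: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into its dash-separated components.

    Wheel file names follow the pattern
    name-version[-build]-python-abi-platform.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python, abi, platform = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python, abi, platform = parts
    else:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    return WheelTags(name, version, build, python, abi, platform)


# Illustrative file name. The accelerator hint lives only in the
# version's local label ("+cu121"), not in any compatibility tag.
tags = parse_wheel_filename("torch-2.3.0+cu121-cp311-cp311-linux_x86_64.whl")
print(tags)
```

Because an installer matches only on the Python, ABI, and platform tags, two wheels built against different accelerator toolkits can both look compatible with the same machine.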

As shown in Figure 1, package dependencies form a complex graph that requires a coherent and secure build process.

Figure 1: A secure pipeline organizes fragmented software dependencies into a verified and integrated build.

How Fromager helps protect Python dependencies

Fromager is an open source project designed for platform teams who build and distribute Python wheels at scale. Named after the French word for "cheese maker" (continuing Python's long tradition of cheese-themed packaging tools), Fromager takes a different approach to dependency management.

Instead of downloading prebuilt binaries, Fromager rebuilds your entire dependency tree from source. This includes your application's runtime dependencies and the build tools themselves. You can trace every binary in the final output back to inspectable source code.

How Fromager works

Fromager operates in two modes: bootstrap and build.

bootstrap mode discovers your complete dependency graph, including both build-time and runtime dependencies, resolves versions, and builds everything from the bottom up. The output is a collection of built wheels, a dependency graph, and a deterministic build order.

build mode uses that build order to reproduce the exact same collection. Same inputs, same outputs.

# Discover and build the full dependency tree
fromager bootstrap -r requirements.txt

# Reproduce the build later from the recorded build order
fromager build-sequence work-dir/build-order.json
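The idea of a bottom-up, deterministic build order can be sketched with Python's standard library. This is an illustration of the concept only, not Fromager's implementation. The `requirements` mapping is a hypothetical toy graph in which each package lists what must be built before it.

```python
from graphlib import TopologicalSorter

# Hypothetical toy graph: each package maps to the packages that must
# exist before it can be built (build-time and runtime dependencies).
requirements = {
    "myapp": {"torch", "numpy"},
    "torch": {"numpy", "setuptools"},
    "numpy": {"setuptools"},
    "setuptools": set(),
}


def build_order(graph):
    """Return a bottom-up build order that is stable across runs."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    order = []
    while sorter.is_active():
        # Sort each batch of ready packages so ties always break the same way.
        ready = sorted(sorter.get_ready())
        order.extend(ready)
        sorter.done(*ready)
    return order


order = build_order(requirements)
print(order)  # ['setuptools', 'numpy', 'torch', 'myapp']
```

Sorting each batch of ready packages is what makes the result repeatable: the same graph always yields the same sequence, which is the property a recorded build order relies on.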

Network isolation

Fromager provides hermetic network-isolated builds to harden your supply chain. By using Linux network namespaces, Fromager cuts off all network access during compilation. Only localhost is reachable.

This prevents a compromised setup.py from downloading additional payloads. A malicious build hook can't exfiltrate secrets.

fromager --network-isolation bootstrap torch

This feature could have prevented the ultralytics attack, where the compromised build step downloaded and executed external code.
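As a rough sketch of the underlying mechanism, the helper below wraps a build command with the `unshare` utility from util-linux so that it runs in a fresh network namespace. This is an assumption-laden illustration, not Fromager's actual implementation: `isolated_command` is a hypothetical name, and the sketch assumes a Linux host with unprivileged user namespaces enabled and the `ip` tool from iproute2 installed.

```python
import shlex


def isolated_command(build_cmd: list[str]) -> list[str]:
    """Wrap build_cmd so that it runs in a new network namespace.

    A new network namespace contains only a loopback interface, and that
    interface starts out down. The wrapper brings it up so localhost is
    reachable, then replaces the shell with the real build command.
    """
    inner = "ip link set lo up && exec " + shlex.join(build_cmd)
    # --net creates the network namespace; --map-root-user maps the
    # current user to root inside a new user namespace so that
    # configuring the loopback interface is permitted.
    return ["unshare", "--net", "--map-root-user", "sh", "-c", inner]


cmd = isolated_command(["python", "-m", "build", "--wheel"])
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run` on a suitable Linux host would start the build with no route to the outside world, so any attempt by the build to reach an external host fails.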

Building collections, not packages

Instead of treating dependencies as isolated artifacts, Fromager manages them as a verifiable map called a directed acyclic graph (DAG). It rebuilds complete dependency trees for both build time and runtime using only the source code.

Fromager differs from other packaging tools in one key area. It doesn't build packages individually; instead, it builds them as collections.

The resulting graph.json file is a complete map of every piece of software involved, how the pieces connect, and where they originated. This allows you to verify and reproduce the entire build consistently.
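One practical use of such a map is answering the question "why is this package in my build?" The sketch below walks a small graph to list every dependency path that leads to a given package. The `edges` structure is a simplified, hypothetical stand-in chosen for illustration. It does not reproduce the actual graph.json schema.

```python
# Hypothetical, simplified graph: each package lists (edge kind, child).
# "install" marks a runtime dependency, "build-system" a build-time one.
edges = {
    "myapp": [("install", "torch"), ("install", "numpy")],
    "torch": [("build-system", "setuptools"), ("install", "numpy")],
    "numpy": [("build-system", "setuptools")],
    "setuptools": [],
}


def paths_to(graph, start, target, trail=()):
    """Return every path from start to target (the graph must be acyclic)."""
    trail = trail + (start,)
    if start == target:
        return [trail]
    found = []
    for _kind, child in graph[start]:
        found.extend(paths_to(graph, child, target, trail))
    return found


paths = paths_to(edges, "myapp", "setuptools")
for path in paths:
    print(" -> ".join(path))
```

Here the walk reports three routes to setuptools, which shows that a build tool enters the collection through several packages even though the application never lists it directly.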

Improving integrity through auditable dependencies

This architecture provides supply chain verifiability. You can audit every package, including its source, version, and build tools, throughout the entire dependency tree. There's no uninspectable, prebuilt binary in the chain. This is critical for environments that require software integrity, provenance, and reproducibility, such as enterprise, government, and security-sensitive deployments.

Building collections also helps with ABI compatibility. Consider the PyTorch ecosystem. torch, torchvision, and torchaudio all share native extensions compiled against the same CUDA toolkit. If you build them separately at different times, there's no guarantee their ABIs will align. Fromager builds them together, in dependency order, ensuring binary compatibility across the entire stack.

Customize everything with plug-ins and overrides

AI/ML stacks are notoriously difficult to build. They require specific compiler flags, patched source code, and hardware-specific optimizations. Fromager's plug-in architecture lets you hook into every stage of the build lifecycle:

Download: Pull source code from private registries or Git repositories.
Prepare: Apply patches before the build.
Build: Customize compiler flags and environment variables.
Post-build: Run validation or inject metadata.
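The general pattern behind these stages, a default action per stage that a package can selectively replace, can be sketched as follows. Every name in this example (`STAGES`, `OVERRIDES`, `run_lifecycle`, `patch_torch`) is hypothetical and chosen for illustration. It is not Fromager's plug-in API, which is described in the project documentation.

```python
STAGES = ("download", "prepare", "build", "post_build")


def default_hook(stage):
    """Create the stand-in default action for one lifecycle stage."""
    def hook(package, log):
        log.append(f"{stage}:{package}:default")
    return hook


DEFAULTS = {stage: default_hook(stage) for stage in STAGES}


def patch_torch(package, log):
    """Hypothetical override that would apply local patches before building."""
    log.append(f"prepare:{package}:apply local patches")


# Per-package overrides. Stages without an entry fall back to the default.
OVERRIDES = {"torch": {"prepare": patch_torch}}


def run_lifecycle(package):
    """Run every stage in order, preferring a package-specific override."""
    log = []
    for stage in STAGES:
        hook = OVERRIDES.get(package, {}).get(stage, DEFAULTS[stage])
        hook(package, log)
    return log


torch_log = run_lifecycle("torch")
numpy_log = run_lifecycle("numpy")
print(torch_log)
print(numpy_log)
```

The design choice worth noting is that an override is scoped to one package and one stage, so a difficult package can be customized without changing how every other package in the collection is built.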

It also has an override system, where a single configuration can target multiple hardware platforms. If you need the same stack built for CUDA, ROCm, and CPU, define variants:

# overrides/settings/torch.yaml
variants:
  cpu:
    env:
      USE_CUDA: "0"
  cuda:
    env:
      CUDA_HOME: "/usr/local/cuda"
  rocm:
    env:
      ROCM_PATH: "/opt/rocm"

Then build each variant with a single flag:

fromager --variant cuda bootstrap -r requirements.txt
fromager --variant rocm bootstrap -r requirements.txt
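Conceptually, selecting a variant means layering that variant's environment variables on top of a base environment before the build starts. The sketch below shows that idea using the values from the settings file above. The `build_env` function is a hypothetical helper, and the details of how Fromager itself merges settings may differ.

```python
# The variant settings from the YAML above, written as a Python dict.
VARIANTS = {
    "cpu": {"env": {"USE_CUDA": "0"}},
    "cuda": {"env": {"CUDA_HOME": "/usr/local/cuda"}},
    "rocm": {"env": {"ROCM_PATH": "/opt/rocm"}},
}


def build_env(variant, base):
    """Return a copy of base with the variant's variables layered on top."""
    if variant not in VARIANTS:
        raise KeyError(f"unknown variant: {variant!r}")
    env = dict(base)  # copy so the caller's mapping is left untouched
    env.update(VARIANTS[variant]["env"])
    return env


base = {"PATH": "/usr/bin"}
cuda_env = build_env("cuda", base)
print(cuda_env)  # {'PATH': '/usr/bin', 'CUDA_HOME': '/usr/local/cuda'}
```

Keeping the base environment unchanged means the same process can prepare environments for several variants in turn without one variant's settings leaking into the next.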

Taking ownership of your AI supply chain

The AI ecosystem is evolving rapidly. New models, frameworks, and hardware appear constantly. However, speed without security is reckless.

Organizations deploying AI at scale face a choice: trust a supply chain they can't verify, or take ownership of it. Fromager makes the second option practical.

At Red Hat, Fromager is the core wheel-building engine that helps ensure a verifiable, more secure supply chain for Red Hat AI products.

Fromager is an open source project under active development. If you're building Python wheels for AI/ML workloads, we'd love your feedback and contributions.

GitHub: python-wheel-build/fromager
Documentation: fromager.readthedocs.io
