How to run AI models in cloud development environments

This learning path explores running AI models, specifically Large Language Models (LLMs), in cloud development environments to enhance developer efficiency and data security. It introduces RamaLama, an open-source tool launched in mid-2024, designed to simplify AI workflows by integrating with container technologies.

Prerequisites:

  • Access to the Developer Sandbox.

In this lesson, you will:

  • Learn how a devfile sets up a small, dedicated container to run an AI model within your cloud development workspace.

What is a devfile?

From your perspective, the experience feels seamless. But behind the scenes, your cloud development environment is assembling several components to make the interaction with your AI assistant work. It configures the large language model (LLM) you’re engaging with via the Continue extension to run in a sidecar container within your cloud workspace pod.

This configuration takes the form of a devfile, an open standard that defines containerized development environments in a YAML-formatted text file. By leveraging a devfile, your system ensures consistency, portability, and streamlined deployment of the AI assistant within your cloud-based environment.
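For orientation, the components shown in the next snippet live under the top-level components key of a devfile. A minimal skeleton, with an illustrative schema version and placeholder names that are not part of this lesson, might look like this:

schemaVersion: 2.2.0              # devfile 2.x schema; exact version is illustrative
metadata:
  name: ai-assistant-workspace    # placeholder workspace name
components:
  - name: tools                   # placeholder developer tooling container
    container:
      image: quay.io/devfile/universal-developer-image:latest   # placeholder image
      memoryLimit: 4Gi
  # ...model-serving components, such as the ramalama container below, also go here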

In the devfile, you provide the configuration for your cloud workspace and for the model, as shown in the code snippet:

components:
  - name: ramalama
    attributes:
      container-overrides:
        resources:
          limits:
            cpu: 4000m
            memory: 12Gi
          requests:
            cpu: 1000m
            memory: 8Gi
    container:
      image: quay.io/ramalama/ramalama:0.7
      args:
        - "ramalama"
        - "--store"
        - "/models"
        - "serve"
        - "--network=none"
        - "ollama://granite-code:latest"
      mountSources: true
      sourceMapping: /.ramalama
      volumeMounts:
        - name: ramalama-models
          path: /models
      endpoints:
        - exposure: public
          name: ramalamaserve
          protocol: http
          targetPort: 8080
  - name: ramalama-models
    volume:
      size: 5Gi

In the devfile, we added a container for serving the IBM Granite Code LLM using RamaLama. We used the official RamaLama container image and specified the CPU and memory requests and limits for the container. RamaLama is set up to use an alternate directory, mounted as a volume, as its model store. This ensures that models are managed efficiently while maintaining isolation and security within the containerized environment.
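Put together, the args in the snippet are equivalent to running the following command inside the sidecar container:

# Serve the Granite Code model from the /models store with networking disabled
ramalama --store /models serve --network=none ollama://granite-code:latest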

The ramalama serve command exposes the AI model through a REST API, making it available on port 8080 for inference requests. To ensure the model operates in a fully air-gapped environment, we explicitly set --network=none, which prevents the model from reaching any external network.
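As a quick check, you can send an inference request to the endpoint from a terminal inside the workspace. The sketch below assumes the RamaLama backend exposes an OpenAI-compatible /v1/chat/completions route (as its llama.cpp-based server does); the host, model name, and prompt are illustrative and may need adjusting for your setup:

# Hypothetical request from inside the workspace pod
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "granite-code",
        "messages": [
          {"role": "user", "content": "Write a function that reverses a string."}
        ]
      }'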

Summary

In this learning path, we learned about RamaLama and how to use it in a cloud development environment with the help of a devfile and Red Hat OpenShift Dev Spaces. We also discussed how you can deploy large language models within your internal infrastructure, ensuring secure development while maintaining full control over your data, free from external dependencies.

If you want to learn more about it, please visit these links:
