Understanding the Devfile

In this lesson, you will learn:
- How a devfile sets up a small, dedicated container to run an AI model within your cloud development workspace.
What is a devfile?
The interaction feels seamless. But behind the scenes, your cloud development environment assembles several components to make that conversation with your AI assistant possible. It configures the large language model (LLM) you interact with through the Continue extension to run in a sidecar container within your cloud workspace pod.
This configuration takes the form of a devfile, an open standard that defines containerized development environments in a YAML-formatted text file. Using a devfile ensures consistency, portability, and streamlined deployment of the AI assistant within your cloud-based environment.
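If you have not worked with one before, a devfile's top-level layout looks roughly like this minimal sketch. The schema version and workspace name here are illustrative placeholders, not values taken from this learning path:

schemaVersion: 2.2.0
metadata:
  name: my-ai-workspace        # placeholder workspace name
components:
  # Containers, volumes, and sidecars for the workspace are declared here,
  # including the RamaLama model server shown in the next snippet.
  - name: ...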
In the devfile, you provide the configuration for your cloud workspace and for the model, as shown in the code snippet:
- name: ramalama
  attributes:
    container-overrides:
      resources:
        limits:
          cpu: 4000m
          memory: 12Gi
        requests:
          cpu: 1000m
          memory: 8Gi
  container:
    image: quay.io/ramalama/ramalama:0.7
    args:
      - "ramalama"
      - "--store"
      - "/models"
      - "serve"
      - "--network=none"
      - "ollama://granite-code:latest"
    mountSources: true
    sourceMapping: /.ramalama
    volumeMounts:
      - name: ramalama-models
        path: /models
    endpoints:
      - exposure: public
        name: ramalamaserve
        protocol: http
        targetPort: 8080
- name: ramalama-models
  volume:
    size: 5Gi
In this devfile, we added a container for serving the IBM Granite-Code LLM with RamaLama. We used the official RamaLama container image and specified the container's CPU and memory constraints. RamaLama is configured to use an alternate directory as its model store, which is mounted as a volume. This ensures that models are managed efficiently while maintaining isolation and security within the containerized environment.
The ramalama serve command runs AI models behind a REST API, making them accessible on port 8080 for inference requests. To ensure the model operates in a fully air-gapped environment, we explicitly set --network=none, which cuts the model container off from any external network.
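To make the serving setup concrete, here is a minimal sketch of an inference request sent from inside the workspace. It assumes the RamaLama runtime exposes an OpenAI-compatible chat completions endpoint on port 8080, as llama.cpp-based servers commonly do; the endpoint path and model name are assumptions you should verify against your RamaLama version:

import json
import urllib.request

# Hypothetical example: query the model served by RamaLama in the sidecar
# container. The endpoint path assumes an OpenAI-compatible API; adjust it
# to match your serving runtime.
URL = "http://localhost:8080/v1/chat/completions"

payload = {
    "model": "granite-code",  # assumed model name; match your setup
    "messages": [
        {"role": "user", "content": "Write a Python function that reverses a string."}
    ],
}

request = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Send the POST request and parse the JSON response.
with urllib.request.urlopen(request) as response:
    result = json.load(response)

# The assistant's reply is in the first choice of the response.
print(result["choices"][0]["message"]["content"])

Because the endpoint is declared as public in the devfile, tools such as the Continue extension can reach the same port through the route that OpenShift Dev Spaces exposes for the workspace.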
Summary
In this learning path, you learned about RamaLama and how to use it in a cloud development environment with the help of a devfile and OpenShift Dev Spaces. We also discussed how you can deploy large language models within your internal infrastructure, ensuring secure development while maintaining full control over your data, free from external dependencies.
If you want to learn more, visit these links: