How to run AI models in cloud development environments

This learning path explores running AI models, specifically Large Language Models (LLMs), in cloud development environments to enhance developer efficiency and data security. It introduces RamaLama, an open-source tool launched in mid-2024, designed to simplify AI workflows by integrating with container technologies.

Access the Developer Sandbox

Prerequisites:

In this lesson, you will:

  • Create a workspace in Red Hat OpenShift Dev Spaces
  • Use the Continue extension to automatically connect to the IBM Granite Large Language Model (LLM)

For this learning path, we will be using OpenShift Dev Spaces to run an AI model in a cloud development environment. It offers ready-to-use development environments based on containers in Red Hat OpenShift.

With OpenShift Dev Spaces, all you need is a device with a web browser to write, build, test, and run code directly on OpenShift. Both developers and administrators can customize their Dev Spaces environments, as well as other web-based integrated development environments (IDEs), to suit their workflows.
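Each of these container-based environments is described declaratively by a devfile in the Git repository. As a rough sketch only (the actual devfile in the activity repository differs, and the image and memory limit below are illustrative assumptions), a workspace definition looks like this:

```yaml
# Hypothetical minimal devfile; the real one in the activity repository
# also wires up the RamaLama model server and the Continue extension.
schemaVersion: 2.2.0
metadata:
  name: cde-ramalama-continue
components:
  - name: tools
    container:
      image: quay.io/devfile/universal-developer-image:latest  # illustrative image
      memoryLimit: 8Gi  # illustrative; serving an LLM needs generous memory
```

When you open a repository in Dev Spaces, this file tells OpenShift which container image to start and what resources the workspace needs.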


Deploy Granite

To deploy IBM Granite as a private AI coding assistant in OpenShift Dev Spaces using the Continue extension, follow these steps:

  1. Go to the Developer Sandbox, which requires creating a Red Hat account. Once you’ve created an account, you should be able to access OpenShift Dev Spaces.
  2. To access OpenShift Dev Spaces, visit your user dashboard (Figure 1).

    The Red Hat OpenShift Dev Spaces user dashboard, showing options to create and manage workspaces.
    Figure 1. Red Hat OpenShift Dev Spaces user dashboard.
  3. In the user dashboard, go to the Create Workspace tab and enter the repository URL for this activity, https://github.com/redhat-developer-demos/cde-ramalama-continue, then select Create & Open to proceed (Figure 2).

    Git repository URL pasted in Import from Git text box in Red Hat OpenShift Dev Spaces user dashboard in order to import Git repository to launch a cloud development environment.
    Figure 2. Starting the cloud development environment from the GitHub URL.

Note

As the workspace starts up, you will be prompted to grant authorization to the GitHub OAuth app.
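The Create Workspace step above can also be expressed as a URL: Dev Spaces launches a workspace when the Git repository URL is appended after `#` on the dashboard host. A minimal sketch, assuming a hypothetical Dev Spaces hostname (use your sandbox's actual host):

```shell
# Build a Dev Spaces "factory" URL that launches a workspace from a Git repo.
# DEVSPACES_HOST is hypothetical; substitute your sandbox's Dev Spaces URL.
DEVSPACES_HOST="https://devspaces.apps.sandbox.example.com"
REPO_URL="https://github.com/redhat-developer-demos/cde-ramalama-continue"

# Opening this URL in a browser is equivalent to pasting the repository
# URL into the Import from Git text box and clicking Create & Open.
echo "${DEVSPACES_HOST}/#${REPO_URL}"
```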

  4. After the workspace initializes, you will be prompted to confirm whether you trust the authors of the files within it. To proceed, click Yes, I trust the authors (Figure 3).

    A pop-up asking the user to confirm trust in the authors of the files.
    Figure 3. Visual Studio Code - Open Source (Code - OSS) warning pop-up.
  5. Your workspace is automatically configured to install the Continue extension on startup. This is the extension we will use to connect to the LLM running in our cloud workspace environment (Figure 4).

    Editor interface highlighting the Continue extension icon on the left sidebar, indicating its role in connecting to AI models.
    Figure 4. The Continue extension enables connecting to AI models.
  6. Notice that your workspace notifies you about a process running in the background. This is the LLM configured to serve our queries whenever we connect the Continue extension to it (Figure 5).

    Editor interface showing a notification that RamaLama is running an AI model in the background.
    Figure 5. RamaLama running the AI model served in the background.
  7. Once the Continue extension is installed, open it by clicking the Continue icon in the left sidebar, then select the Remain local option. Your workspace already contains the configuration to connect to the LLM running locally (Figure 6).

    Editor interface with the Continue extension open, showing it connected to the IBM Granite model served by RamaLama.
    Figure 6. Continue extension connected to the IBM Granite model served by RamaLama.
  8. Once you're connected to the LLM, you will see a text box where you can issue prompts. Type a query in the chat box to see how your AI assistant responds (Figure 7).

    Editor interface with the Continue extension's chatbox, demonstrating a query being typed to request a data validation program.
    Figure 7. Using the Continue extension to write a simple data validation program.
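Behind the scenes, the Continue extension talks to the served model over HTTP: RamaLama's serve mode exposes an OpenAI-compatible API, so you could also query the model directly. A sketch of such a request, where the port, path, and model name are assumptions rather than values taken from this workspace's configuration:

```shell
# Hypothetical endpoint and model name; check the workspace's RamaLama
# configuration for the real values.
ENDPOINT="http://localhost:8080/v1/chat/completions"
BODY='{"model": "granite", "messages": [{"role": "user", "content": "Write a function that validates an email address."}]}'

# This is the same kind of chat-completion request Continue sends.
# The curl call is commented out so the sketch does not require a
# running model server:
# curl -s "${ENDPOINT}" -H "Content-Type: application/json" -d "${BODY}"
echo "POST ${ENDPOINT}"
```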

You can then issue more prompts to get results as per your requirements.

Previous resource
Introduction to RamaLama
Next resource
Understanding the Devfile