
A quick look at large language models with Node.js, Podman Desktop, and the Granite model

July 22, 2024
Michael Dawson

    We’ve been learning a lot about large language models (LLMs) and how they can be used with Node.js and JavaScript. If you want to follow what we’ve learned, you can check out this learning path that we put together on our journey so far: How to get started with large language models and Node.js.

    In our initial learning, we followed the most common path: using one of the most popular models from Hugging Face and running it with tools like Ollama and node-llama-cpp. However, not all models are created equal, and there are a number of aspects to consider when choosing a model. The article "Why trust open source AI" is a good introduction to several of them.

    In that context, we wanted to try out the Granite model and see what differences, if any, there were from the models we’d used previously. We also thought it would be a good opportunity to explore Podman AI Lab as another way to run models.

    Getting the Granite large language model running with Podman

    We started by installing Podman Desktop, which provides a nice GUI where you can start and manage containers. It supports Windows, macOS, and Linux, and downloads are available from podman-desktop.io/downloads.

    Once Podman Desktop was installed, we went to the Extensions page, searched for “Podman AI,” and installed the Podman AI Lab extension, as shown in Figure 1.

    Figure 1: Installing Podman AI Lab extension.

    Once the AI Lab extension was installed, we could see a new icon in the bar on the left, as shown in Figure 2.

    Figure 2: New Podman AI Lab icon.

    From that page we can go to the Catalog page, which allows us to download the Granite model along with many other popular models, as shown in Figure 3.

    Figure 3: Podman AI Lab catalog page with the models available for download.

    We used the download option on the far right-hand side to download the Granite model. This can take a bit of time, as the download is 3.8 GB in size. Once you request the model, it will begin to download, as shown in Figure 4.

    Figure 4: Downloading the Granite model.

    Models can also be downloaded from Hugging Face and imported, but it is nice that the Granite model and other popular models are easy to download without having to do that.

    Once the model is downloaded, Podman AI Lab allows us to easily serve it through an OpenAI-compatible endpoint using a service.

    Figure 5 shows the Services page.

    Figure 5: Podman AI Lab default service page.

    We then created a new service to serve the Granite model we had downloaded earlier, as shown in Figure 6.

    Figure 6: Creating a service to serve the Granite model.

    Once complete, we can open the service details to get the URL that we’ll need to access the model, as shown in Figure 7.

    Figure 7: Service details for the newly created service that serves the Granite model.

    In our case, the URL was http://localhost:36851/v1/chat/completions. We now had the Granite model being served by an endpoint available locally.
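
    Before wiring the endpoint into a larger application, it can be handy to smoke-test it directly from Node.js. The following is a minimal sketch, assuming the service is still listening on port 36851 and that you are running Node.js 18 or later (which has fetch built in); the model field is just a placeholder, since many local servers ignore it:

      // smoke-test.mjs: quick check of the Podman AI Lab service.
      // Assumes the service created above is listening on localhost:36851.
      const response = await fetch("http://localhost:36851/v1/chat/completions", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({
          model: "granite", // placeholder; many local servers ignore this field
          messages: [{ role: "user", content: "Say hello in one sentence." }],
        }),
      });

      const result = await response.json();
      console.log(result.choices[0].message.content);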

    Trying out the Granite large language model with Node.js

    Next we wanted to see how the Granite model worked with the Node.js and Langchain.js-based Retrieval Augmented Generation (RAG) example that we had experimented with earlier.

    We started by cloning the ai-experimentation repository:

      git clone https://github.com/mhdawson/ai-experimentation.git

    We then went into the lesson-3-4 directory and edited the file langchainjs-backends.mjs to point to the local URL on which the Granite model was being served by Podman AI Lab. This diff shows the changes we made:

      diff --git a/lesson-3-4/langchainjs-backends.mjs b/lesson-3-4/langchainjs-backends.mjs
      index dd71cb7..e6048aa 100644
      --- a/lesson-3-4/langchainjs-backends.mjs
      +++ b/lesson-3-4/langchainjs-backends.mjs
      @@ -42,9 +42,9 @@ console.log("Augmenting data loaded - " + new Date());
       ////////////////////////////////
       // GET THE MODEL
      -const model = await getModel('llama-cpp', 0.9);
      +//const model = await getModel('llama-cpp', 0.9);
       //const model = await getModel('openAI', 0.9);
      -//const model = await getModel('Openshift.ai', 0.9);
      +const model = await getModel('Openshift.ai', 0.9);
       ////////////////////////////////
      @@ -112,7 +112,7 @@ async function getModel(type, temperature) {
             { temperature: temperature,
               openAIApiKey: 'EMPTY',
               modelName: 'mistralai/Mistral-7B-Instruct-v0.2' },
      -      { baseURL: 'http://vllm.llm-hosting.svc.cluster.local:8000/v1' }
      +      { baseURL: 'http://localhost:36851/v1' }
           );
         };
         return model;

    If you had looked at langchainjs-backends.mjs earlier, you would have seen that it already supported switching between a model served through node-llama-cpp, OpenAI, or Red Hat OpenShift AI. Since the OpenShift AI option used an OpenAI-compatible endpoint, and the Podman AI Lab service also provides one, all we had to do was:

    • Switch the call to getModel() to use the "Openshift.ai" option.
    • Switch the baseURL configured for the OpenShift.ai option in getModel() to point to the base URL served by Podman AI Lab (a sketch of the resulting model setup follows the URL below). From the URL we shared earlier, that would be:

      http://localhost:36851/v1
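
    For reference, here is a sketch of what the "Openshift.ai" branch of getModel() effectively resolves to after our edits. This is not the repository's exact code; it assumes the ChatOpenAI class from the @langchain/openai package, where the base URL can be supplied through the configuration field:

      import { ChatOpenAI } from "@langchain/openai";

      // Sketch of the model setup after our edits (simplified).
      // The local service does not validate the API key, so any
      // non-empty value such as 'EMPTY' works.
      const model = new ChatOpenAI({
        temperature: 0.9,
        openAIApiKey: "EMPTY",
        modelName: "mistralai/Mistral-7B-Instruct-v0.2",
        configuration: { baseURL: "http://localhost:36851/v1" },
      });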

    With those changes, we then ran the application with:

      node langchainjs-backends.mjs

    From the output we can see the data for the Node.js reference architecture being loaded to support Retrieval Augmented Generation, the question being asked of the model, and the model responding with an answer influenced by the Node.js reference architecture:

      Loading and processing augmenting data - Wed Jul 03 2024 15:51:33 GMT-0400 (Eastern Daylight Saving Time)
      Unknown file type: cors-error.png
      Unknown file type: _category_.json
      Unknown file type: _category_.json
      Unknown file type: _category_.json
      Augmenting data loaded - Wed Jul 03 2024 15:51:44 GMT-0400 (Eastern Daylight Saving Time)
      Loading model - Wed Jul 03 2024 15:51:44 GMT-0400 (Eastern Daylight Saving Time)
      2024-07-03T19:51:44.341Z
      {
        input: 'Should I use npm to start a node.js application',
        chat_history: [],
        context: [
          Document {
            pageContent: '## avoiding using `npm` to start application\n' +
              '\n' +
              'While you will often see `CMD ["npm", "start"]` in docker files\n' +
              'used to build Node.js applications there are a number\n' +
              'of good reasons to avoid this:',
            metadata: [Object]
          },
          Document {
            pageContent: "- One less component. You generally don't need `npm` to start\n" +
              '  your application. If you avoid using it in the container\n' +
              '  then you will not be exposed to any security vulnerabilities\n' +
              '  that might exist in that component or its dependencies.\n' +
              '- One less process. Instead of running 2 process (npm and node)\n' +
              '  you will only run 1.\n' +
              '- There can be issues with signals and child processes. You\n' +
              '  can read more about that in the Node.js docker best practices',
            metadata: [Object]
          },
          Document {
            pageContent: '```\n' +
              '\n' +
              'It should be noted that users and organizations can modify how `npm init` works, tailoring the resulting package.json to their needs.  For more information on this, check out the [official docs](https://docs.npmjs.com/cli/v9/commands/npm-init)',
            metadata: [Object]
          },
          Document {
            pageContent: '* [Introduction to the Node.js reference architecture: Node Module Development](https://developers.redhat.com/articles/2023/02/22/installing-nodejs-modules-using-npm-registry)',
            metadata: [Object]
          }
        ],
        answer: 'It is generally recommended to avoid using `npm` to start a Node.js application. While it may be convenient to use the `CMD ["npm", "start"]` syntax in Dockerfiles, there are several reasons to consider using an alternative approach:\n' +
          '\n' +
          '- **Simplified deployment**: By not relying on `npm`, you can avoid potential security vulnerabilities and reduce the number of components required to run your application. Additionally, you will only need to manage one process instead of two.\n' +
          '- **Easier signal handling**: With a direct `node` command, you can better control signal handling and child processes, which can be especially important in production environments.\n' +
          '\n' +
          'However, it is essential to acknowledge that users and organizations can customize the `npm init` process to suit their specific needs. For more information on tailoring package.json, refer to the official [docs](https://docs.npmjs.com/cli/v9/commands/npm-init).\n' +
          '\n' +
          'In summary, while `npm` can be a convenient option, there are valid reasons to consider using an alternative approach for starting a Node.js application. The choice ultimately depends on your specific use case and requirements.'
      }
      2024-07-03T19:52:27.207Z

    Note that the run took a bit longer than past runs because we were not using GPU acceleration and were on a smaller machine.

    Just like past runs with the Mistral model, we get an answer (the part after answer:) telling us to avoid using the npm command to start Node.js applications. This is a different answer from the one we get without the additional context provided by the Node.js reference architecture.

    If you’ve not gone through the learning path and want to dive deeper into what a Node.js application using Langchain.js looks like, you can look through the code in langchainjs-backends.mjs. It was good to see that we could use the existing Langchain.js-based application with another method for serving a model (Podman AI Lab) and with a different model (Granite).
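
    At a high level, the application follows the usual Langchain.js retrieval chain pattern: embed the augmenting documents into a vector store, retrieve the most relevant ones for a question, and pass them to the model as context. Here is a condensed sketch of that flow, assuming current Langchain.js packages and an in-memory vector store; it is not the repository's exact code:

      import { ChatOpenAI } from "@langchain/openai";
      import { Document } from "@langchain/core/documents";
      import { ChatPromptTemplate } from "@langchain/core/prompts";
      import { MemoryVectorStore } from "langchain/vectorstores/memory";
      import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/hf_transformers";
      import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
      import { createRetrievalChain } from "langchain/chains/retrieval";

      // Granite model served locally by Podman AI Lab (see the earlier sketch).
      const model = new ChatOpenAI({
        temperature: 0.9,
        openAIApiKey: "EMPTY",
        configuration: { baseURL: "http://localhost:36851/v1" },
      });

      // In the real application the documents come from the Node.js reference
      // architecture; a single hard-coded document stands in for them here.
      const docs = [
        new Document({ pageContent: "Avoid using `npm` to start applications in containers." }),
      ];
      const vectorStore = await MemoryVectorStore.fromDocuments(
        docs,
        new HuggingFaceTransformersEmbeddings({ model: "Xenova/all-MiniLM-L6-v2" }),
      );

      // Combine the retrieved documents with the user's question.
      const prompt = ChatPromptTemplate.fromTemplate(
        "Answer based only on the following context:\n{context}\n\nQuestion: {input}",
      );
      const chain = await createRetrievalChain({
        retriever: vectorStore.asRetriever(),
        combineDocsChain: await createStuffDocumentsChain({ llm: model, prompt }),
      });

      // The result has the { input, context, answer } shape seen in the output above.
      const result = await chain.invoke({ input: "Should I use npm to start a node.js application" });
      console.log(result.answer);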

    Wrapping up

    As we mentioned in the introduction, we wanted to try out the Granite model and see what differences, if any, there were from the models we’d used previously, and to do that using Podman AI Lab as another way to run models.

    As you can see, the experience of using Podman AI Lab and the Granite model went well. It was easy to download and serve the model with Podman AI Lab, and the Granite model worked as expected with the Node.js and Langchain.js based application that implemented Retrieval Augmented Generation (RAG).

    If you want to learn more about Node.js and AI, you can check out AI & Node.js on Red Hat Developer.

    If you would like to learn more about what the Red Hat Node.js team is up to in general, you can check out the Node.js topic page and the Node.js reference architecture.

    Last updated: July 24, 2024
