Dive deeper into large language models and Node.js

Explore how to use large language models (LLMs) with Node.js, covering Ollama, LlamaIndex, function calling, and agents.

Let’s begin by exploring how we can utilize Ollama to run LLMs.

Prerequisites:

  • An environment where you can install and run Node.js.
  • An environment where you can install and run Ollama.
  • A Git client.

In this lesson, you will:

  • Install Node.js.
  • Install Ollama. 
  • Clone the ai-experimentation repository.
  • Run a simple query over a hosted Mistral LLM model.

Set up the environment

If you don’t already have Node.js installed, install it using one of the methods outlined on their download page.

Clone the ai-experimentation repository with the following command:

git clone https://github.com/mhdawson/ai-experimentation

Download and install Ollama following the download and install instructions. Supported platforms include macOS, Linux, and Windows.

An introduction to Ollama

In How to get started with large language models and Node.js, we ran an LLM locally using node-llama-cpp. Through the magic of Node.js add-ons and node-addon-api (which we help maintain), it loaded the LLM into the same process as the running Node.js application.

This was a fast and easy way to get started because it avoided installing and starting a separate application to run the LLM. However, in most cases, we don’t want an LLM running in each of our Node.js processes due to the potential memory usage, and also because of the additional time to load the LLM when we start our Node.js application.

Ollama is a tool that lets you easily spin up a process that serves an LLM over a TCP port. In addition, it provides a command-line tool to download LLMs. It supports Linux, Windows, and macOS, and it’s already set up to leverage a GPU if one is available.

Take a look at the Ollama help by running Ollama without any arguments:

ollama
Usage:
  ollama [flags]
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  ps          List running models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

Flags:
  -h, --help      help for ollama
  -v, --version   Show version information

Use "ollama [command] --help" for more information about a command.

To stick with an LLM similar to the one used in the first learning path, we will use Mistral. Run the following command to pull the default Mistral LLM. It may take a few minutes depending on your connection speed, since the LLM file is 4.1 GB:

ollama pull mistral

You should now be able to see the LLM in the list of LLMs available:

ollama list
NAME            ID              SIZE    MODIFIED
mistral:latest  2ae6f6dd7a3d    4.1 GB  About a minute ago

Depending on your operating system, Ollama may automatically start to serve LLMs after you pull an LLM. If not, start Ollama with: 

ollama serve

What’s nice about Ollama is that in addition to serving the LLM (by default, on localhost and port 11434), it also manages the LLMs so they stay in memory while in use and are unloaded when idle. You can see the running LLMs with ollama ps:

ollama ps
NAME    ID      SIZE    PROCESSOR       UNTIL

At the moment, there are no LLMs running because nothing is consuming them. In the next step, we will start a Node.js application that consumes one.
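
Before that, if you want a quick sanity check that the server is reachable, you can call Ollama's REST API directly from Node.js. The following is a minimal sketch (which you could drop into a standalone .mjs file) using Node's built-in fetch (Node.js 18 or later); it assumes Ollama is serving on the default 127.0.0.1:11434 and the prompt text is just an example:

// Quick sanity check: call Ollama's REST API directly with Node's built-in fetch.
// Assumes Ollama is serving on the default 127.0.0.1:11434 and that the
// mistral model has already been pulled.
const res = await fetch("http://127.0.0.1:11434/api/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "mistral",
    prompt: "Reply with a single short sentence.",
    stream: false, // ask for one JSON object instead of a stream of tokens
  }),
});
const body = await res.json();
console.log(body.response); // the generated text

If you run ollama ps again right after this, you should see the mistral model loaded; Ollama keeps it in memory for a short while and then unloads it.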

Run the basic Langchain.js example with Ollama

Before we move on to exploring other libraries like LlamaIndex.ts, we’ll first show that it's easy to switch to accessing an LLM served by Ollama. This is similar to how we switched a Langchain.js-based application between running with node-llama-cpp, OpenAI, and Red Hat OpenShift AI.

Start by changing into the lesson-5 subdirectory:

cd lesson-5

In that directory, you will find a file called langchainjs-ollama.mjs. If you went through the first learning path, you will recognize the getModel function, which we have extended with an option for Ollama:

  } else if (type === 'ollama') {
    ////////////////////////////////
    // Connect to ollama endpoint
    const { Ollama } = await import("@langchain/community/llms/ollama");
    model = new Ollama({
      baseUrl: "http://10.1.1.39:11434", // Default value
      model: "mistral", // Default value
    });
  };

You can see that we specify baseUrl as the address of the remote machine where we are running Ollama and ask for the Mistral LLM that we pulled in the earlier step. If you are running the examples on the same machine as Ollama, set the address to 127.0.0.1. Otherwise, use the address of the machine running Ollama. Update this anywhere in the code that refers to an IP endpoint for the Ollama engine.
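
If you would rather not edit the address in several places, one option is to read it from an environment variable. Here is a small sketch; the OLLAMA_BASE_URL variable name is our own choice for this example, not something the repository defines:

// Sketch: read the Ollama endpoint from an environment variable so the same
// code works locally and against a remote machine. OLLAMA_BASE_URL is a
// hypothetical name chosen for this example.
const { Ollama } = await import("@langchain/community/llms/ollama");

const model = new Ollama({
  baseUrl: process.env.OLLAMA_BASE_URL || "http://127.0.0.1:11434",
  model: "mistral",
});

You could then start the example with something like OLLAMA_BASE_URL=http://10.1.1.39:11434 node langchainjs-ollama.mjs, or omit the variable to use the local default.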

The application simply asks the LLM the question, “Should I use npm to start a Node.js application?”

import { ChatPromptTemplate } from "@langchain/core/prompts";
import path from "path";
import {fileURLToPath} from "url";

////////////////////////////////
// GET THE MODEL
const model = await getModel('ollama', 0.9);
//const model = await getModel('llama-cpp', 0.9);
//const model = await getModel('openAI', 0.9);
//const model = await getModel('Openshift.ai', 0.9);

////////////////////////////////
// CREATE CHAIN
const prompt =
  ChatPromptTemplate.fromTemplate(`Answer the following question if you don't know the answer say so:

Question: {input}`);

const chain = prompt.pipe(model);

////////////////////////////////
// ASK QUESTION
console.log(new Date());
let result = await chain.invoke({
  input: "Should I use npm to start a node.js application",
});
console.log(result);
console.log(new Date());

Install the packages required for the application with:

npm install

Then, run the application with:

node langchainjs-ollama.mjs

The application will respond with an answer from the Mistral LLM served by Ollama.
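
If you would like to watch the answer arrive piece by piece instead of waiting for the full response, Langchain.js chains also expose a stream() method. The following sketch reuses the chain from langchainjs-ollama.mjs; it's an optional experiment, not part of the repository:

// Sketch: stream the response instead of waiting for the complete answer.
// With the Ollama LLM wrapper, each chunk is a piece of the generated text.
const stream = await chain.stream({
  input: "Should I use npm to start a node.js application",
});

for await (const chunk of stream) {
  process.stdout.write(chunk);
}
process.stdout.write("\n");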

In the next lesson, we’ll incorporate LlamaIndex.ts to take things a step further.
