Artificial intelligence (AI) and large language models (LLMs) are becoming increasingly powerful tools for enriching web applications.
TypeScript/JavaScript is often the second language supported by leading libraries like LangChain, LlamaIndex, and Ollama, recognizing the role these languages will continue to play in how applications are built and delivered by businesses.
One of the developing areas with respect to the use of large language models is the concept of agents. Agents support more complex processes and often integrate tools to provide the LLM with additional capabilities or sources of information.
In this post we’ll be looking at the bee-agent-framework, which, unlike some of the other libraries, seems to have started as a TypeScript-first library. This is nice to see as a Node.js developer.
The bee agent framework supports ReAct. If you want to learn more about ReAct agents, they were introduced in this paper. If you’ve been following our journey in using LLMs with Node.js, you may remember our initial experience with agents and ReAct with LlamaIndex in “Function Calling/Tool use with Node.js and large language models”, which was less than we had hoped for.
The good news is that models and the libraries that support them have improved since then, particularly in relation to tool use. You can read about some of our more recent experience with tool use in “A quick look at tool use/function calling with Node.js and Ollama”.
In the next sections we’ll dive into our experience with the bee-agent-framework.
The bee agent framework
As mentioned earlier, bee-agent-framework started out life based on TypeScript and provides an API that is easily used from both TypeScript and JavaScript.
The framework includes the common components that you will need to build agents, providing both a set of building blocks and the bee agent, which can be used out of the box.
The bee agent takes care of all of the details needed to support tool use/function calling so that, unlike in our earlier experimentation with Ollama function calling, it includes all of the glue needed to call our tools when needed by the LLM. All we need to do is define the tools and tell the library what they are! This makes it easier to use tools with an LLM.
Running our first agent
To start, we wanted to use the bee agent to run through our standard interaction flow. While LLMs are often used in a chat with a person, we want to be able to run the same sequence a number of times, so typing responses manually would not be practical. Instead, we have a fixed set of messages that the "user" chatting with the model will send regardless of the response from the LLM. Since responses from an LLM may differ each time, the next message from the person sometimes does not fully fit with the last response from the LLM. This turned out not to be a big problem, as the sequence we use often works out well. It is as follows:
const questions = [
  'What is my favorite color?',
  'My city is Ottawa',
  'My country is Canada',
  'I moved to Montreal. What is my favorite color now?',
  'My city is Montreal and my country is Canada',
  'What is the fastest car in the world?',
  'My city is Ottawa and my country is Canada, what is my favorite color?',
  'What is my favorite hockey team ?',
  'My city is Montreal and my country is Canada',
  'Who was the first president of the United States?',
];
We use favorite color and favorite hockey team to represent data about the user that the LLM cannot know from its trained knowledge, and for which it will need a tool to get the correct answer. The questions about the fastest car and the first president are there to make sure that the LLM can still answer questions unrelated to the tools that are provided.
The tools
With the bee agent framework tools can either be defined in their own TypeScript source files and imported, or defined dynamically by extending the DynamicTool class. We chose to do the latter so they could be contained within a single file for the example. The source code is in test-dynamic.mjs.
The tools were as follows:
const FavoriteColorTool = new DynamicTool({
  name: 'FavoriteColorTool',
  description: 'returns the favorite color for person given their City and Country',
  inputSchema: z.object({
    city: z
      .string()
      .min(0)
      .describe(`the city for the person`),
    country: z
      .string()
      .min(0)
      .describe(`the country for the person`),
  }),
  async handler(input) {
    const city = input.city;
    const country = input.country;
    if ((city === 'Ottawa') && (country === 'Canada')) {
      return new StringToolOutput('the favoriteColorTool returned that the favorite color for Ottawa Canada is black');
    } else if ((city === 'Montreal') && (country === 'Canada')) {
      return new StringToolOutput('the favoriteColorTool returned that the favorite color for Montreal Canada is red');
    } else {
      return new StringToolOutput(`the favoriteColorTool returned The city or country
was not valid, please ask the user for them`);
    }
  },
});
const FavoriteHockeyTool = new DynamicTool({
  name: 'FavoriteHockeyTool',
  description: 'returns the favorite hockey team for a person given their City and Country',
  inputSchema: z.object({
    city: z
      .string()
      .min(0)
      .describe(`the city for the person`),
    country: z
      .string()
      .min(0)
      .describe(`the country for the person`),
  }),
  async handler(input) {
    const city = input.city;
    const country = input.country;
    if ((city === 'Ottawa') && (country === 'Canada')) {
      return new StringToolOutput('the favoriteHockeyTool returned that the favorite hockey team for Ottawa Canada is The Ottawa Senators');
    } else if ((city === 'Montreal') && (country === 'Canada')) {
      return new StringToolOutput('the favoriteHockeyTool returned that the favorite hockey team for Montreal Canada is the Montreal Canadiens');
    } else {
      return new StringToolOutput(`the favoriteHockeyTool returned The city or country
was not valid, please ask the user for them`);
    }
  },
});
const availableTools = [FavoriteColorTool, FavoriteHockeyTool];
Both tools are hard coded to only return information for users in Ottawa, Canada and Montreal, Canada. Zod is used to define a schema for the inputs to the tools so that they can be validated at run time, and the implementation of the tool is provided through a handler that is called with the appropriate inputs when needed by the LLM.
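To make the role of the schema concrete, the runtime check Zod performs for these tools boils down to verifying that both fields are present and are strings. The following plain-JavaScript sketch is our own illustration of that idea; `validateToolInput` is a hypothetical helper, not a framework or Zod API:

```javascript
// Hypothetical sketch of the kind of runtime check Zod performs for
// the tools' input schemas: both fields must be present and be strings.
function validateToolInput(input) {
  const errors = [];
  for (const field of ['city', 'country']) {
    if (typeof input[field] !== 'string') {
      errors.push(`${field} must be a string`);
    }
  }
  return { success: errors.length === 0, errors };
}

// Valid input passes; a missing field is reported.
console.log(validateToolInput({ city: 'Ottawa', country: 'Canada' }).success); // true
console.log(validateToolInput({ city: 'Ottawa' }).errors);
```

In the real code, Zod also attaches the `.describe()` text to each field, which is what the LLM sees when deciding what values to pass to the tool.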
Asking the Questions
Now that we have the tools defined we can use the bee agent to run through our interaction flow with the LLM.
let agent = new BeeAgent({
  llm,
  memory: new TokenMemory({ llm }),
  tools: availableTools,
});

// Ask a question using the bee agent framework
async function askQuestion(question) {
  return agent
    .run({ prompt: question },
         { execution: {
             maxRetriesPerStep: 5,
             totalMaxRetries: 5,
             maxIterations: 5 } }
    )
    .observe((emitter) => {
      emitter.on("update", async ({ data, update, meta }) => {
        if (SHOW_AGENT_PROCESS) {
          console.log(`Agent (${update.key}) 🤖 : `, update.value);
        }
      });
    });
}
for (let i = 0; i < questions.length; i++) {
  console.log('QUESTION: ' + questions[i]);
  console.log('  RESPONSE:' + (await askQuestion(questions[i])).result.text);
}
As you can see, we don’t have to worry about handling requests by the LLM to invoke a tool; that is handled by the bee agent framework for us. It’s really as simple as asking the LLM a question and then getting the answer back, with the LLM having used tools, if necessary, to get additional information.
The initial results
Before we look at the results of the full interaction flow we wanted to share what you can see when you turn on some of the logging available. You can turn on this logging if you want to run the example yourself by setting “const SHOW_AGENT_PROCESS = true;”. With that turned on, the response to the first question was:
QUESTION: What is my favorite color?
Agent (thought) 🤖 : The user wants to know their favorite color.
Agent (tool_name) 🤖 : FavoriteColorTool
Agent (tool_input) 🤖 : {"city":"","country":""}
Agent (tool_caption) 🤖 : Using the FavoriteColorTool function to find your favorite color.
Agent (tool_output) 🤖 : the favoriteColorTool returned The city or country
was not valid, please ask the user for them
Agent (thought) 🤖 : I need more information from the user to proceed.
Agent (final_answer) 🤖 : Can you please tell me which city and country you are in?
RESPONSE:Can you please tell me which city and country you are in?
In this output you can see the steps the agent is taking to “think” about the question, call the FavoriteColorTool, and arrive at the final response that is sent back to the user.
As expected, after the first question the LLM does not have enough information, but it has understood that it needs to ask the user for their city and country. This gives us some insight into the ReAct process going on within the agent. You can look at how ReAct is integrated into the agent in a bit more detail in prompts.ts.
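The ReAct cycle visible in that log (thought, tool call, observation, final answer) can be sketched as a simple loop. The following is our own simplified simulation, not the framework's implementation; the "model" here is a stub that always requests the FavoriteColorTool on its first step:

```javascript
// Minimal simulation of a ReAct loop: the model either requests a tool
// (and the loop feeds the tool's output back as an observation) or
// produces a final answer.
const tools = {
  FavoriteColorTool: ({ city, country }) =>
    city === 'Ottawa' && country === 'Canada'
      ? 'the favorite color for Ottawa Canada is black'
      : 'the city or country was not valid',
};

// Stubbed model: first step requests a tool, second step answers.
function stubModel(step) {
  if (step === 0) {
    return { thought: 'I need the favorite color.',
             tool: 'FavoriteColorTool',
             input: { city: 'Ottawa', country: 'Canada' } };
  }
  return { thought: 'I now know the answer.',
           finalAnswer: 'Your favorite color is black.' };
}

function reactLoop(maxIterations = 5) {
  for (let step = 0; step < maxIterations; step++) {
    const action = stubModel(step);
    if (action.finalAnswer) return action.finalAnswer;
    // Observe: run the requested tool and feed its output back in.
    const observation = tools[action.tool](action.input);
    console.log('observation:', observation);
  }
  throw new Error('max iterations reached');
}

console.log(reactLoop()); // Your favorite color is black.
```

The `maxIterations` option passed to the real agent's `run()` call plays the same role as the loop bound here: it caps how many thought/tool cycles the agent may take before giving up.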
Now let's turn off the logging and run through the full interaction sequence. The output we got was (of course, variation is possible on every run):
QUESTION: What is my favorite color?
RESPONSE:Can you please tell me which city and country you are in?
QUESTION: My city is Ottawa
RESPONSE:Can you also tell me which country you are in?
QUESTION: My country is Canada
RESPONSE:Your favorite color is black.
QUESTION: I moved to Montreal. What is my favorite color now?
RESPONSE:Your favorite color is now red.
QUESTION: My city is Montreal and my country is Canada
RESPONSE:Your favorite color is still red.
QUESTION: What is the fastest car in the world?
RESPONSE:Unfortunately, I don't have any information about cars.
QUESTION: My city is Ottawa and my country is Canada, what is my favorite color?
RESPONSE:Your favorite color is now black.
QUESTION: What is my favorite hockey team ?
RESPONSE:The Ottawa Senators.
QUESTION: My city is Montreal and my country is Canada
RESPONSE:The Montreal Canadiens.
QUESTION: Who was the first president of the United States?
RESPONSE:Unfortunately, I don't have any information about the first president of the United States.
Knowing that the correct favorite color is black for Ottawa and red for Montreal, and that the favorite hockey team is the Senators for Ottawa and the Canadiens for Montreal, we can see that the color and hockey team questions were answered 100% correctly. Yay!
Looking at one segment of the flow with logging turned on we can see the LLM going through the steps of deciding that it needs to use a tool to get the answer to the question and then calling the tool with the appropriate parameters. For example:
QUESTION: My city is Montreal and my country is Canada
Agent (thought) 🤖 : I need to find the favorite color for a person living in Montreal, Canada.
Agent (tool_name) 🤖 : FavoriteColorTool
Agent (tool_input) 🤖 : {"city": "Montreal", "country": "Canada"}
Agent (tool_caption) 🤖 : Using FavoriteColorTool to get your favorite color based on city and country.
Agent (tool_output) 🤖 : the favoriteColorTool returned that the favorite color for Montreal Canada is red
Agent (final_answer) 🤖 : Your favorite color is red.
RESPONSE:Your favorite color is red.
All seems to be working great in that respect, but what about the following two questions:
QUESTION: What is the fastest car in the world?
RESPONSE:Unfortunately, I don't have any information about cars.
QUESTION: Who was the first president of the United States?
RESPONSE:Unfortunately, I don't have any information about the first president of the United States.
We know that the LLM has information on those two questions, so why did it not answer them?
The Llama model really wants to use tools
Let’s turn on logging to get more detail about what is going on with the two questions that were not answered correctly.
QUESTION: What is the fastest car in the world?
Agent (thought) 🤖 : The user wants to know about a specific car model.
Agent (tool_name) 🤖 : FavoriteColorTool
Agent (tool_input) 🤖 : {"city": "Montreal", "country": "Canada"}
Agent (tool_caption) 🤖 : Using FavoriteColorTool to get your favorite color based on your city and country.
Agent (tool_output) 🤖 : the favoriteColorTool returned that the favorite color for Montreal Canada is red
Agent (thought) 🤖 : The user wants to know about a specific car model, but I can use the FavoriteColorTool function to get their favorite color instead.
Agent (final_answer) 🤖 : Unfortunately, I don't have information about cars. But your favorite color is red!
RESPONSE:Unfortunately, I don't have information about cars. But your favorite color is red!
The problem seems to be that the LLM is trying to use the tools provided even though the question does not relate to information that the tools provide.
We don’t think this is a problem with the bee agent framework, because we’ve seen it before with llama and, to a lesser degree, other models. Once you give llama tools it can call, it really wants to use them. It’s along the lines of “If you have a hammer, everything looks like a nail”.
In the past we tried a number of additional system prompts to try to avoid this, but never managed to find the right combination to make llama use its pre-existing knowledge when provided with tools. You can read a bit more about this issue in A quick look at tool use/function calling with Node.js and Ollama.
Extending the Agent
The nice thing about the bee agent framework is that it is a framework and is designed for people to extend and build their own agents. This allowed us to extend the bee agent to help llama use its pre-existing knowledge when necessary.
Based on the descriptions of the tools, we can use the LLM to figure out if the question relates to those tools. A prompt in the following format seemed to get the answer reliably:
Based only on the descriptions in this request answer Yes if one of the functions described might be able to answer the question. Only respond with Yes or No.
- description 0: returns the favorite color for person given their City and Country
- description 1: returns the favorite hockey team for a person given their City and Country
- question: What is my favorite color?
Using that we can make a call to check if tools should be used, and if not remove any mention of tools from the call the agent makes to the LLM.
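To make the prompt construction concrete, here is a sketch of how the description list could be assembled from the tools' metadata. This is our own illustration (`buildRelevancePrompt` is a hypothetical helper, not part of the framework); it simply reproduces the format shown above using the `description` property that DynamicTool definitions carry:

```javascript
// The tool metadata as defined earlier (names and descriptions only).
const availableTools = [
  { name: 'FavoriteColorTool',
    description: 'returns the favorite color for person given their City and Country' },
  { name: 'FavoriteHockeyTool',
    description: 'returns the favorite hockey team for a person given their City and Country' },
];

// Build the Yes/No relevance-check prompt from the tool descriptions.
function buildRelevancePrompt(tools) {
  const header =
    'Based only on the descriptions in this request answer Yes if one of the ' +
    'functions described might be able to answer the question. Only respond with Yes or No.';
  const descriptions = tools.map(
    (tool, i) => `- description ${i}: ${tool.description}`);
  return [header, ...descriptions].join('\n');
}

const prompt = buildRelevancePrompt(availableTools);
console.log(`${prompt}\n- question: What is my favorite color?`);
```

Keeping the check keyed to the descriptions (rather than the tool names) means it keeps working as tools are added or removed, as long as each description accurately states what the tool can answer.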
The code for the extended agent is in test-extend.mjs. Our extended agent extends the base BeeAgent class:
class ExtendedBeeAgent extends BeeAgent {
and then overrides the _run() method:
async _run(input, options, run) {
  if (CHECK_FUNCTIONS_RELEVANCE) {
    const useFunctions = await this.functionsRelevant(input.prompt, this.memory.messages);
    if (!useFunctions) {
      // modifying the state of the agent is not ideal but works
      // because the agent only allows one request to be executing
      // at a time.
      if (!this.input.templates)
        this.input.templates = {};
      this.input.templates.system = this.noToolsPrompt;
    }
  }

  let result;
  try {
    result = await super._run(input, options, run);
  } finally {
    // restore the modified state
    this.input.templates = this.originalTemplates;
  }
  return result;
}
Our _run() method checks if tools are relevant to the question being asked and, if not, swaps out the default bee agent prompt for a more basic prompt that will use the LLM's pre-trained knowledge. It then calls the original bee agent _run() method to ask the question.
The function called to check if the available tools are relevant to the question being asked uses, wait for it, another bee agent!
this.checkToolAgent = new BeeAgent({
  llm: input.llm,
  memory: new TokenMemory({ llm: input.llm }),
  tools: [],
});
We use the prompt shown above as follows:
async functionsRelevant(question, messages) {
  this.checkToolAgent.memory.reset();
  this.checkToolAgent.memory.addMany(messages);
  const result = await this.checkToolAgent
    .run({ prompt: `${this.functionsRelevantPrompt}\n- question: ${question}` },
         { execution: {
             maxRetriesPerStep: 5,
             totalMaxRetries: 5,
             maxIterations: 5 } }
    );
  if (result.result.text === 'No') {
    return false;
  }
  return true;
}
With the extended agent, the only change we needed to make to the original code in test-dynamic.mjs is to use the extended agent instead of the original bee agent:
let agent = new ExtendedBeeAgent({
  llm,
  memory: new TokenMemory({ llm }),
  tools: availableTools,
});
Updated results
OK, now let's run through our interaction sequence with the extended agent:
QUESTION: What is my favorite color?
RESPONSE:Can you tell me what city and country you are in?
QUESTION: My city is Ottawa
RESPONSE:Can you also tell me what country you are in?
QUESTION: My country is Canada
RESPONSE:Your favorite color is black.
QUESTION: I moved to Montreal. What is my favorite color now?
RESPONSE:Your favorite color is now red.
QUESTION: My city is Montreal and my country is Canada
RESPONSE:Your favorite color is red.
QUESTION: What is the fastest car in the world?
RESPONSE:I don't have any information about your favorite color or location, but I can tell you that the fastest car in the world is currently the SSC Tuatara, which has a top speed of over 331 miles per hour (531 kilometers per hour).
QUESTION: My city is Ottawa and my country is Canada, what is my favorite color?
RESPONSE:Your favorite color is black.
QUESTION: What is my favorite hockey team ?
RESPONSE:The Ottawa Senators.
QUESTION: My city is Montreal and my country is Canada
RESPONSE:The favorite color for a person in Montreal, Canada is red.
QUESTION: Who was the first president of the United States?
RESPONSE:George Washington was the first President of the United States. (Note: This question seems unrelated to your previous conversations about favorite colors and hockey teams)
Now all of the answers are correct!
The agent wanting to use tools even when they are not relevant is likely an issue that does not affect all large language models, and it will likely go away for llama as new models come out.
The real point of interest in this experiment is that it shows the value of providing a framework that can be used to extend the base agent or easily build your own agents when a more custom problem or use case comes up.
Just for fun, let's turn on the logging and watch the agent process as the extended agent figures out when it should and should not use the tools:
QUESTION: What is the fastest car in the world?
Agent (thought) 🤖 : The user wants to know the fastest car in the world, but none of the provided functions seem related to this topic.
Agent (final_answer) 🤖 : No
Agent (thought) 🤖 : I now know the final answer.
Agent (final_answer) 🤖 : I don't have any information about your favorite color or location, but I can tell you that the fastest car in the world is currently the Bugatti Chiron Super Sport 300+, which has a top speed of over 330 miles per hour (531 kilometers per hour).
RESPONSE:I don't have any information about your favorite color or location, but I can tell you that the fastest car in the world is currently the Bugatti Chiron Super Sport 300+, which has a top speed of over 330 miles per hour (531 kilometers per hour).
We can see that the first thought and answer are related to checking if the tools are relevant to the question and we see the final answer is “No”. The agent then uses the simpler prompt which does not attempt to use any tools and just answers the question.
Wrapping up
We hope that this blog post has piqued your interest in the bee agent framework and helped you progress on your journey of learning how to use large language models with Node.js and JavaScript/TypeScript.
If you want to learn more about what the Red Hat Node.js team is up to, check these out:
- https://developers.redhat.com/topics/nodejs
- https://developers.redhat.com/topics/nodejs/ai
- https://developers.redhat.com/learn/openshift/how-get-started-large-language-models-and-nodejs
- https://developers.redhat.com/learn/diving-deeper-large-language-models-and-nodejs
- https://github.com/nodeshift/nodejs-reference-architecture
- https://developers.redhat.com/e-books/developers-guide-nodejs-reference-architecture
- https://github.com/i-am-bee/bee-agent-framework