What is all the hype around AI Agents? Let's build an AI Agent

Disclaimer: I’m no ML engineer, just a curious person :) Feel free to add to or correct any information!

The idea of this post is to give a short overview of LLMs and their tooling in simple terms, as well as the hype around them. At the end we will create a simple AI agent to show how we can power real-life solutions with LLMs.

What are LLMs?

You might have heard about LLMs (Large Language ~~Magic✨~~ Models). They are AI models trained on huge datasets using deep learning techniques. This training lets them perform well across a wide range of applications: they can interpret the text input, draw on the vast amount of data they were trained on, and predict what the best response would be. If needed, they can also reason about complex data to provide the answer.

OK, but what about the hype?

AI everywhere Buzz Lightyear meme

Lately we have been seeing many companies suddenly becoming “AI-first” or “AI-driven”, followed by countless AI SaaS companies and AI rockstars like Sam Altman. Also, if you go to Product Hunt, a website that showcases product releases, lately it feels like all of the products and companies are AI-driven or built upon AI capabilities.

Product Hunt last month (August), all companies AI related

News about Cisco having efficient measures towards focusing on AI and Cybersecurity

Gartner Survey Finds 55% of Organizations That Have Deployed AI Take an AI-First Strategy with New Use Cases

It’s the AI train! Either you hop on, or you’re left behind ¯\_(ツ)_/¯ But why is that?

meme of cartoon dog building the railroad as the train advances

One big advantage of LLM/“AI” solutions is that they leverage a tool we are already very used to for getting stuff done: talking.

Because AI can understand natural language, just by talking to a machine we can achieve results that were previously restricted to people with technical experience. We can ask the machine a question and it magically✨ produces the result. This makes interacting with systems, and sometimes data analysis, more accessible to everyone.

We had things in the past that mimicked this, like the Brazen Head, but it was played by a person, and now we have the chance to have a real machine doing it. R.I.P. Akinator

Using “AI” and LLMs

Nowadays we use ChatGPT and other LLMs to help with daily tasks and get answers fast. One important thing I’ve learned about LLMs is that there are models, and there are tools that enhance models. So when you ask:

Tell me how many medals Simone Biles got in the 2024 Paris Olympics

You get:

Simone Biles won four medals at the 2024 Paris Olympics: three golds and one silver. She secured gold medals in the team event, individual all-around, and vault. Her silver medal came from the individual floor exercise. With these victories, Biles extended her record as the most decorated gymnast of all time, bringing her total Olympic medal count to 11

And we can notice something in the response

ChatGPT response with Simone Biles medals, saying 2 web searches were needed to produce the answer

So the model searched 2 websites, that’s awesome!! Well… not exactly…

One thing that at first felt magical to me, but that I now understand, is that models like gpt-4o don’t know anything about the 2024 Olympic Games, since gpt-4o was trained on data up to October 2023. So for the model to know, something must search the web on its behalf, and for that we use AI Agents ✨🤖 to find the information and provide it to the LLM. Here is a simplified diagram of the flow:

llm ReAct agent workflow sketch
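The flow in the diagram can be sketched in plain Python. This is only a toy illustration: `fake_llm` and `fake_search` are hypothetical stand-ins for a real model and a real search tool (they are not part of any library), and the canned answer is hard-coded so the loop can run without API keys.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a web search tool such as Tavily.
def fake_search(query: str) -> str:
    return "Simone Biles won four medals at the 2024 Paris Olympics."

# Hypothetical stand-in for the LLM: it asks for a search until an
# observation is available, then answers from that observation.
def fake_llm(question: str, observations: List[str]) -> Tuple[str, str]:
    if not observations:
        return ("search", question)       # "I need a tool"
    return ("final", observations[-1])    # "I can answer now"

def run_agent(question: str,
              llm: Callable[[str, List[str]], Tuple[str, str]],
              tools: Dict[str, Callable[[str], str]],
              max_iterations: int = 5) -> str:
    """Reason, act, observe, and repeat until the model gives a final answer."""
    observations: List[str] = []
    for _ in range(max_iterations):
        action, payload = llm(question, observations)   # reason
        if action == "final":
            return payload
        observations.append(tools[action](payload))     # act, then observe
    return "Agent stopped: iteration limit reached."

answer = run_agent("How many medals did Simone Biles win in Paris 2024?",
                   fake_llm, {"search": fake_search})
print(answer)
```

The iteration limit matters: without it, a model that keeps asking for tools would loop forever.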

This is one example of Prompt Engineering techniques, leveraging RAG and ReAct. We can define prompt engineering as follows:

Prompt engineering skills help to better understand the capabilities and limitations of LLMs. Researchers use prompt engineering to improve safety and the capacity of LLMs on a wide range of common and complex tasks such as question answering and arithmetic reasoning. Developers use prompt engineering to design robust and effective prompting techniques that interface with LLMs and other tools.
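To make that definition a bit more concrete, here is a minimal sketch of what a ReAct-style prompt could look like when built by hand. The template wording, the `build_prompt` helper and the `web_search` tool name are all assumptions made for illustration; they are not the exact text of any published prompt.

```python
# Illustrative ReAct-style template; the wording is an assumption, not the
# exact text of any published prompt.
REACT_TEMPLATE = """Answer the question as best you can.
You have access to these tools:
{tool_descriptions}

Use this format:
Thought: do I need to use a tool?
Action: one of [{tool_names}]
Action Input: the input for the tool
Observation: the tool result
Final Answer: the answer to the question

Question: {question}
"""

def build_prompt(question: str, tools: dict) -> str:
    """Fill the template with the question and the tool catalogue."""
    descriptions = "\n".join(f"- {name}: {desc}" for name, desc in tools.items())
    return REACT_TEMPLATE.format(
        tool_descriptions=descriptions,
        tool_names=", ".join(tools),
        question=question,
    )

prompt_text = build_prompt(
    "How many medals did Simone Biles win in Paris 2024?",
    {"web_search": "look up current information on the web"},  # hypothetical tool
)
print(prompt_text)
```

The point is that the "engineering" is mostly about telling the model which tools exist and which output format to follow, so a program can act on its reply.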

But enough about concepts, let’s build one!

Demo time! Let’s create a simple AI Agent

Code here
Jupyter Notebook in Colab

So let’s see how you can create your own LLM interactions.

Setup of tools

First of all, we need to create credentials for a few tools. Besides OpenAI, all of them have generous free tiers, and with OpenAI you can use prepaid credits to avoid skyrocketing bills.

Open AI
- Video about adding Credits to OpenAI
- Open AI platform Dashboard
- Credit Card required
Tavily
- No Credit Card required
- 1,000 API calls (Monthly)
- API
LangSmith
- Video About LangSmith
- No Credit Card required

Installing dependencies

With the accounts set up we can start the notebook. We are going to use a Python notebook to run this code. I suggest Google Colab, a free Python notebook on the web.

!pip install langchain -qqq
!pip install langchain-openai -qqq
!pip install langchain_community -qqq
!pip install tavily-python -qqq

After installing the dependencies we can set up the API keys for this demo:

import getpass
import os

# Recommend using LangSmith as it provides LOTS of Tracing capabilities!

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass(prompt='LangChain API KEY: ')
os.environ["LANGCHAIN_PROJECT"] = getpass.getpass(prompt='LangChain Project: ')
os.environ["OPENAI_API_KEY"] = getpass.getpass(prompt='Open AI API Key: ')
os.environ["TAVILY_API_KEY"] = getpass.getpass(prompt='TAVILY API Key: ')

You will be prompted to enter your credentials so they are loaded into the application.

Configuring Open AI

With the credentials set up we can configure our model, gpt-4o-mini from OpenAI:

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage

llm = ChatOpenAI(model="gpt-4o-mini")

And we can test to see how it goes

# Test the OpenAI model

llm.invoke(input="Hi")

# Response =>

AIMessage(content='Hello! How can I assist you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 9, 'prompt_tokens': 8, 'total_tokens': 17}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_48196bc67a', 'finish_reason': 'stop', 'logprobs': None}, id='run-98a5962f-73de-4440-a075-12075ebfe65a-0', usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17})

Defining which tools to use

Now we are going to define the tools that will be used. In this case we will ask the same Simone Biles question about the Olympics, so we need access to the internet, and for that we will use Tavily:

from langchain_community.tools.tavily_search import TavilySearchResults

search = TavilySearchResults(max_results=2)
# If you want to test
# search_results = search.invoke("Simone Biles 2024 Paris Olympic games medals count")
# print(search_results)

# If you want to add more tools you can see the available tools in https://python.langchain.com/v0.1/docs/integrations/tools/
tools = [search]

Then we can test the web integration with

# test the tavily search and its response structure

search.run("Simone Biles 2024 Paris Olympic games medals count")

# =>

[{'url': 'https://olympics.com/en/news/simone-biles-all-titles-records-and-medals-complete-list-paris-2024',
  'content': "U.S. gymnastics superstar Simone Biles already owns more world and Olympic medals than any gymnast in history. The 27-year-old is the owner of 10 Olympic medals, including seven golds, and 30 world medals - 23 of which are gold. At the Olympic Games Paris 2024, the American has the chance to add up to two medals to her haul, having helped Team USA to gold in the women's final and taking the ..."},
 {'url': 'https://www.usatoday.com/story/sports/olympics/2024/07/28/simone-biles-how-many-olympics-appearances/74230944007/',
  'content': "The 2024 Paris Olympics are Simone Biles' third Olympic Games and she seems poised to add to her already-impressive medal count. ... Biles has competed in two other Olympic Games: the 2016 Rio ..."}]
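Since the search returns a list of dictionaries with `url` and `content` keys, it can be handy to see how such results could be flattened into a single block of text for a model to read. The helper below is a hypothetical sketch (`format_results` and its `max_chars` parameter are my own names, not part of Tavily or LangChain), run against sample data shaped like the output above.

```python
from typing import Dict, List

def format_results(results: List[Dict[str, str]], max_chars: int = 200) -> str:
    """Turn search hits into a numbered, source-attributed context string."""
    lines = []
    for i, hit in enumerate(results, start=1):
        snippet = hit.get("content", "")[:max_chars]  # keep the context short
        lines.append(f"[{i}] {snippet} (source: {hit.get('url', 'unknown')})")
    return "\n".join(lines)

# Sample data shaped like the search output, shortened for readability.
sample = [
    {"url": "https://olympics.com/en/news/simone-biles-all-titles-records-and-medals-complete-list-paris-2024",
     "content": "U.S. gymnastics superstar Simone Biles already owns more world and Olympic medals than any gymnast in history."},
]
context = format_results(sample)
print(context)
```

Keeping the URL next to each snippet makes it easier to check where an answer came from.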

Create ReAct AI Agent

For this tutorial we are going to create a ReAct AI Agent. It will be able to reason about the data provided and act, if necessary, to fetch more information. We will also give the Agent a memory window of the last 4 exchanges, so it has context and, if needed, can use previously searched information to answer new questions.

from langchain.agents import create_react_agent
from langchain.chains.conversation.memory import ConversationBufferWindowMemory
from langchain import hub
from langchain.agents import AgentExecutor

# Create a memory for the Agent over a chat interface, so that when we call the agent it remembers the latest messages

conversational_memory = ConversationBufferWindowMemory(
    memory_key='chat_history',
    k=4, # Number of recent exchanges (human + AI message pairs) kept in memory
    return_messages=True # Return the history as message objects instead of one string
)

# Load the prompt template. LangChain uses it to format the messages sent to the model.

## A LangChain API key is required to fetch resources from the LangChain Hub.
prompt = hub.pull("hwchase17/react-chat") # https://smith.langchain.com/hub/hwchase17/react-chat?organizationId=c005d9c3-9654-5134-8ec9-122b5a609562
agent = create_react_agent(
    tools=tools,
    llm=llm,
    prompt=prompt,
)

# Create an agent executor by passing in the agent and tools

agent_executor = AgentExecutor(agent=agent, # The base agent, a ReAct agent
                               tools=tools, # Set of tools provided to the Agent Executor
                               verbose=True, # Verbosity to help understand and debug
                               memory=conversational_memory, # The memory for context in the LLM and history
                               max_iterations=5, # An Agent can loop, using different tools to get a job done; this limits it to 5 iterations
                               max_execution_time=600, # Timeout in seconds for the whole agent run
                               handle_parsing_errors=True # If the LLM output cannot be parsed, send the error back to the model instead of raising
                               )
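To get an intuition for what a windowed memory does, here is a simplified stand-in written with the standard library only. It is not LangChain's implementation; `WindowMemory` is a hypothetical class that just shows the idea of keeping the most recent turns and dropping the oldest ones.

```python
from collections import deque
from typing import Deque, List, Tuple

class WindowMemory:
    """Keeps only the most recent k (user, assistant) exchanges."""

    def __init__(self, k: int) -> None:
        self.exchanges: Deque[Tuple[str, str]] = deque(maxlen=k)

    def save(self, user: str, assistant: str) -> None:
        # When the deque is full, the oldest exchange is dropped automatically.
        self.exchanges.append((user, assistant))

    def history(self) -> List[str]:
        out: List[str] = []
        for user, assistant in self.exchanges:
            out += [f"Human: {user}", f"AI: {assistant}"]
        return out

memory = WindowMemory(k=2)
for n in range(1, 4):
    memory.save(f"question {n}", f"answer {n}")
print(memory.history())
# ['Human: question 2', 'AI: answer 2', 'Human: question 3', 'AI: answer 3']
```

The trade-off is simple: a bigger window gives the model more context, but every remembered message is sent again with each call and costs tokens.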

Now it’s time to ask the question!

agent_executor.invoke({"input": "How many medals Simone Biles got in the 2024 Paris olympics?"})['output']

We get a view of the Agent’s thinking process printed alongside the response:

> Entering new AgentExecutor chain...

Thought: Do I need to use a tool? Yes
Action: tavily_search_results_json
Action Input: "Simone Biles medals 2024 Paris Olympics"
[{'url': 'https://people.com/simone-biles-wins-four-medals-2024-olympics-8690143', 'content': "Simone Biles Ends Her 2024 Olympic Games with Four Medals: 'I've Accomplished Way More Than My Wildest Dreams'. Biles finished out her Paris Olympics with a silver medal in the individual floor ..."}, {'url': 'https://olympics.com/en/news/simone-biles-all-titles-records-and-medals-complete-list-paris-2024', 'content': "U.S. gymnastics superstar Simone Biles already owns more world and Olympic medals than any gymnast in history. The 27-year-old is the owner of 10 Olympic medals, including seven golds, and 30 world medals - 23 of which are gold. At the Olympic Games Paris 2024, the American has the chance to add up to two medals to her haul, having helped Team USA to gold in the women's final and taking the ..."}]
Do I need to use a tool? No
Final Answer: Simone Biles won a total of four medals at the 2024 Paris Olympics, including a silver medal in the individual floor exercise and gold medals with Team USA in the women's final.

> Finished chain.
Simone Biles won a total of four medals at the 2024 Paris Olympics, including a silver medal in the individual floor exercise and gold medals with Team USA in the women's final.

As you can see, the AI Agent followed the same structure shown earlier:

llm ReAct agent workflow sketch
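The printed trace also hints at how an executor can decide what to do next: the model replies in a fixed text format, and a parser looks for `Action:` / `Action Input:` or `Final Answer:`. The sketch below is my own simplified parser, assuming that text format; `parse_step` is a hypothetical helper and not the parser LangChain actually uses.

```python
import re
from typing import Dict

def parse_step(text: str) -> Dict[str, str]:
    """Classify one model reply as a final answer or a tool call."""
    final = re.search(r"Final Answer:\s*(.*)", text, re.DOTALL)
    if final:
        return {"type": "final", "output": final.group(1).strip()}
    action = re.search(r"Action:\s*(.+)", text)
    action_input = re.search(r"Action Input:\s*(.+)", text)
    if action and action_input:
        return {
            "type": "tool",
            "tool": action.group(1).strip(),
            "tool_input": action_input.group(1).strip().strip('"'),
        }
    raise ValueError("Could not parse model output")

step = parse_step(
    "Thought: Do I need to use a tool? Yes\n"
    "Action: tavily_search_results_json\n"
    'Action Input: "Simone Biles medals 2024 Paris Olympics"'
)
done = parse_step("Do I need to use a tool? No\nFinal Answer: Four medals.")
print(step)
print(done)
```

When the reply matches neither shape, the sketch raises an error; in the demo, that kind of situation is what `handle_parsing_errors=True` is there for.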

If you want to ask more questions you can use the following:

# Here is a custom invoker: type your own question when prompted!

agent_executor.invoke({"input": input('Message: ')})['output']

LangSmith Tracing

If you look into your LangSmith project, you will be able to see all the executions and how LangChain thinks in a dashboard, along with the token usage and the price of each run.

langsmith execution tracing

In this case it creates a prompt for OpenAI using the prompt template; then, from the response, it knows it needs to query the web. From there it gets the results and sends them back to the LLM.

It also shows the total cost of my operation.

langsmith execution tracing

Conclusion

As you can see, it is quite simple to create an LLM AI Agent that uses the web to produce valuable information. If you want to explore more, I recommend going through all the tutorials and available tools in LangChain and building your own custom AI Agents!

Feel free to message me on X/Twitter!