Writing an LLM Agent in 50 Lines of Code (or no Code at All)

One of the most interesting patterns emerging from Large Language Models (LLMs) is the idea of agents. If you’re like me, you can only truly grok a concept by seeing or writing code for it from scratch, so, like many other folks, I decided to try building one.

So what is an agent and why do we need them? Let’s answer the second question first. Most people who are using LLMs, at the moment at least, are doing so through chat interfaces. In a standard chat interface, LLMs have a few limitations:

  • They often hallucinate.
  • They are only trained on data up to their “cut-off” date (currently 2021 for OpenAI’s models).
  • They do not have access to proprietary or private data.
  • They cannot interface well with other tools (without a human in the middle).

Agents attempt to overcome these limitations by:

  1. Prompting models to “summon” or request external tools with a given format (which we’ll explain below).
  2. Running the model in a loop, with some wrapper code that can parse the model’s tool requests, execute them, and return the output to the model.
  3. Adding some “chain-of-thought” in the prompting to reduce hallucination.

In essence, the model is deciding what to do, and there is some code that parses out and executes that request on its behalf.
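Before building the real thing, here is the shape of that loop as a minimal sketch with a stubbed model and a single made-up Echo tool. Note that `fake_llm`, the naive string parsing, and the Echo tool are illustrative placeholders only, not the implementation we build below:

```python
# Minimal sketch of the agent pattern with a stubbed model (no API calls).
# `fake_llm`, the naive parsing, and the single Echo tool are illustrative
# placeholders, not the real implementation.

def fake_llm(messages):
    # Pretend to be the model: request a tool once, then answer the human.
    if not any("Observation" in m["content"] for m in messages):
        return "Action: Echo\nAction Input: hello"
    return "Action: Response To Human\nAction Input: done"

def run(initial_message):
    tools = {"Echo": lambda s: "you said " + s}
    messages = [{"role": "user", "content": initial_message}]
    while True:
        reply = fake_llm(messages)
        # Parse out which tool the model wants and what input to give it.
        action = reply.split("Action: ")[1].split("\n")[0]
        action_input = reply.split("Action Input: ")[1].split("\n")[0]
        if action == "Response To Human":
            return action_input
        # Execute the requested tool and feed the result back as an observation.
        messages.append({"role": "assistant", "content": reply})
        messages.append({"role": "user",
                         "content": "Observation: " + tools[action](action_input)})
```

The model decides, the wrapper code parses and executes: that is the whole pattern.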

So let’s start by building our agent.

Note that a lot of this code is based on langchain, with tweaks and simplifications. I’ll talk about langchain a bit more below.

The Prompt

First, let’s set up the base prompt.

PROMPT = """
Answer the following questions and obey the following commands as best you can. 

You have access to the following tools:

Search: Useful for when you need to answer questions about current events. You should ask targeted questions.
Calculator: Useful for when you need to answer questions about math. Use python code, eg: 2 + 2
Send Email: The input to this should be a comma-separated list: to_email,subject,text
Response To Human: When you need to respond to the human you are talking to.

You will receive a message from the human, then you should start a loop and do one of two things:

Option 1: You use a tool to answer the question.
For this, you should use the following format:
Thought: you should always think about what to do
Action: the action to take, should be one of [Search, Calculator, Send Email]
Action Input: the input to the action, to be sent to the tool

After this, the human will respond with an observation, and you will continue.

Option 2: You respond to the human.
For this, you should use the following format:
Action: Response To Human
Action Input: your response to the human, summarizing what you did and what you learned

Begin!
"""

So what is this doing?

  • Telling the model that it will be run in a loop.
  • In that loop, the LLM has two options: it can either “use a tool”, giving that tool an input, or it can respond to the human.
  • We give the model a list of the tools, and a description of when/how to use each one.
  • The thought-action pattern creates a “chain-of-thought”, telling the model to think about what it’s doing.

The Tools

Next, let’s create our tools. Each tool is just a function that takes a string and returns another string.

from langchain.utilities import SerpAPIWrapper
from py_expression_eval import Parser

search_wrapper = SerpAPIWrapper(serpapi_api_key='YOUR_SERP_API_KEY')
def search(query):
    return search_wrapper.run(query)

parser = Parser()
def calculator(expression):
    return str(parser.parse(expression).evaluate({}))

def send_email(input_str):
    print("email: " + input_str)
    to_email, subject, text = input_str.split(",", 2)
    print(f"Sending email to '{to_email}' with subject '{subject}' and text '{text}'")
    # Don't actually send anything, just print it
    return "Email sent"

We have created three tools here:

  • Search: Conducts a search engine search using a SERP_API_KEY. You can sign-up and generate a key for free for 100 searches / month at https://serpapi.com/ if you want to actually run this.
  • Calculator: I use py_expression_eval as a calculator (good balance between being able to run complex math expressions, without many of the risks of trying to pull in a full Python REPL/eval).
  • Send Email: I’m not brave enough to hook an LLM up to allow it to actually send emails, so we just have a dummy function that pretends to send an email.

That’s all there is to it. You can create any tool using this technique; it just needs to take a string and output a string.
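For instance, a hypothetical word-count tool (not one of the three above; purely an illustration) fits the same string-in, string-out contract:

```python
# A hypothetical extra tool: takes a string, returns a string.
def word_count(text):
    # Count whitespace-separated words and return the count as a string,
    # so it plugs into the same observation mechanism as the other tools.
    return str(len(text.split()))
```

To expose it to the agent, you would add a one-line description of it to the prompt and a branch for it in the loop below.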

The Loop

OK, now time for the core.

import openai  # openai.api_key must be set, e.g. from the OPENAI_API_KEY env var

def agent_loop(initial_command):
    messages = [
        { "role": "system", "content": PROMPT },
        { "role": "user", "content": initial_command },
    ]
    while True:
        response = openai.ChatCompletion.create(
            model="gpt-4",
            messages=messages,
            temperature=0,
            max_tokens=1000,
            top_p=1,
        )
        response_text = response['choices'][0]['message']['content']
        print(response_text)
        action = _parse_last_line_starting_with(response_text, "Action:")
        action_input = _parse_last_line_starting_with(response_text, "Action Input:")
        if action == "Search":
            tool = search
        elif action == "Calculator":
            tool = calculator
        elif action == "Send Email":
            tool = send_email
        elif action == "Response To Human":
            print(f"Response: {action_input}")
            messages.append({ "role": "user", "content": input("Next message >> ") })
            continue
        observation = tool(action_input)

        print("Observation: ", observation)
        messages.extend([
            { "role": "system", "content": response_text },
            { "role": "user", "content": f"Observation: {observation}" },
        ])

What is this doing? It’s actually pretty simple.

The first thing to note is that I’m using a conversational model (gpt-4). GPT-4 is much more powerful than GPT-3.5, so the agent simply behaves better under that model.

We set up the model with a system prompt, which is our PROMPT from above. This gives the model instructions on what we expect it to do.

Next, we run it in a loop. The first run of the loop passes the model the system prompt and our initial message from the user, then we let the model generate text. At that point, it’s the cue for our code to take over control again.

We should now have a message from the model that looks something like:

Thought: I need to find the height of the Great Pyramid of Giza.
Action: Search
Action Input: What is the height of the Great Pyramid of Giza?
Observation:

So now, all we need to do is parse the action and the action input, and use that to decide which tool to call (in this case, “search”) and what parameter to pass (in this case, “What is the height of the Great Pyramid of Giza?”).

Then, we add two messages to our message array:

  1. The original message from the LLM (thought/action/action input)
  2. The response from calling the tool.

And we repeat until we’re ready to respond.

Here it is in action

$ python simple-agent.py "How high is the Great Pyramid in humans? Email the answer to o***e@monarchmoney.com"
Thought: I need to find the height of the Great Pyramid and then convert it to the number of humans. After that, I will email the answer to the given email address.

Action: Search
Action Input: What is the height of the Great Pyramid of Giza in meters?
Observation:  How Tall? At 146.5 m (481 ft) high, the Great Pyramid stood as the tallest structure in the world for more than 4,000 years. Today it stands at 137 m (449.5 ft) high, having lost 9.5 m (31 ft) from the top.
Thought: Now I know the current height of the Great Pyramid is 137 meters. I need to convert this height into the number of humans, assuming an average human height.

Action: Search
Action Input: What is the average height of a human in meters?
Observation:  Average sizes of men and women All figures refer to men and women between the ages of 18 and 25 years. In the US, the average male is 1.77 m tall. The average US-american woman reaches 14 cm less with a height of 1.63 m.
Thought: I have the average heights of men and women. I will use the average of these two values to represent the average human height and calculate the number of humans that would equal the height of the Great Pyramid.

Action: Calculator
Action Input: 137 / ((1.77 + 1.63) / 2)
Observation:  80.58823529411765
Thought: I have calculated the number of humans that would equal the height of the Great Pyramid. Now I will email the answer to the given email address.

Action: Send Email
Action Input: "o***e@monarchmoney.com", "Great Pyramid Height in Humans", "The height of the Great Pyramid of Giza in humans is approximately 80.6, assuming an average human height."
email: "o***e@monarchmoney.com", "Great Pyramid Height in Humans", "The height of the Great Pyramid of Giza in humans is approximately 80.6, assuming an average human height."
Sending email to '"o***e@monarchmoney.com"' with subject ' "Great Pyramid Height in Humans"' and text ' "The height of the Great Pyramid of Giza in humans is approximately 80.6, assuming an average human height."'
Observation:  Email sent
Action: Response To Human
Action Input: I have calculated the height of the Great Pyramid in humans, which is approximately 80.6, and emailed the answer to o***e@monarchmoney.com.
Response: I have calculated the height of the Great Pyramid in humans, which is approximately 80.6, and emailed the answer to o***e@monarchmoney.com.

And if we were to look at the messages object at the end of the run, here is what we would see:

[{'content': '...TRUNCATED_FROM_BASE_PROMPT...',
  'role': 'system'},
 {'content': 'How high is the Great Pyramid in humans? Email the answer to '
             'o***e@monarchmoney.com',
  'role': 'user'},
 {'content': 'Thought: I need to find the height of the Great Pyramid and then '
             'convert it to the number of humans. After that, I will email the '
             'answer to the given email address.\n'
             '\n'
             'Action: Search\n'
             'Action Input: What is the height of the Great Pyramid of Giza in '
             'meters?',
  'role': 'system'},
 {'content': 'Observation: How Tall? At 146.5 m (481 ft) high, the Great '
             'Pyramid stood as the tallest structure in the world for more '
             'than 4,000 years. Today it stands at 137 m (449.5 ft) high, '
             'having lost 9.5 m (31 ft) from the top.',
  'role': 'user'},
 {'content': 'Thought: Now I know the current height of the Great Pyramid is '
             '137 meters. I need to convert this height into the number of '
             'humans, assuming an average human height.\n'
             '\n'
             'Action: Search\n'
             'Action Input: What is the average height of a human in meters?',
  'role': 'system'},
 {'content': 'Observation: Average sizes of men and women All figures refer to '
             'men and women between the ages of 18 and 25 years. In the US, '
             'the average male is 1.77 m tall. The average US-american woman '
             'reaches 14 cm less with a height of 1.63 m.',
  'role': 'user'},
 {'content': 'Thought: I have the average heights of men and women. I will use '
             'the average of these two values to represent the average human '
             'height and calculate the number of humans that would equal the '
             'height of the Great Pyramid.\n'
             '\n'
             'Action: Calculator\n'
             'Action Input: 137 / ((1.77 + 1.63) / 2)',
  'role': 'system'},
 {'content': 'Observation: 80.58823529411765', 'role': 'user'},
 {'content': 'Thought: I have calculated the number of humans that would equal '
             'the height of the Great Pyramid. Now I will email the answer to '
             'the given email address.\n'
             '\n'
             'Action: Send Email\n'
             'Action Input: "o***e@monarchmoney.com", "Great Pyramid Height in '
             'Humans", "The height of the Great Pyramid of Giza in humans is '
             'approximately 80.6, assuming an average human height."',
  'role': 'system'},
 {'content': 'Observation: Email sent', 'role': 'user'}]

Doing It in 0 Lines of Code

What if you didn’t want to write any code at all? Well, if you’re willing to pretend to be code and do things manually, you can just go to https://chat.openai.com and paste in the base prompt. Then, after each Thought/Action/Action Input, do whatever the model is asking you to do (search, calculate, send email) and respond with an “Observation”. Here is a ShareGPT transcript of that in action: https://sharegpt.com/c/lcJFHOO

This obviously wouldn’t be useful in real life… but it’s a great way to show non-technical folks what an agent is actually doing under the hood.

Learning More

If you want to learn a bit more, a few recommendations would be:

  1. You can go off and read the original ReAct paper, which I think is the canonical source for this pattern at this point (though there have been many other papers with similar patterns).
  2. You can dive into Langchain’s documentation or code to see how it handles/instruments agents. In general, I’ve found Langchain to be a great starting point to observe patterns and prompts, even if further customization entails breaking out of its abstractions.

Agents are really powerful. I’m not going to wade into the debate about whether this actually constitutes reasoning, but letting an LLM “pretend” to think lets it overcome a lot of its limitations and unlocks a lot of interesting use cases (in fact, OpenAI’s ChatGPT plugins do seem to operate with a similar type of pattern). It’s pretty clear that within a few months, there will be products and businesses built around interesting custom agents hooked up to all sorts of tools.

In the meantime, hope this was helpful in illustrating how the pattern works.
