OpenAI's Agent building guide - summary

Apr 20, 2025 - 12:15

On Thursday, 17 April 2025, OpenAI published their PDF on building agents: "A Practical Guide to Building Agents".

I took some time over the weekend to read through it and see what I could distill from it.

Here are my take-aways.

Workflow: Steps taken towards a goal. Some steps are deterministic and some are non-deterministic.

Traditional Software: Digitised, streamlined and deterministic steps, with humans handling the non-deterministic steps that stitch them together into a complete workflow.

Agents: Independent, non-deterministic steps stitched together by tool calls, which may or may not be deterministic (traditional software or other agents).

Examples of non-agent software using machine learning:

  • Translation
  • Transcription
  • Labelling
  • Sentiment analysis
  • Summary generation

These are all single steps that humans can execute. Humans would work towards a goal, use these pieces of software to gather information and manipulate the world in order to reach some end goal.

Agents can take over some of the non-deterministic interpretation steps that are currently performed by humans.

Agents can, in non-deterministic settings:

  • make decisions
  • recognise when a goal is reached
  • recognise when they need to make a course correction
  • halt execution and involve a human

Tools: Access to the world beyond the agent. Either to gather context, or to take actions.

Examples:

  • internet APIs (weather API)
  • local APIs (computer use)
  • physical APIs (robotic limbs, alarm systems)
  • other agents

Potential use-cases for agents:
Workflows that previously resisted automation.

  • complex decisions
  • extensive rulesets with costly, error-prone updates
  • unstructured data - hard for traditional software to interpret.

Agentic Cycle

This cycle represents the "run loop" that an agent cycles through, evaluating what it knows, determining if it needs more context or needs to take action and comparing its progress with its goal.
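A minimal sketch of such a run loop in Python, assuming a scripted function stands in for the model. The names Step, scripted_model and run_agent are made up for illustration and do not come from the guide.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One decision made by the (stubbed) model."""
    kind: str            # "tool", "finish" or "ask_human"
    tool: str = ""
    argument: str = ""
    answer: str = ""


def scripted_model(goal: str, context: list[str]) -> Step:
    """Hypothetical stand-in for an LLM: gather context once, then finish."""
    if not context:
        return Step(kind="tool", tool="lookup", argument=goal)
    return Step(kind="finish", answer=f"{goal}: {context[-1]}")


def run_agent(goal, model, tools, max_turns=5) -> str:
    context: list[str] = []
    for _ in range(max_turns):
        step = model(goal, context)      # evaluate what is known so far
        if step.kind == "finish":        # the goal has been reached
            return step.answer
        if step.kind == "ask_human":     # halt and involve a human
            return "needs human: " + step.answer
        # Otherwise call a tool and add its result to the context.
        context.append(tools[step.tool](step.argument))
    return "needs human: turn limit reached"


tools = {"lookup": lambda query: f"result for {query!r}"}
print(run_agent("weather in Oslo", scripted_model, tools))
```

The turn limit is one simple way to make sure a loop that never reaches its goal still ends up in front of a human.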

Agent components

  • Model: LLM driving the independent workflow.
  • Tools: Capabilities available to the agent.
  • Instructions: Guidelines and guardrails.

Model selection

Start with the most capable model, then optimise with smaller models once their outcomes can be compared against that benchmark.
Considerations:

  • local vs hosted
  • context window
  • latency
  • cost
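One way to picture "start smart, then optimise" is a small evaluation harness. This is only a sketch: the three models are stub functions, and the costs, test cases and tolerance are invented numbers, not figures from the guide.

```python
# Hypothetical evaluation cases: (question, expected answer).
CASES = [
    ("2+2", "4"),
    ("capital of France", "Paris"),
    ("3*3", "9"),
    ("opposite of hot", "cold"),
]
ANSWERS = dict(CASES)


def large_model(question):
    return ANSWERS[question]


def medium_model(question):
    return ANSWERS[question]


def small_model(question):
    # The small stub gets one case wrong.
    return "unknown" if question == "opposite of hot" else ANSWERS[question]


# (name, made-up cost per call, callable); the first entry is the baseline.
CANDIDATES = [
    ("large", 10.0, large_model),
    ("medium", 3.0, medium_model),
    ("small", 1.0, small_model),
]


def accuracy(model, cases):
    return sum(model(question) == expected for question, expected in cases) / len(cases)


def pick_model(candidates, cases, tolerance=0.05):
    """Return the cheapest model whose accuracy stays within tolerance of the baseline."""
    baseline = accuracy(candidates[0][2], cases)
    acceptable = [
        (cost, name)
        for name, cost, model in candidates
        if accuracy(model, cases) >= baseline - tolerance
    ]
    return min(acceptable)[1]


print(pick_model(CANDIDATES, CASES))
```

The same harness could weigh latency or context window instead of cost; cost is used here only because it is the easiest to show.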

Tools

Documenting tools thoroughly makes them discoverable to agents. It helps the model understand what they can be used for and what they should not be used for.

Three types:

  • Data gathering tools
  • Tools to take action in the world
  • Orchestration - tools (agents) to orchestrate other agents.
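To illustrate the point about documentation: a tool's docstring and type hints can be turned into the description a model sees. This is a rough sketch using the standard inspect module; get_weather and describe_tool are hypothetical names, and the output format is not any particular framework's schema.

```python
import inspect


def get_weather(city: str) -> str:
    """Return the current weather for a city.

    Use for present conditions only. Do not use for forecasts.
    """
    return f"Sunny in {city}"  # stubbed result, no real API call


def describe_tool(func) -> dict:
    """Build a plain description of a tool from its signature and docstring."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            name: getattr(parameter.annotation, "__name__", str(parameter.annotation))
            for name, parameter in signature.parameters.items()
        },
    }


print(describe_tool(get_weather))
```

Stating what a tool should not be used for in the docstring puts that limit where the model will actually read it.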

Instructions

Clarity:
Make instructions as clear as possible. Clear instructions lead to:

  • reduced ambiguity
  • better decisions
  • smoother workflow execution
  • fewer errors

Strategy:
Instruct the agent to break down tasks.

Be explicit about:

  • which actions to take
  • which tools to use.

Tool use can ensure that deterministic steps stay deterministic, and it helps avoid hallucinations.

Context and edge cases:
Preempt and explain known edge cases.
Provide clear instructions for handling them.
Leverage existing business documents like customer support scripts, how-to guides and other instruction manuals.
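As a sketch of how these three points could come together, the helper below assembles an instruction text from explicit steps, the tool for each step, and known edge cases. The function name, the role and the example content are all invented for illustration.

```python
def build_instructions(role, steps, edge_cases):
    """Assemble numbered steps, their tools and edge cases into one instruction text."""
    lines = [f"You are {role}.", "", "Follow these steps in order:"]
    for number, (action, tool) in enumerate(steps, start=1):
        lines.append(f"{number}. {action} (use tool: {tool})")
    lines.append("")
    lines.append("Edge cases:")
    for situation, handling in edge_cases.items():
        lines.append(f"- If {situation}, {handling}.")
    return "\n".join(lines)


# Hypothetical content, as it might be lifted from a customer support script.
instructions = build_instructions(
    role="a support agent for an online shop",
    steps=[
        ("Look up the order", "get_order"),
        ("Check whether the order can be refunded", "check_refund_policy"),
    ],
    edge_cases={
        "the order is missing": "ask the customer for the order number",
    },
)
print(instructions)
```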

Orchestration

Agents with a very complex workflow, or with access to a large number of tools, are more prone to getting confused.
Simpler agents with more focussed tools are better:

  • Separation of concerns: less "cognitive" load.
  • Simpler agents are easier to validate and optimise.
  • More composable agents allow for more dynamic workflows.

Orchestration lets different agents work together by making them known to each other and providing good documentation about what each agent is good at.

There are two types of orchestration:

Manager: One agent with access to the other agents as tools. It calls them and waits on them as part of its own workflow.

Example: "Translate this text into French, Italian and Spanish". The manager sends the request to an agent focussed on each language, collects the responses and returns the result to the user.
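A toy version of the manager pattern, assuming each language agent is just a stub function that tags the text instead of translating it. make_translator, AGENTS and manager are illustrative names only.

```python
def make_translator(language):
    """Create a stub "agent" for one language (no real translation happens)."""
    def translate(text):
        return f"[{language}] {text}"
    return translate


# The manager sees the other agents as tools it can call by name.
AGENTS = {
    language: make_translator(language)
    for language in ("French", "Italian", "Spanish")
}


def manager(text, languages):
    """Call one agent per language, wait for each, and collect the results."""
    return {language: AGENTS[language](text) for language in languages}


print(manager("Hello", ["French", "Italian", "Spanish"]))
```

The manager stays in control throughout: every sub-agent returns to it, and only the manager answers the user.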

Decentralised agents: Agents that know about other agents as tools and can determine that the workflow should be handed over to a specific agent, terminating their own involvement in the workflow.

Example:
A welcome agent can receive a customer call, determine the nature of the call and transfer the request to an agent that can take over: an agent to handle complaints, an agent to handle refunds, etc.
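The handoff can be sketched as agents that return either a reply or the name of the agent that should take over. The keyword matching below is a crude stand-in for the model's judgement, and all names are hypothetical.

```python
def welcome_agent(message):
    """Decide who should handle the request; keyword checks stand in for an LLM."""
    text = message.lower()
    if "refund" in text:
        return ("handoff", "refunds")
    if "complain" in text:
        return ("handoff", "complaints")
    return ("reply", "How can I help you today?")


def refunds_agent(message):
    return ("reply", "Refund started for: " + message)


def complaints_agent(message):
    return ("reply", "Complaint logged: " + message)


AGENTS = {
    "welcome": welcome_agent,
    "refunds": refunds_agent,
    "complaints": complaints_agent,
}


def run(message, start="welcome", max_handoffs=3):
    """Follow handoffs until some agent replies; the previous agent drops out."""
    current = start
    for _ in range(max_handoffs + 1):
        kind, value = AGENTS[current](message)
        if kind == "reply":
            return current, value
        current = value  # hand the workflow over to the named agent
    raise RuntimeError("too many handoffs")


print(run("I want a refund"))
```

Unlike the manager pattern, control never returns to the welcome agent once it has handed over.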

Guardrails

Guardrails can be implemented as agents that evaluate the output of the agents working on reaching the goal.

Layered approach:
Start with general guards such as:

  • data privacy risks
  • reputation risks

Other types of guards:

  • relevance
  • safety
  • PII
  • moderation
  • tool risks
  • output validation

Lastly, guardrails can be added against specific real-world cases where previous guards failed.
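A small sketch of the layered approach, with plain functions instead of guard agents. The email pattern is deliberately simplistic, and the "guaranteed refund" rule is an invented example of a guard added after a real-world failure.

```python
import re

# Deliberately simple pattern; real PII detection needs far more than this.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def pii_guard(text):
    """General guard: flag output that contains an email address."""
    return "contains an email address" if EMAIL.search(text) else None


def promise_guard(text):
    """Specific guard, added after a (hypothetical) case the general guards missed."""
    return "promises a guaranteed refund" if "guaranteed refund" in text.lower() else None


# Layers are ordered from general to specific.
GUARDS = [pii_guard, promise_guard]


def check(text, guards):
    """Run every guard and return the list of violations found."""
    return [message for guard in guards if (message := guard(text)) is not None]


print(check("Contact me at jane@example.com", GUARDS))
```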

Optimistic execution: Guards can run concurrently with some of the steps and only intervene when their conditions are breached. At that point human intervention might be necessary.
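Optimistic execution could look roughly like this with asyncio: the agent step starts immediately, the guard runs alongside it, and the step is cancelled only if the guard trips. The sleeps simulate work, and the password check is a made-up condition.

```python
import asyncio


class GuardrailTripped(Exception):
    """Raised by a guard when its condition is breached."""


async def agent_step(request):
    await asyncio.sleep(0.05)  # simulated model or tool work
    return f"done: {request}"


async def guard(request):
    await asyncio.sleep(0.01)  # simulated guard evaluation
    if "password" in request.lower():
        raise GuardrailTripped("request mentions a password")


async def run_optimistically(request):
    # Start the real work right away instead of waiting for the guard.
    work = asyncio.create_task(agent_step(request))
    try:
        await guard(request)  # runs while the work task makes progress
    except GuardrailTripped as exc:
        work.cancel()  # the guard intervenes: stop the step
        await asyncio.gather(work, return_exceptions=True)
        return f"escalated to human: {exc}"
    return await work


print(asyncio.run(run_optimistically("reset my email")))
```

When the guard passes, the user never waits for it separately, because both ran at the same time.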

Conclusion

Agents represent a new era where software can reason through ambiguity and navigate complex workflows autonomously.

Foundations of good agents:

  • capable models
  • well-defined tools
  • clear instructions

Start simple, with a single agent.
Validate the use case.
Evolve into multi-agent orchestration when it becomes necessary.