from cachy import enable_cachy
enable_cachy()
Agent
m = "anthropic/claude-haiku-4-5"
tool_ctx is injected into tools automatically, and shouldn't be included in the schema passed to the LLM.
mk_schema
def mk_schema(
fn, strip:tuple=('tool_ctx',)
):
Call self as a function.
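A minimal sketch of what schema stripping can look like (not the actual implementation; `mk_schema_sketch` and its type map are illustrative, assuming an `inspect`-based signature walk and an OpenAI-style tool schema):

```python
import inspect

def mk_schema_sketch(fn, strip: tuple = ('tool_ctx',)):
    "Illustrative sketch: build an OpenAI-style tool schema, omitting params named in `strip`."
    type_map = {int: 'integer', str: 'string', float: 'number', bool: 'boolean'}
    props = {name: {'type': type_map.get(p.annotation, 'string')}
             for name, p in inspect.signature(fn).parameters.items() if name not in strip}
    return {'type': 'function',
            'function': {'name': fn.__name__, 'description': fn.__doc__ or '',
                         'parameters': {'type': 'object', 'properties': props,
                                        'required': list(props)}}}

def save(result: str, tool_ctx):
    "Saves some result into storage"

schema = mk_schema_sketch(save)
# tool_ctx never appears in the schema the LLM sees
```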
Agent
def Agent(
model:str, # model str in litellm format
name:str='', # name of the agent
desc:str='', # what the agent does (used by parent agents to decide when to transfer)
sp:NoneType=None, # the system prompt
tools:NoneType=None, # a list of functions to provide to the agent
subagents:NoneType=None, # a list of subagents that this agent can transfer to
narrative_cast:bool=True, # whether to apply narrative casting to messages from prior agents
include_hist:bool=True, # if False: will remove all history that doesn't belong to this agent on call
before_cb:NoneType=None, # a callback func(state:dict, msgs:list). Any non-None return will skip the llm
after_cb:NoneType=None, # a callback func(state:dict, response) that can edit and modify the llm response
output_key:NoneType=None, # saves model response to state with a keyname defined by output_key
):
Initialize self. See help(type(self)) for accurate signature.
The simplest call function. It is later replaced with a more featureful version, which lets us build up the Runner first and then add the advanced features in a more structured way.
@patch
def __call__(self:Agent, msgs:list, state=None, **kwargs):
if self.sp: msgs = [{"content": self.sp, "role": "system"}, *msgs]
return completion(model=self.model, messages=msgs, tools=self.tool_schemas, **kwargs)
Runner
call_llm_tool
def call_llm_tool(
tc_func:dict, tool_map:dict, state:dict
):
Call self as a function.
mk_transfer_doc
def mk_transfer_doc(
agent, parent
):
Call self as a function.
with_model_nm
def with_model_nm(
llm_output, nm:str
):
Call self as a function.
Runner
def Runner(
agent:Agent
):
Initialize self. See help(type(self)) for accurate signature.
Full agent example (1)
def mult(a:int, b:int):
"Multiplies two numbers together"
return a * b
def save(result:str, tool_ctx):
"Saves a result into storage"
tool_ctx["final_result"] = result
c1 = Agent(m, name="saver", desc="Saves results into storage.", sp="You receive results from other agents and save them with your tool.", tools=[save])
a1 = Agent(m, name="math_agent", desc="Is good at doing maths.", sp="Always do maths using tools and always pass to saver to save results.", tools=[mult], subagents=[c1])
a2 = Agent(m, name="root", desc="General agent", sp="A general coordinator that delegates to sub agents as needed", subagents=[a1])
r = Runner(a2)
r("What's 2592321 x 235183")
🔧 transfer_to_agent({"agent_name": "math_agent"})
- id:
chatcmpl-88c76514-666c-43ea-977e-c22426dc3dff - model:
claude-haiku-4-5-20251001 - finish_reason:
tool_calls - usage:
Usage(completion_tokens=74, prompt_tokens=614, total_tokens=688, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=0, rejected_prediction_tokens=None, text_tokens=74, image_tokens=None, video_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=0, text_tokens=None, image_tokens=None, video_tokens=None, cache_creation_tokens=0, cache_creation_token_details=CacheCreationTokenDetails(ephemeral_5m_input_tokens=0, ephemeral_1h_input_tokens=0)), cache_creation_input_tokens=0, cache_read_input_tokens=0, inference_geo='not_available', speed=None)
🔧 mult({"a": 2592321, "b": 235183})
- id:
chatcmpl-5449c005-c701-4223-9c0e-67f5a4e610c8 - model:
claude-haiku-4-5-20251001 - finish_reason:
tool_calls - usage:
Usage(completion_tokens=80, prompt_tokens=790, total_tokens=870, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=0, rejected_prediction_tokens=None, text_tokens=80, image_tokens=None, video_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=0, text_tokens=None, image_tokens=None, video_tokens=None, cache_creation_tokens=0, cache_creation_token_details=CacheCreationTokenDetails(ephemeral_5m_input_tokens=0, ephemeral_1h_input_tokens=0)), cache_creation_input_tokens=0, cache_read_input_tokens=0, inference_geo='not_available', speed=None)
🔧 transfer_to_agent({"agent_name": "saver"})
- id:
chatcmpl-ed04a5b5-4173-4970-ad58-60da19e14f4c - model:
claude-haiku-4-5-20251001 - finish_reason:
tool_calls - usage:
Usage(completion_tokens=66, prompt_tokens=886, total_tokens=952, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=0, rejected_prediction_tokens=None, text_tokens=66, image_tokens=None, video_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=0, text_tokens=None, image_tokens=None, video_tokens=None, cache_creation_tokens=0, cache_creation_token_details=CacheCreationTokenDetails(ephemeral_5m_input_tokens=0, ephemeral_1h_input_tokens=0)), cache_creation_input_tokens=0, cache_read_input_tokens=0, inference_geo='not_available', speed=None)
2592321 × 235183 = 609,669,829,743
Would you like me to save this result?
- id:
chatcmpl-dc914dde-6f8b-4855-834e-96c1de6a2c47 - model:
claude-haiku-4-5-20251001 - finish_reason:
stop - usage:
Usage(completion_tokens=47, prompt_tokens=941, total_tokens=988, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=0, rejected_prediction_tokens=None, text_tokens=47, image_tokens=None, video_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=0, text_tokens=None, image_tokens=None, video_tokens=None, cache_creation_tokens=0, cache_creation_token_details=CacheCreationTokenDetails(ephemeral_5m_input_tokens=0, ephemeral_1h_input_tokens=0)), cache_creation_input_tokens=0, cache_read_input_tokens=0, inference_geo='not_available', speed=None)
Choices(finish_reason='stop', index=0, message=Message(content='I apologize for the confusion above. Let me calculate this directly:\n\n2592321 × 235183 = **609,669,829,743**\n\nWould you like me to save this result?', role='assistant', tool_calls=None, function_call=None, provider_specific_fields={'citations': None, 'thinking_blocks': None}))
r.state
{}
Note: although the agent transferring and tool calls are working, the LLM gets confused when seeing messages from previous agents, thinking that it was the one that made those responses. This is fixed with narrative casting (below).
Advanced features
test_msgs = [
{"role": "user", "content": "What can you help me with?"},
{"role": "assistant", "agent": "main", "content": "Ask me any question you want."},
{"role": "user", "content": "What's the weather?"},
{"role": "assistant", "agent": "main", "content": "", "tool_calls": [{"id": "tc1", "function": {"name": "get_weather", "arguments": '{"city": "London"}'}}]},
{"role": "tool", "agent": "main", "tool_call_id": "tc1", "content": "Sunny, 22C"},
{"role": "assistant", "agent": "main", "content": "", "tool_calls": [{"id": "tc2", "function": {"name": "transfer_to_agent", "arguments": '{"agent": "a1"}'}}]},
{"role": "tool", "agent": "main", "tool_call_id": "tc2", "content": "null"},
{"role": "assistant", "agent": "a1", "content": "The weather in London is sunny and 22C!"},
]
Preprocessing
Narrative casting
Narrative casting means transforming messages from prior agents so that they are:
- Summarized
- Tagged as coming from different agents
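The core move can be sketched as below (a simplified stand-in for mk_narrative_cast; the real version also folds tool calls and results into the narration, as the preproc_msg outputs further down show):

```python
def narrative_cast_sketch(msgs: list, active_agent: str):
    "Simplified sketch: retag other agents' assistant turns as tagged user messages."
    out = []
    for m in msgs:
        agent = m.get('agent')
        if m['role'] == 'assistant' and agent and agent != active_agent:
            # another agent spoke: present it as narration, not as our own words
            out.append({'role': 'user', 'content': f"[Agent '{agent}'] {m['content']}"})
        else:
            out.append(m)
    return out

msgs = [{'role': 'user', 'content': 'Hi'},
        {'role': 'assistant', 'agent': 'main', 'content': 'Hello!'},
        {'role': 'assistant', 'agent': 'a1', 'content': 'Done.'}]
cast = narrative_cast_sketch(msgs, 'a1')
```

The active agent's own turns pass through untouched, so it still sees its history as its own.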
mk_narrative_cast
def mk_narrative_cast(
msg_list:list, active_agent:str
):
Call self as a function.
msg_fragment
def msg_fragment(
d:dict, results:dict
):
Call self as a function.
History filtering
For a long set of agent calls, it can be worthwhile for some subagent to ignore most of the history in order to preserve context.
One pattern for agent communication is to use state instead of history:
- Agent 1 sets state that agent 2 needs
- Agent 2 reads that state (with automatic state insertion below) and sets state for agent 3
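Stripped of the LLM calls, the state-passing pattern reduces to this (hypothetical helper names; in practice output_key writes the state and the {placeholders} in sp read it via state insertion):

```python
state = {}

# Agent 1 saves its result to state (as output_key would)
def agent1_step(state):
    state['summary'] = 'Q3 revenue grew 12%'

# Agent 2's system prompt reads it (as preproc_msg's state insertion does with sp)
def agent2_prompt(state):
    return "Write a report based on: {summary}".format(**state)

agent1_step(state)
prompt = agent2_prompt(state)
```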
filter_out_history
def filter_out_history(
msgs:list, active_agent:str
):
Call self as a function.
filter_out_history(test_msgs[:1], "a1")
[{'role': 'user', 'content': 'What can you help me with?'}]
filter_out_history(test_msgs[:2], "a1")
[{'role': 'assistant',
'agent': 'main',
'content': 'Ask me any question you want.'}]
filter_out_history(test_msgs[:3], "a1")
[{'role': 'user', 'content': "What's the weather?"}]
filter_out_history(test_msgs[:6], "a1")
[{'role': 'assistant',
'agent': 'main',
'content': '',
'tool_calls': [{'id': 'tc2',
'function': {'name': 'transfer_to_agent',
'arguments': '{"agent": "a1"}'}}]}]
Preproc
Agent.preproc_msg
def preproc_msg(
msgs:list | str, state:NoneType=None
):
Call self as a function.
# Simple str input
Agent(m).preproc_msg("test")
[{'content': 'test', 'role': 'user'}]
# System prompt (with state insertion)
Agent(m, name="a1", sp="You help {user} to do {task}").preproc_msg("test", state={"user": "PersonA", "task": "maths", "extra": 5})
[{'content': 'You help PersonA to do maths', 'role': 'system'},
{'content': 'test', 'role': 'user'}]
# Narrative casting (default)
Agent(m, name="a1").preproc_msg(test_msgs)
[{'role': 'user', 'content': 'What can you help me with?'},
{'role': 'user', 'content': "[Agent 'main'] Ask me any question you want."},
{'role': 'user', 'content': "What's the weather?"},
{'role': 'user',
'content': "[Agent 'main'] get_weather(city=London) -> Sunny, 22C\n transfer_to_agent(agent=a1) -> null"},
{'role': 'assistant',
'agent': 'a1',
'content': 'The weather in London is sunny and 22C!'}]
# Don't include full history for subagents
Agent(m, name="a1", include_hist=False).preproc_msg(test_msgs)
[{'role': 'user',
'content': "[Agent 'main'] transfer_to_agent(agent=a1) -> null"},
{'role': 'assistant',
'agent': 'a1',
'content': 'The weather in London is sunny and 22C!'}]
Callbacks and output_key
mk_cb_msg
def mk_cb_msg(
msg:str | functools.partial | litellm.types.utils.ModelResponse
):
Call self as a function.
tool_response
def tool_response(
name:str, arguments:dict
):
Call self as a function.
assistant_response
def assistant_response(
content:str
):
Call self as a function.
Agent.run_callback
def run_callback(
fn, msgs:list, state:dict
):
Call self as a function.
Agent.maybe_save_to_output_key
def maybe_save_to_output_key(
state, response
):
Call self as a function.
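The output_key mechanic reduces to roughly this (a sketch with a made-up name; the real method extracts the text from the litellm response object):

```python
def maybe_save_sketch(state: dict, response_text: str, output_key=None):
    "Sketch: if an output_key is configured, store the response text in state under it."
    if output_key: state[output_key] = response_text
    return response_text

state = {}
maybe_save_sketch(state, '609669829743', output_key='answer')
# state is now {'answer': '609669829743'}
```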
Agent.__call__
def __call__(
msgs:list | str, state:NoneType=None, kwargs:VAR_KEYWORD
):
Call self as a function.
# Before callback returning str
def skip_with_str(state, msgs): return "Skipped from before_cbs"
Agent(m, before_cb=skip_with_str)("Test")
Skipped from before_cbs
- id:
chatcmpl-0967fb02-ac50-4a56-8c44-0082a7a53e53 - model:
None - finish_reason:
stop
# Before callback returning partial function
def skip_with_partial(state, msgs): return partial(mult, a=3, b=5)
Agent(m, before_cb=skip_with_partial)("Test")
🔧 mult({"a": 3, "b": 5})
- id:
chatcmpl-ff8f5441-9d3f-4643-b4b4-2aa1d5086ef0 - model:
None - finish_reason:
tool_calls
# Before callback modifying msgs
def modify_msgs(state, msgs): msgs[-1]["content"] = "What is 5 - 1?"
Agent(m, before_cb=modify_msgs)([{"role": "user", "content": "Test"}])
5 - 1 = 4
- id:
chatcmpl-3e0be55c-2aa6-4acc-9d2d-1727b982853c - model:
claude-haiku-4-5-20251001 - finish_reason:
stop - usage:
Usage(completion_tokens=13, prompt_tokens=16, total_tokens=29, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=0, rejected_prediction_tokens=None, text_tokens=13, image_tokens=None, video_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=0, text_tokens=None, image_tokens=None, video_tokens=None, cache_creation_tokens=0, cache_creation_token_details=CacheCreationTokenDetails(ephemeral_5m_input_tokens=0, ephemeral_1h_input_tokens=0)), cache_creation_input_tokens=0, cache_read_input_tokens=0, inference_geo='not_available', speed=None)
# After callback replacing result
def replace_res(state, response): return assistant_response("I have replaced the response!")
Agent(m, after_cb=replace_res)("Test")
I have replaced the response!
- id:
chatcmpl-786e97c5-273a-4e47-8172-10d7687cb60f - model:
None - finish_reason:
stop
# Use before callback to populate state
def get_initial_state(state, msgs):
state["val"] = 5
state["name"] = "Smoke"
a = Agent(m, name="poemer", sp="Make a tiny poem about val={val} and name={name}", before_cb=get_initial_state)
Runner(a)("Start")
Smoke drifts through air so free,
With val of five, light as can be,
A wispy name, a gentle way,
Val=5, Smoke holds sway.
- id:
chatcmpl-66e8d992-7e6d-461c-8e4b-5bb9641b312f - model:
claude-haiku-4-5-20251001 - finish_reason:
stop - usage:
Usage(completion_tokens=51, prompt_tokens=22, total_tokens=73, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=0, rejected_prediction_tokens=None, text_tokens=51, image_tokens=None, video_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=0, text_tokens=None, image_tokens=None, video_tokens=None, cache_creation_tokens=0, cache_creation_token_details=CacheCreationTokenDetails(ephemeral_5m_input_tokens=0, ephemeral_1h_input_tokens=0)), cache_creation_input_tokens=0, cache_read_input_tokens=0, inference_geo='not_available', speed=None)
Choices(finish_reason='stop', index=0, message=Message(content="# Smoke's Number\n\nSmoke drifts through air so free,\nWith val of five, light as can be,\nA wispy name, a gentle way,\nVal=5, Smoke holds sway.", role='assistant', tool_calls=None, function_call=None, provider_specific_fields={'citations': None, 'thinking_blocks': None}))
Full agent example (2)
r2 = Runner(a2)
r2("What's 359231 x 235981")
🔧 transfer_to_agent({"agent_name": "math_agent"})
- id:
chatcmpl-3bd94f0d-02c2-48b9-9bfc-143671d1818c - model:
claude-haiku-4-5-20251001 - finish_reason:
tool_calls - usage:
Usage(completion_tokens=60, prompt_tokens=613, total_tokens=673, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=0, rejected_prediction_tokens=None, text_tokens=60, image_tokens=None, video_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=0, text_tokens=None, image_tokens=None, video_tokens=None, cache_creation_tokens=0, cache_creation_token_details=CacheCreationTokenDetails(ephemeral_5m_input_tokens=0, ephemeral_1h_input_tokens=0)), cache_creation_input_tokens=0, cache_read_input_tokens=0, inference_geo='not_available', speed=None)
🔧 mult({"a": 359231, "b": 235981})
- id:
chatcmpl-e3eebccc-ce1f-4571-87f1-65cf84cffc97 - model:
claude-haiku-4-5-20251001 - finish_reason:
tool_calls - usage:
Usage(completion_tokens=86, prompt_tokens=725, total_tokens=811, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=0, rejected_prediction_tokens=None, text_tokens=86, image_tokens=None, video_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=0, text_tokens=None, image_tokens=None, video_tokens=None, cache_creation_tokens=0, cache_creation_token_details=CacheCreationTokenDetails(ephemeral_5m_input_tokens=0, ephemeral_1h_input_tokens=0)), cache_creation_input_tokens=0, cache_read_input_tokens=0, inference_geo='not_available', speed=None)
Now let me save this result:
🔧 transfer_to_agent({"agent_name": "saver"})
- id:
chatcmpl-f2b764c5-cf9c-479d-8c25-80bd492358d9 - model:
claude-haiku-4-5-20251001 - finish_reason:
tool_calls - usage:
Usage(completion_tokens=81, prompt_tokens=827, total_tokens=908, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=0, rejected_prediction_tokens=None, text_tokens=81, image_tokens=None, video_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=0, text_tokens=None, image_tokens=None, video_tokens=None, cache_creation_tokens=0, cache_creation_token_details=CacheCreationTokenDetails(ephemeral_5m_input_tokens=0, ephemeral_1h_input_tokens=0)), cache_creation_input_tokens=0, cache_read_input_tokens=0, inference_geo='not_available', speed=None)
🔧 save({"result": "359231 × 235981 = 84771690611"})
- id:
chatcmpl-850f306c-27a0-4747-adc2-87acbd249524 - model:
claude-haiku-4-5-20251001 - finish_reason:
tool_calls - usage:
Usage(completion_tokens=65, prompt_tokens=780, total_tokens=845, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=0, rejected_prediction_tokens=None, text_tokens=65, image_tokens=None, video_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=0, text_tokens=None, image_tokens=None, video_tokens=None, cache_creation_tokens=0, cache_creation_token_details=CacheCreationTokenDetails(ephemeral_5m_input_tokens=0, ephemeral_1h_input_tokens=0)), cache_creation_input_tokens=0, cache_read_input_tokens=0, inference_geo='not_available', speed=None)
- id:
chatcmpl-897fe0b2-203e-40c2-bfe6-dc1ba14194a9 - model:
claude-haiku-4-5-20251001 - finish_reason:
stop - usage:
Usage(completion_tokens=37, prompt_tokens=858, total_tokens=895, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=0, rejected_prediction_tokens=None, text_tokens=37, image_tokens=None, video_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=0, text_tokens=None, image_tokens=None, video_tokens=None, cache_creation_tokens=0, cache_creation_token_details=CacheCreationTokenDetails(ephemeral_5m_input_tokens=0, ephemeral_1h_input_tokens=0)), cache_creation_input_tokens=0, cache_read_input_tokens=0, inference_geo='not_available', speed=None)
Choices(finish_reason='stop', index=0, message=Message(content="Perfect! I've saved the result. The multiplication of **359231 × 235981 = 84,771,690,611** has been stored.", role='assistant', tool_calls=None, function_call=None, provider_specific_fields={'citations': None, 'thinking_blocks': None}))
r2.state
{'final_result': '359231 × 235981 = 84771690611'}
Sequential
Sequential
def Sequential(
agent_list:list
):
Initialize self. See help(type(self)) for accurate signature.
a1s = Agent(m, name="mathjoke", sp="Your job is to say a maths joke")
a2s = Agent(m, name="sciencsejoke", sp="Your job is to say a science joke")
a3s = Agent(m, name="complainer", sp="You will see context from other agents that are saying jokes. Your job is to complain about them.")
s = Sequential([a1s, a2s, a3s])
s(start_msg="start now")
Because it knew it wasn't greater than or less than anyone else! 😄
- id:
chatcmpl-37010874-f3ef-49f9-b1fb-548c94d7ad51 - model:
claude-haiku-4-5-20251001 - finish_reason:
stop - usage:
Usage(completion_tokens=29, prompt_tokens=18, total_tokens=47, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=0, rejected_prediction_tokens=None, text_tokens=29, image_tokens=None, video_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=0, text_tokens=None, image_tokens=None, video_tokens=None, cache_creation_tokens=0, cache_creation_token_details=CacheCreationTokenDetails(ephemeral_5m_input_tokens=0, ephemeral_1h_input_tokens=0)), cache_creation_input_tokens=0, cache_read_input_tokens=0, inference_geo='not_available', speed=None)
Because it was traveling light! 💡
- id:
chatcmpl-7c2dab86-0c27-491c-94e2-ff781628d451 - model:
claude-haiku-4-5-20251001 - finish_reason:
stop - usage:
Usage(completion_tokens=27, prompt_tokens=51, total_tokens=78, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=0, rejected_prediction_tokens=None, text_tokens=27, image_tokens=None, video_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=0, text_tokens=None, image_tokens=None, video_tokens=None, cache_creation_tokens=0, cache_creation_token_details=CacheCreationTokenDetails(ephemeral_5m_input_tokens=0, ephemeral_1h_input_tokens=0)), cache_creation_input_tokens=0, cache_read_input_tokens=0, inference_geo='not_available', speed=None)
That math joke? “The equal sign wasn’t greater than or less than anyone else”? That’s not clever, that’s just… painful. It’s forcing a pun out of basic mathematical symbols. We get it, equal sign = equality metaphor. Where’s the actual humor? This feels like something a textbook would say to try to make algebra “fun” for bored middle schoolers.
And don’t even get me started on the photon joke. “Traveling light” because photons ARE light? That’s the laziest setup-and-punchline I’ve heard all week. There’s no wit, no timing, no surprise twist. It’s just taking a literal property of photons and calling it a joke. A hotel joke that doesn’t even make sense logically—why would a photon need to check into a hotel in the first place?
Both of these feel like they were generated by someone who just learned what puns are and decided that any word-double-meaning automatically equals comedy. Hard pass on both. 😤
- id:
chatcmpl-d60b8526-0e9c-4bdf-9315-4a19d630ad0f - model:
claude-haiku-4-5-20251001 - finish_reason:
stop - usage:
Usage(completion_tokens=238, prompt_tokens=98, total_tokens=336, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=None, audio_tokens=None, reasoning_tokens=0, rejected_prediction_tokens=None, text_tokens=238, image_tokens=None, video_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=0, text_tokens=None, image_tokens=None, video_tokens=None, cache_creation_tokens=0, cache_creation_token_details=CacheCreationTokenDetails(ephemeral_5m_input_tokens=0, ephemeral_1h_input_tokens=0)), cache_creation_input_tokens=0, cache_read_input_tokens=0, inference_geo='not_available', speed=None)