4.1 KiB
4.1 KiB
Methodology: Implementing Tool Calls over a Plain Chat API
This document describes a practical pattern for supporting tool calling when the model only exposes a plain chat API.
The core idea is:
- Convert downstream tool definitions into a prompt-level contract.
- Ask the model to emit structured action text.
- Parse that action text in the proxy.
- Re-encode it back into standard protocol fields such as OpenAI
tool_callsor Anthropictool_use.
Core Pattern
When the model does not support native tool calls, do not rely on blindly forwarding tools.
Instead:
- treat the model as a text generator
- define a stable action DSL
- keep the proxy responsible for state, retries, parsing, and protocol mapping
In this project the action DSL is a fenced block:
```json action
{"tool":"NAME","parameters":{"key":"value"}}
## What the Proxy Must Do
The proxy is not a passive transport anymore. Once tool tool calling is enabled, it should:
- inject tool definitions into the prompt
- preserve tool history across turns
- project historical tool calls back into action text
- wrap tool results into a continuation prompt
- detect refusal patterns such as “I don't have tools”
- retry with a stronger instruction when a tool call was expected but missing
- map parsed actions back into downstream protocol fields
## Multi-turn Tool Calling
Single-turn tool calling is not enough. A useful agent loop looks like this:
1. model emits a tool call
2. external executor runs the tool
3. tool result is fed back into the conversation
4. model decides whether to call another tool or finish
To make this stable:
- do not feed tool results back as raw text only
- wrap them in a continuation message that clearly asks for the next action
- keep tool calling active even when later turns do not repeat the original `tools` field
That last point matters. Many clients send `tools` only on the first turn. The proxy should still keep the conversation in tool calling mode when it sees tool history.
## Few-shot Guidance
The minimum few-shot should teach the model the output shape.
A better few-shot also teaches state transitions:
- when to call a tool
- when to wait for the tool result
- when to call another tool
- when to answer normally
For complex agent loops, a multi-step example with:
- user request
- assistant tool call
- tool result
- assistant next action
is usually more effective than a single static action example.
## Retry Guidance
Retry is useful when:
- a tool call was expected but no action block was produced
- the model says tools are unavailable
- the request forces tool usage
A retry prompt should be explicit and procedural, for example:
```text
Your last response did not include any ```json action``` block.
You must respond with at least one valid action block now.
Do not explain. Output the action block directly.
Retries should be bounded. A small retry budget plus stronger instructions per retry is usually enough.
Protocol Mapping
OpenAI side:
- input may contain
tools,tool_choice,assistant.tool_calls, andtool - output should map back into
message.tool_callsandfinish_reason = "tool_calls"
Anthropic side:
- input may contain
tools,tool_choice,tool_use, andtool_result - output should map back into
content[].tool_useandstop_reason = "tool_use"
Common Failure Modes
- only supporting the first tool turn
- losing tool calling state on later turns
- not projecting historical tool calls back into text
- feeding back raw tool results without continuation instructions
- missing refusal detection
- using a parser that is too brittle for real model output
In This Repository
The implementation here follows exactly this pattern:
- downstream tool schemas are rewritten into prompt instructions
- the model emits
json actionblocks - the proxy parses them
- the proxy re-encodes them as OpenAI or Anthropic tool protocol fields
- later turns can continue from tool history even when
toolsare not repeated
Implementation checklist:
- [tool-tool calling-checklist.md](./tool-tool calling-checklist.md)