🎯 场景:ReAct loop 单步——每一步出一个 reasoning_summary + tool_call(或 final_answer)的 JSON。替代旧式 "Thought: Action:" 字符串格式,让 agent 轨迹可解析、可记录、可重放。
Quick Use
Use when: You want an agent to emit one strict-JSON tool call per step in a ReAct loop, with a visible reasoning summary. Fill in: {{user_goal}} = what the agent is trying to do; {{tool_catalog}} = JSON list of available tools and their schemas; {{scratchpad}} = prior actions and observations (empty on the first turn). You'll get: One JSON object per step with a reasoning_summary plus either a tool_call or a final_answer. Output is JSON.
Purpose
Drive a ReAct-style agent loop where each step emits a single visible "reasoning_summary", then either a tool call (with strict JSON arguments matching the tool's schema) or a final answer. Replaces ad-hoc "Thought: Action: Observation:" string formats with structured JSON so the loop is parseable, logged, and replayable.
Prompt
You are an agent that pursues a user goal by calling tools. At each step,
emit a JSON object with EXACTLY these keys:
{
"reasoning_summary": "<<=2 sentences, why this step. Do NOT include hidden chain-of-thought; this is the user-visible rationale>",
"next_action": {
"type": "tool_call" | "final_answer",
"tool": "<tool name from catalog, only when type=tool_call>",
"arguments": { ... }, // must validate against the tool's parameters schema
"answer": "<final answer text, only when type=final_answer>"
}
}
Rules:
1. Emit exactly one JSON object. No prose outside the JSON.
2. If the goal is satisfied, use type=final_answer.
3. If a previous tool call failed, do NOT immediately retry the same call;
either change arguments or pick a different tool.
4. Never call a tool not present in the tool_catalog.
User goal:
{{user_goal}}
Tool catalog:
{{tool_catalog}}
Trajectory so far:
{{scratchpad}}Example
Input:
user_goal: "Find the population of Reykjavik and compute its share of Iceland's population."
tool_catalog: [
{"name": "search", "description": "Web search", "parameters": {"query": "string"}},
{"name": "calc", "description": "Arithmetic", "parameters": {"expression": "string"}}
]
scratchpad: ""Expected output:
{
"reasoning_summary": "Need population figures for Reykjavik and Iceland; start by searching for Reykjavik's population.",
"next_action": {
"type": "tool_call",
"tool": "search",
"arguments": {"query": "current population of Reykjavik Iceland"}
}
}Failure Modes
- Schema drift — model emits markdown around the JSON, or invents new keys like
thought. Mitigation: JSON-mode call if available; otherwise a strict parser that rejects extras and re-prompts. - Tool hallucination — calls a tool not in the catalog. Mitigation: reject the call deterministically; do not let the model "fix" it by calling the invented tool again.
- Reasoning leak —
reasoning_summarybecomes a paragraph of internal CoT. Enforce length cap at parse time; truncate or re-prompt. - Stuck loop — repeats the same failed call. Mitigation: hash recent (tool, arguments) tuples and inject "you already tried X with result Y" into the next scratchpad.
Tuning Notes
- 模型差异:开源 7B 模型在 strict JSON 上失败率高(>20%),建议加 1–2 个 few-shot 或换用支持 JSON-mode 的模型。
- 温度:
0.0–0.3。Agent 轨迹的可重现性比探索性更重要。 - scratchpad 建议截断到最近 N 步 + 永久保留 user_goal;过长 scratchpad 会 让弱模型迷失。
reasoning_summary这个字段名是有意为之——避免使用chain_of_thought/thought,减少模型把内部 CoT 全量暴露的倾向。
Changelog
0.1.0— Initial card.