fix: harden responses streaming and tool-call fallback
Ensure /v1/responses streams always terminate with response.completed and normalize Lingma tool_code fallbacks into structured tool calls, including single-argument forms. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
@@ -48,6 +48,8 @@ DEFAULT_ASK_MODE=chat
|
|||||||
|
|
||||||
# 请求侧 tools/tool_choice 透传到 Lingma(默认关闭,开启后可支持工具写文件等场景)
|
# 请求侧 tools/tool_choice 透传到 Lingma(默认关闭,开启后可支持工具写文件等场景)
|
||||||
TOOL_FORWARD_ENABLED=false
|
TOOL_FORWARD_ENABLED=false
|
||||||
|
# 可选:允许透传的工具名白名单,逗号分隔;为空表示不额外限制
|
||||||
|
TOOL_ALLOWLIST=
|
||||||
|
|
||||||
# 专属域(可选)
|
# 专属域(可选)
|
||||||
DEDICATED_DOMAIN_URL=
|
DEDICATED_DOMAIN_URL=
|
||||||
|
|||||||
177
CLAUDE.md
177
CLAUDE.md
@@ -93,3 +93,180 @@ Both protocols share the same backend pool, backpressure guard, stats, and sessi
|
|||||||
- Compose mounts:
|
- Compose mounts:
|
||||||
- `./data -> /app/data` (persistent Lingma binary/cache/workdirs)
|
- `./data -> /app/data` (persistent Lingma binary/cache/workdirs)
|
||||||
- `./secrets -> /secrets:ro` (session bundles, secrets)
|
- `./secrets -> /secrets:ro` (session bundles, secrets)
|
||||||
|
|
||||||
|
|
||||||
|
# CLAUDE.md
|
||||||
|
|
||||||
|
Behavioral guidelines to reduce common LLM coding mistakes. Merge with project-specific instructions as needed.
|
||||||
|
|
||||||
|
**Tradeoff:** These guidelines bias toward caution over speed. For trivial tasks, use judgment.
|
||||||
|
|
||||||
|
## 1. Think Before Coding
|
||||||
|
|
||||||
|
**Don't assume. Don't hide confusion. Surface tradeoffs.**
|
||||||
|
|
||||||
|
Before implementing:
|
||||||
|
- State your assumptions explicitly. If uncertain, ask.
|
||||||
|
- If multiple interpretations exist, present them - don't pick silently.
|
||||||
|
- If a simpler approach exists, say so. Push back when warranted.
|
||||||
|
- If something is unclear, stop. Name what's confusing. Ask.
|
||||||
|
|
||||||
|
## 2. Simplicity First
|
||||||
|
|
||||||
|
**Minimum code that solves the problem. Nothing speculative.**
|
||||||
|
|
||||||
|
- No features beyond what was asked.
|
||||||
|
- No abstractions for single-use code.
|
||||||
|
- No "flexibility" or "configurability" that wasn't requested.
|
||||||
|
- No error handling for impossible scenarios.
|
||||||
|
- If you write 200 lines and it could be 50, rewrite it.
|
||||||
|
|
||||||
|
Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify.
|
||||||
|
|
||||||
|
## 3. Surgical Changes
|
||||||
|
|
||||||
|
**Touch only what you must. Clean up only your own mess.**
|
||||||
|
|
||||||
|
When editing existing code:
|
||||||
|
- Don't "improve" adjacent code, comments, or formatting.
|
||||||
|
- Don't refactor things that aren't broken.
|
||||||
|
- Match existing style, even if you'd do it differently.
|
||||||
|
- If you notice unrelated dead code, mention it - don't delete it.
|
||||||
|
|
||||||
|
When your changes create orphans:
|
||||||
|
- Remove imports/variables/functions that YOUR changes made unused.
|
||||||
|
- Don't remove pre-existing dead code unless asked.
|
||||||
|
|
||||||
|
The test: Every changed line should trace directly to the user's request.
|
||||||
|
|
||||||
|
## 4. Goal-Driven Execution
|
||||||
|
|
||||||
|
**Define success criteria. Loop until verified.**
|
||||||
|
|
||||||
|
Transform tasks into verifiable goals:
|
||||||
|
- "Add validation" → "Write tests for invalid inputs, then make them pass"
|
||||||
|
- "Fix the bug" → "Write a test that reproduces it, then make it pass"
|
||||||
|
- "Refactor X" → "Ensure tests pass before and after"
|
||||||
|
|
||||||
|
For multi-step tasks, state a brief plan:
|
||||||
|
```
|
||||||
|
1. [Step] → verify: [check]
|
||||||
|
2. [Step] → verify: [check]
|
||||||
|
3. [Step] → verify: [check]
|
||||||
|
```
|
||||||
|
|
||||||
|
Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
**These guidelines are working if:** fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes.
|
||||||
|
|
||||||
|
# CLAUDE.md
|
||||||
|
|
||||||
|
Behavioral guidelines to reduce common LLM coding mistakes. Merge with project-specific instructions as needed.
|
||||||
|
|
||||||
|
**Tradeoff:** These guidelines bias toward caution over speed. For trivial tasks, use judgment.
|
||||||
|
|
||||||
|
## 1. Think Before Coding
|
||||||
|
|
||||||
|
**Don't assume. Don't hide confusion. Surface tradeoffs.**
|
||||||
|
|
||||||
|
Before implementing:
|
||||||
|
- State your assumptions explicitly. If uncertain, ask.
|
||||||
|
- If multiple interpretations exist, present them - don't pick silently.
|
||||||
|
- If a simpler approach exists, say so. Push back when warranted.
|
||||||
|
- If something is unclear, stop. Name what's confusing. Ask.
|
||||||
|
|
||||||
|
## 2. Simplicity First
|
||||||
|
|
||||||
|
**Minimum code that solves the problem. Nothing speculative.**
|
||||||
|
|
||||||
|
- No features beyond what was asked.
|
||||||
|
- No abstractions for single-use code.
|
||||||
|
- No "flexibility" or "configurability" that wasn't requested.
|
||||||
|
- No error handling for impossible scenarios.
|
||||||
|
- If you write 200 lines and it could be 50, rewrite it.
|
||||||
|
|
||||||
|
Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify.
|
||||||
|
|
||||||
|
## 3. Surgical Changes
|
||||||
|
|
||||||
|
**Touch only what you must. Clean up only your own mess.**
|
||||||
|
|
||||||
|
When editing existing code:
|
||||||
|
- Don't "improve" adjacent code, comments, or formatting.
|
||||||
|
- Don't refactor things that aren't broken.
|
||||||
|
- Match existing style, even if you'd do it differently.
|
||||||
|
- If you notice unrelated dead code, mention it - don't delete it.
|
||||||
|
|
||||||
|
When your changes create orphans:
|
||||||
|
- Remove imports/variables/functions that YOUR changes made unused.
|
||||||
|
- Don't remove pre-existing dead code unless asked.
|
||||||
|
|
||||||
|
The test: Every changed line should trace directly to the user's request.
|
||||||
|
|
||||||
|
## 4. Goal-Driven Execution
|
||||||
|
|
||||||
|
**Define success criteria. Loop until verified.**
|
||||||
|
|
||||||
|
Transform tasks into verifiable goals:
|
||||||
|
- "Add validation" → "Write tests for invalid inputs, then make them pass"
|
||||||
|
- "Fix the bug" → "Write a test that reproduces it, then make it pass"
|
||||||
|
- "Refactor X" → "Ensure tests pass before and after"
|
||||||
|
|
||||||
|
For multi-step tasks, state a brief plan:
|
||||||
|
```
|
||||||
|
1. [Step] → verify: [check]
|
||||||
|
2. [Step] → verify: [check]
|
||||||
|
3. [Step] → verify: [check]
|
||||||
|
```
|
||||||
|
|
||||||
|
Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
**These guidelines are working if:** fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes.
|
||||||
|
|
||||||
|
<!-- gitnexus:start -->
|
||||||
|
# GitNexus — Code Intelligence
|
||||||
|
|
||||||
|
This project is indexed by GitNexus as **lingma-openai-gateway** (1093 symbols, 2685 relationships, 97 execution flows). Use the GitNexus MCP tools to understand code, assess impact, and navigate safely.
|
||||||
|
|
||||||
|
> If any GitNexus tool warns the index is stale, run `npx gitnexus analyze` in terminal first.
|
||||||
|
|
||||||
|
## Always Do
|
||||||
|
|
||||||
|
- **MUST run impact analysis before editing any symbol.** Before modifying a function, class, or method, run `gitnexus_impact({target: "symbolName", direction: "upstream"})` and report the blast radius (direct callers, affected processes, risk level) to the user.
|
||||||
|
- **MUST run `gitnexus_detect_changes()` before committing** to verify your changes only affect expected symbols and execution flows.
|
||||||
|
- **MUST warn the user** if impact analysis returns HIGH or CRITICAL risk before proceeding with edits.
|
||||||
|
- When exploring unfamiliar code, use `gitnexus_query({query: "concept"})` to find execution flows instead of grepping. It returns process-grouped results ranked by relevance.
|
||||||
|
- When you need full context on a specific symbol — callers, callees, which execution flows it participates in — use `gitnexus_context({name: "symbolName"})`.
|
||||||
|
|
||||||
|
## Never Do
|
||||||
|
|
||||||
|
- NEVER edit a function, class, or method without first running `gitnexus_impact` on it.
|
||||||
|
- NEVER ignore HIGH or CRITICAL risk warnings from impact analysis.
|
||||||
|
- NEVER rename symbols with find-and-replace — use `gitnexus_rename` which understands the call graph.
|
||||||
|
- NEVER commit changes without running `gitnexus_detect_changes()` to check affected scope.
|
||||||
|
|
||||||
|
## Resources
|
||||||
|
|
||||||
|
| Resource | Use for |
|
||||||
|
|----------|---------|
|
||||||
|
| `gitnexus://repo/lingma-openai-gateway/context` | Codebase overview, check index freshness |
|
||||||
|
| `gitnexus://repo/lingma-openai-gateway/clusters` | All functional areas |
|
||||||
|
| `gitnexus://repo/lingma-openai-gateway/processes` | All execution flows |
|
||||||
|
| `gitnexus://repo/lingma-openai-gateway/process/{name}` | Step-by-step execution trace |
|
||||||
|
|
||||||
|
## CLI
|
||||||
|
|
||||||
|
| Task | Read this skill file |
|
||||||
|
|------|---------------------|
|
||||||
|
| Understand architecture / "How does X work?" | `.claude/skills/gitnexus/gitnexus-exploring/SKILL.md` |
|
||||||
|
| Blast radius / "What breaks if I change X?" | `.claude/skills/gitnexus/gitnexus-impact-analysis/SKILL.md` |
|
||||||
|
| Trace bugs / "Why is X failing?" | `.claude/skills/gitnexus/gitnexus-debugging/SKILL.md` |
|
||||||
|
| Rename / extract / split / refactor | `.claude/skills/gitnexus/gitnexus-refactoring/SKILL.md` |
|
||||||
|
| Tools, resources, schema reference | `.claude/skills/gitnexus/gitnexus-guide/SKILL.md` |
|
||||||
|
| Index, status, clean, wiki CLI commands | `.claude/skills/gitnexus/gitnexus-cli/SKILL.md` |
|
||||||
|
|
||||||
|
<!-- gitnexus:end -->
|
||||||
|
|||||||
@@ -5,6 +5,11 @@ import os
|
|||||||
from dataclasses import dataclass, field
|
from dataclasses import dataclass, field
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
def _csv_env(raw: str) -> list[str]:
|
||||||
|
return [item.strip() for item in (raw or "").replace("\n", ",").split(",") if item.strip()]
|
||||||
|
|
||||||
|
|
||||||
@dataclass
|
@dataclass
|
||||||
class LingmaAccount:
|
class LingmaAccount:
|
||||||
username: str
|
username: str
|
||||||
@@ -45,6 +50,7 @@ class Settings:
|
|||||||
session_cache_max_entries: int = 256
|
session_cache_max_entries: int = 256
|
||||||
session_cache_ttl_sec: float = 1800.0
|
session_cache_ttl_sec: float = 1800.0
|
||||||
tool_forward_enabled: bool = False
|
tool_forward_enabled: bool = False
|
||||||
|
tool_allowlist: list[str] = field(default_factory=list)
|
||||||
|
|
||||||
|
|
||||||
def _bool_env(name: str, default: bool) -> bool:
|
def _bool_env(name: str, default: bool) -> bool:
|
||||||
@@ -177,4 +183,5 @@ def load_settings() -> Settings:
|
|||||||
session_cache_max_entries=int(os.getenv("SESSION_CACHE_MAX_ENTRIES", "256")),
|
session_cache_max_entries=int(os.getenv("SESSION_CACHE_MAX_ENTRIES", "256")),
|
||||||
session_cache_ttl_sec=float(os.getenv("SESSION_CACHE_TTL_SEC", "1800")),
|
session_cache_ttl_sec=float(os.getenv("SESSION_CACHE_TTL_SEC", "1800")),
|
||||||
tool_forward_enabled=_bool_env("TOOL_FORWARD_ENABLED", False),
|
tool_forward_enabled=_bool_env("TOOL_FORWARD_ENABLED", False),
|
||||||
|
tool_allowlist=_csv_env(os.getenv("TOOL_ALLOWLIST", "")),
|
||||||
)
|
)
|
||||||
|
|||||||
371
app/main.py
371
app/main.py
@@ -1,5 +1,6 @@
|
|||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import ast
|
||||||
import asyncio
|
import asyncio
|
||||||
import hashlib
|
import hashlib
|
||||||
import json
|
import json
|
||||||
@@ -358,6 +359,68 @@ def _include_usage(stream_options: dict | None) -> bool:
|
|||||||
return bool(stream_options.get("include_usage"))
|
return bool(stream_options.get("include_usage"))
|
||||||
|
|
||||||
|
|
||||||
|
def _tool_allowlist() -> set[str]:
|
||||||
|
return {name.strip() for name in settings.tool_allowlist if isinstance(name, str) and name.strip()}
|
||||||
|
|
||||||
|
|
||||||
|
def _openai_tool_name(tool: Any) -> str | None:
|
||||||
|
if not isinstance(tool, dict):
|
||||||
|
return None
|
||||||
|
if tool.get("type") == "function":
|
||||||
|
fn = tool.get("function")
|
||||||
|
if isinstance(fn, dict):
|
||||||
|
name = fn.get("name")
|
||||||
|
if isinstance(name, str) and name.strip():
|
||||||
|
return name.strip()
|
||||||
|
name = tool.get("name")
|
||||||
|
if isinstance(name, str) and name.strip():
|
||||||
|
return name.strip()
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def _anthropic_tool_name(tool: Any) -> str | None:
|
||||||
|
if not isinstance(tool, dict):
|
||||||
|
return None
|
||||||
|
name = tool.get("name")
|
||||||
|
if isinstance(name, str) and name.strip():
|
||||||
|
return name.strip()
|
||||||
|
fn = tool.get("function")
|
||||||
|
if isinstance(fn, dict):
|
||||||
|
nested_name = fn.get("name")
|
||||||
|
if isinstance(nested_name, str) and nested_name.strip():
|
||||||
|
return nested_name.strip()
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def _filter_allowed_tools(tools: list[dict[str, Any]], *, provider: str) -> list[dict[str, Any]]:
|
||||||
|
allowlist = _tool_allowlist()
|
||||||
|
if not allowlist:
|
||||||
|
return tools
|
||||||
|
name_fn = _openai_tool_name if provider == "openai" else _anthropic_tool_name
|
||||||
|
return [tool for tool in tools if (name := name_fn(tool)) and name in allowlist]
|
||||||
|
|
||||||
|
|
||||||
|
def _ensure_tool_choice_allowed(tool_choice: Any, *, provider: str) -> None:
|
||||||
|
allowlist = _tool_allowlist()
|
||||||
|
if not allowlist:
|
||||||
|
return
|
||||||
|
forced_name = (
|
||||||
|
_openai_forced_tool_name(tool_choice)
|
||||||
|
if provider == "openai"
|
||||||
|
else _anthropic_forced_tool_name(tool_choice)
|
||||||
|
)
|
||||||
|
if forced_name and forced_name not in allowlist:
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=400,
|
||||||
|
detail={
|
||||||
|
"error": {
|
||||||
|
"type": "invalid_request_error",
|
||||||
|
"message": f"tool '{forced_name}' is not allowed",
|
||||||
|
}
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
def _openai_tool_config(req: ChatCompletionsRequest) -> dict[str, Any] | None:
|
def _openai_tool_config(req: ChatCompletionsRequest) -> dict[str, Any] | None:
|
||||||
if not settings.tool_forward_enabled:
|
if not settings.tool_forward_enabled:
|
||||||
return None
|
return None
|
||||||
@@ -365,9 +428,11 @@ def _openai_tool_config(req: ChatCompletionsRequest) -> dict[str, Any] | None:
|
|||||||
has_choice = req.tool_choice is not None
|
has_choice = req.tool_choice is not None
|
||||||
if not has_tools and not has_choice:
|
if not has_tools and not has_choice:
|
||||||
return None
|
return None
|
||||||
|
_ensure_tool_choice_allowed(req.tool_choice, provider="openai")
|
||||||
|
tools = _filter_allowed_tools(req.tools or [], provider="openai")
|
||||||
return {
|
return {
|
||||||
"provider": "openai",
|
"provider": "openai",
|
||||||
"tools": req.tools or [],
|
"tools": tools,
|
||||||
"tool_choice": req.tool_choice,
|
"tool_choice": req.tool_choice,
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -379,9 +444,11 @@ def _anthropic_tool_config(req: AnthropicMessagesRequest) -> dict[str, Any] | No
|
|||||||
has_choice = req.tool_choice is not None
|
has_choice = req.tool_choice is not None
|
||||||
if not has_tools and not has_choice:
|
if not has_tools and not has_choice:
|
||||||
return None
|
return None
|
||||||
|
_ensure_tool_choice_allowed(req.tool_choice, provider="anthropic")
|
||||||
|
tools = _filter_allowed_tools(req.tools or [], provider="anthropic")
|
||||||
return {
|
return {
|
||||||
"provider": "anthropic",
|
"provider": "anthropic",
|
||||||
"tools": req.tools or [],
|
"tools": tools,
|
||||||
"tool_choice": req.tool_choice,
|
"tool_choice": req.tool_choice,
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -537,8 +604,85 @@ def _json_object_from_text(text: str) -> dict[str, Any] | None:
|
|||||||
return parsed if isinstance(parsed, dict) else None
|
return parsed if isinstance(parsed, dict) else None
|
||||||
|
|
||||||
|
|
||||||
def _forced_tool_event_from_text(text: str, forced_tool_name: str) -> dict[str, Any] | None:
|
def _tool_code_single_arg_name(tools: list[dict[str, Any]] | None, forced_tool_name: str) -> str | None:
|
||||||
|
if not isinstance(tools, list):
|
||||||
|
return None
|
||||||
|
for tool in tools:
|
||||||
|
if not isinstance(tool, dict):
|
||||||
|
continue
|
||||||
|
schema: dict[str, Any] | None = None
|
||||||
|
if tool.get("type") == "function":
|
||||||
|
fn = tool.get("function")
|
||||||
|
if isinstance(fn, dict) and fn.get("name") == forced_tool_name:
|
||||||
|
params = fn.get("parameters")
|
||||||
|
if isinstance(params, dict):
|
||||||
|
schema = params
|
||||||
|
elif tool.get("name") == forced_tool_name:
|
||||||
|
input_schema = tool.get("input_schema")
|
||||||
|
if isinstance(input_schema, dict):
|
||||||
|
schema = input_schema
|
||||||
|
if not isinstance(schema, dict):
|
||||||
|
continue
|
||||||
|
properties = schema.get("properties")
|
||||||
|
if not isinstance(properties, dict) or len(properties) != 1:
|
||||||
|
return None
|
||||||
|
only_name = next(iter(properties.keys()), None)
|
||||||
|
if isinstance(only_name, str) and only_name.strip():
|
||||||
|
return only_name
|
||||||
|
return None
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def _tool_code_object_from_text(
|
||||||
|
text: str,
|
||||||
|
forced_tool_name: str,
|
||||||
|
*,
|
||||||
|
single_arg_name: str | None = None,
|
||||||
|
) -> dict[str, Any] | None:
|
||||||
|
raw = text.strip()
|
||||||
|
if not raw.startswith("```tool_code") or not raw.endswith("```"):
|
||||||
|
return None
|
||||||
|
lines = raw.splitlines()
|
||||||
|
if len(lines) < 2:
|
||||||
|
return None
|
||||||
|
body = "\n".join(lines[1:-1]).strip()
|
||||||
|
try:
|
||||||
|
parsed = ast.parse(body, mode="eval")
|
||||||
|
except Exception:
|
||||||
|
return None
|
||||||
|
call = parsed.body
|
||||||
|
if not isinstance(call, ast.Call):
|
||||||
|
return None
|
||||||
|
if not isinstance(call.func, ast.Name) or call.func.id != forced_tool_name:
|
||||||
|
return None
|
||||||
|
arguments: dict[str, Any] = {}
|
||||||
|
if call.args:
|
||||||
|
if len(call.args) != 1 or call.keywords or not single_arg_name:
|
||||||
|
return None
|
||||||
|
try:
|
||||||
|
arguments[single_arg_name] = ast.literal_eval(call.args[0])
|
||||||
|
except Exception:
|
||||||
|
return None
|
||||||
|
return {"arguments": arguments}
|
||||||
|
for kw in call.keywords:
|
||||||
|
if kw.arg is None:
|
||||||
|
return None
|
||||||
|
try:
|
||||||
|
arguments[kw.arg] = ast.literal_eval(kw.value)
|
||||||
|
except Exception:
|
||||||
|
return None
|
||||||
|
return {"arguments": arguments}
|
||||||
|
|
||||||
|
|
||||||
|
def _forced_tool_event_from_text(
|
||||||
|
text: str,
|
||||||
|
forced_tool_name: str,
|
||||||
|
*,
|
||||||
|
single_arg_name: str | None = None,
|
||||||
|
) -> dict[str, Any] | None:
|
||||||
parsed = _json_object_from_text(text)
|
parsed = _json_object_from_text(text)
|
||||||
|
if parsed is None:
|
||||||
|
parsed = _tool_code_object_from_text(text, forced_tool_name, single_arg_name=single_arg_name)
|
||||||
if parsed is None:
|
if parsed is None:
|
||||||
return None
|
return None
|
||||||
|
|
||||||
@@ -756,11 +900,14 @@ async def v1_chat_completions(req: ChatCompletionsRequest, request: Request):
|
|||||||
completion_id = f"chatcmpl-{uuid.uuid4().hex}"
|
completion_id = f"chatcmpl-{uuid.uuid4().hex}"
|
||||||
completion_tokens_holder = {"n": 0}
|
completion_tokens_holder = {"n": 0}
|
||||||
stream_meta: dict = {}
|
stream_meta: dict = {}
|
||||||
|
forced_tool_name = _openai_forced_tool_name(req.tool_choice)
|
||||||
|
forced_tool_single_arg_name = _tool_code_single_arg_name(req.tools, forced_tool_name) if forced_tool_name else None
|
||||||
|
|
||||||
async def event_stream(_ticket=ticket, _inst=inst, _meta=stream_meta):
|
async def event_stream(_ticket=ticket, _inst=inst, _meta=stream_meta):
|
||||||
success = False
|
success = False
|
||||||
tool_call_indexes: dict[str, int] = {}
|
tool_call_indexes: dict[str, int] = {}
|
||||||
saw_tool_call = False
|
saw_tool_call = False
|
||||||
|
buffered_text_parts: list[str] = []
|
||||||
try:
|
try:
|
||||||
async for chunk in _inst.client.chat_stream(
|
async for chunk in _inst.client.chat_stream(
|
||||||
prompt,
|
prompt,
|
||||||
@@ -809,6 +956,7 @@ async def v1_chat_completions(req: ChatCompletionsRequest, request: Request):
|
|||||||
text = _stream_text(chunk)
|
text = _stream_text(chunk)
|
||||||
if not text:
|
if not text:
|
||||||
continue
|
continue
|
||||||
|
buffered_text_parts.append(text)
|
||||||
completion_tokens_holder["n"] += estimate_tokens(text)
|
completion_tokens_holder["n"] += estimate_tokens(text)
|
||||||
payload = {
|
payload = {
|
||||||
"id": completion_id,
|
"id": completion_id,
|
||||||
@@ -825,6 +973,39 @@ async def v1_chat_completions(req: ChatCompletionsRequest, request: Request):
|
|||||||
}
|
}
|
||||||
yield f"data: {json.dumps(payload, ensure_ascii=False)}\n\n"
|
yield f"data: {json.dumps(payload, ensure_ascii=False)}\n\n"
|
||||||
|
|
||||||
|
if not saw_tool_call and forced_tool_name:
|
||||||
|
fallback_event = _forced_tool_event_from_text(
|
||||||
|
"".join(buffered_text_parts),
|
||||||
|
forced_tool_name,
|
||||||
|
single_arg_name=forced_tool_single_arg_name,
|
||||||
|
)
|
||||||
|
if fallback_event is not None:
|
||||||
|
saw_tool_call = True
|
||||||
|
tool_id = "call_fallback_0"
|
||||||
|
idx = 0
|
||||||
|
tool_call_indexes[tool_id] = idx
|
||||||
|
payload = {
|
||||||
|
"id": completion_id,
|
||||||
|
"object": "chat.completion.chunk",
|
||||||
|
"created": created,
|
||||||
|
"model": model,
|
||||||
|
"choices": [
|
||||||
|
{
|
||||||
|
"index": 0,
|
||||||
|
"delta": {
|
||||||
|
"tool_calls": [
|
||||||
|
{
|
||||||
|
"index": idx,
|
||||||
|
**_openai_tool_call(fallback_event, forced_id=tool_id),
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"finish_reason": None,
|
||||||
|
}
|
||||||
|
],
|
||||||
|
}
|
||||||
|
yield f"data: {json.dumps(payload, ensure_ascii=False)}\n\n"
|
||||||
|
|
||||||
done_payload = {
|
done_payload = {
|
||||||
"id": completion_id,
|
"id": completion_id,
|
||||||
"object": "chat.completion.chunk",
|
"object": "chat.completion.chunk",
|
||||||
@@ -945,7 +1126,11 @@ async def v1_chat_completions(req: ChatCompletionsRequest, request: Request):
|
|||||||
if not saw_tool_call:
|
if not saw_tool_call:
|
||||||
forced_tool_name = _openai_forced_tool_name(req.tool_choice)
|
forced_tool_name = _openai_forced_tool_name(req.tool_choice)
|
||||||
if forced_tool_name:
|
if forced_tool_name:
|
||||||
fallback_event = _forced_tool_event_from_text(message_content, forced_tool_name)
|
fallback_event = _forced_tool_event_from_text(
|
||||||
|
message_content,
|
||||||
|
forced_tool_name,
|
||||||
|
single_arg_name=_tool_code_single_arg_name(req.tools, forced_tool_name),
|
||||||
|
)
|
||||||
if fallback_event is not None:
|
if fallback_event is not None:
|
||||||
tool_calls.append(_openai_tool_call(fallback_event, forced_id="call_fallback_0"))
|
tool_calls.append(_openai_tool_call(fallback_event, forced_id="call_fallback_0"))
|
||||||
saw_tool_call = True
|
saw_tool_call = True
|
||||||
@@ -1169,6 +1354,42 @@ async def _responses_stream_from_chat_stream(
|
|||||||
created_at = int(time.time())
|
created_at = int(time.time())
|
||||||
usage: dict[str, int] = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
|
usage: dict[str, int] = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
|
||||||
completed_sent = False
|
completed_sent = False
|
||||||
|
output_item_id = f"msg_{uuid.uuid4().hex}"
|
||||||
|
output_index = 0
|
||||||
|
content_index = 0
|
||||||
|
output_text_parts: list[str] = []
|
||||||
|
function_call_items: list[dict[str, Any]] = []
|
||||||
|
function_call_index_by_id: dict[str, int] = {}
|
||||||
|
function_call_arguments_by_id: dict[str, str] = {}
|
||||||
|
function_call_name_by_id: dict[str, str] = {}
|
||||||
|
function_call_id_by_upstream_index: dict[int, str] = {}
|
||||||
|
|
||||||
|
def _message_item(status: str) -> dict[str, Any]:
|
||||||
|
return {
|
||||||
|
"id": output_item_id,
|
||||||
|
"type": "message",
|
||||||
|
"role": "assistant",
|
||||||
|
"status": status,
|
||||||
|
"content": [
|
||||||
|
{
|
||||||
|
"type": "output_text",
|
||||||
|
"text": "".join(output_text_parts),
|
||||||
|
}
|
||||||
|
],
|
||||||
|
}
|
||||||
|
|
||||||
|
def _function_call_item(call_id: str, *, status: str, name: str, arguments: str) -> dict[str, Any]:
|
||||||
|
return {
|
||||||
|
"id": call_id,
|
||||||
|
"type": "function_call",
|
||||||
|
"call_id": call_id,
|
||||||
|
"name": name,
|
||||||
|
"arguments": arguments,
|
||||||
|
"status": status,
|
||||||
|
}
|
||||||
|
|
||||||
|
def _completed_output_items() -> list[dict[str, Any]]:
|
||||||
|
return [_message_item("completed"), *function_call_items]
|
||||||
|
|
||||||
def _completed_frame() -> str:
|
def _completed_frame() -> str:
|
||||||
return _sse_data(
|
return _sse_data(
|
||||||
@@ -1180,11 +1401,94 @@ async def _responses_stream_from_chat_stream(
|
|||||||
"created_at": created_at,
|
"created_at": created_at,
|
||||||
"status": "completed",
|
"status": "completed",
|
||||||
"model": model,
|
"model": model,
|
||||||
|
"output": _completed_output_items(),
|
||||||
"usage": usage,
|
"usage": usage,
|
||||||
},
|
},
|
||||||
}
|
}
|
||||||
)
|
)
|
||||||
|
|
||||||
|
def _finish_output_item_frames() -> list[str]:
|
||||||
|
frames = [
|
||||||
|
_sse_data(
|
||||||
|
{
|
||||||
|
"type": "response.output_text.done",
|
||||||
|
"response_id": response_id,
|
||||||
|
"item_id": output_item_id,
|
||||||
|
"output_index": output_index,
|
||||||
|
"content_index": content_index,
|
||||||
|
"text": "".join(output_text_parts),
|
||||||
|
}
|
||||||
|
),
|
||||||
|
_sse_data(
|
||||||
|
{
|
||||||
|
"type": "response.output_item.done",
|
||||||
|
"response_id": response_id,
|
||||||
|
"output_index": output_index,
|
||||||
|
"item": _message_item("completed"),
|
||||||
|
}
|
||||||
|
),
|
||||||
|
]
|
||||||
|
for idx, item in enumerate(function_call_items, start=1):
|
||||||
|
frames.append(
|
||||||
|
_sse_data(
|
||||||
|
{
|
||||||
|
"type": "response.function_call_arguments.done",
|
||||||
|
"response_id": response_id,
|
||||||
|
"item_id": item["id"],
|
||||||
|
"output_index": idx,
|
||||||
|
"arguments": item["arguments"],
|
||||||
|
}
|
||||||
|
)
|
||||||
|
)
|
||||||
|
frames.append(
|
||||||
|
_sse_data(
|
||||||
|
{
|
||||||
|
"type": "response.output_item.done",
|
||||||
|
"response_id": response_id,
|
||||||
|
"output_index": idx,
|
||||||
|
"item": item,
|
||||||
|
}
|
||||||
|
)
|
||||||
|
)
|
||||||
|
return frames
|
||||||
|
|
||||||
|
def _ensure_function_call_item(call_id: str) -> list[str]:
|
||||||
|
existing_index = function_call_index_by_id.get(call_id)
|
||||||
|
name = function_call_name_by_id.get(call_id, "tool")
|
||||||
|
arguments = function_call_arguments_by_id.get(call_id, "")
|
||||||
|
if existing_index is not None:
|
||||||
|
function_call_items[existing_index] = _function_call_item(
|
||||||
|
call_id,
|
||||||
|
status="completed",
|
||||||
|
name=name,
|
||||||
|
arguments=arguments,
|
||||||
|
)
|
||||||
|
return []
|
||||||
|
item = _function_call_item(
|
||||||
|
call_id,
|
||||||
|
status="completed",
|
||||||
|
name=name,
|
||||||
|
arguments=arguments,
|
||||||
|
)
|
||||||
|
function_call_items.append(item)
|
||||||
|
item_index = len(function_call_items) - 1
|
||||||
|
function_call_index_by_id[call_id] = item_index
|
||||||
|
return [
|
||||||
|
_sse_data(
|
||||||
|
{
|
||||||
|
"type": "response.output_item.added",
|
||||||
|
"response_id": response_id,
|
||||||
|
"output_index": item_index + 1,
|
||||||
|
"item": _function_call_item(
|
||||||
|
call_id,
|
||||||
|
status="in_progress",
|
||||||
|
name=name,
|
||||||
|
arguments="",
|
||||||
|
),
|
||||||
|
}
|
||||||
|
)
|
||||||
|
]
|
||||||
|
|
||||||
yield _sse_data(
|
yield _sse_data(
|
||||||
{
|
{
|
||||||
"type": "response.created",
|
"type": "response.created",
|
||||||
@@ -1194,9 +1498,18 @@ async def _responses_stream_from_chat_stream(
|
|||||||
"created_at": created_at,
|
"created_at": created_at,
|
||||||
"status": "in_progress",
|
"status": "in_progress",
|
||||||
"model": model,
|
"model": model,
|
||||||
|
"output": [],
|
||||||
},
|
},
|
||||||
}
|
}
|
||||||
)
|
)
|
||||||
|
yield _sse_data(
|
||||||
|
{
|
||||||
|
"type": "response.output_item.added",
|
||||||
|
"response_id": response_id,
|
||||||
|
"output_index": output_index,
|
||||||
|
"item": _message_item("in_progress"),
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
async for part in chat_stream.body_iterator:
|
async for part in chat_stream.body_iterator:
|
||||||
@@ -1207,6 +1520,8 @@ async def _responses_stream_from_chat_stream(
|
|||||||
continue
|
continue
|
||||||
body = frame[len("data:") :].strip()
|
body = frame[len("data:") :].strip()
|
||||||
if body == "[DONE]":
|
if body == "[DONE]":
|
||||||
|
for event in _finish_output_item_frames():
|
||||||
|
yield event
|
||||||
yield _completed_frame()
|
yield _completed_frame()
|
||||||
yield "data: [DONE]\n\n"
|
yield "data: [DONE]\n\n"
|
||||||
completed_sent = True
|
completed_sent = True
|
||||||
@@ -1229,10 +1544,14 @@ async def _responses_stream_from_chat_stream(
|
|||||||
|
|
||||||
text = delta.get("content")
|
text = delta.get("content")
|
||||||
if isinstance(text, str) and text:
|
if isinstance(text, str) and text:
|
||||||
|
output_text_parts.append(text)
|
||||||
yield _sse_data(
|
yield _sse_data(
|
||||||
{
|
{
|
||||||
"type": "response.output_text.delta",
|
"type": "response.output_text.delta",
|
||||||
"response_id": response_id,
|
"response_id": response_id,
|
||||||
|
"item_id": output_item_id,
|
||||||
|
"output_index": output_index,
|
||||||
|
"content_index": content_index,
|
||||||
"delta": text,
|
"delta": text,
|
||||||
}
|
}
|
||||||
)
|
)
|
||||||
@@ -1243,35 +1562,59 @@ async def _responses_stream_from_chat_stream(
|
|||||||
if not isinstance(tool_call, dict):
|
if not isinstance(tool_call, dict):
|
||||||
continue
|
continue
|
||||||
fn = tool_call.get("function") if isinstance(tool_call.get("function"), dict) else {}
|
fn = tool_call.get("function") if isinstance(tool_call.get("function"), dict) else {}
|
||||||
call_id = str(tool_call.get("id") or f"call_{idx}")
|
upstream_index_raw = tool_call.get("index")
|
||||||
|
upstream_index = upstream_index_raw if isinstance(upstream_index_raw, int) else idx
|
||||||
|
call_id = str(
|
||||||
|
tool_call.get("id")
|
||||||
|
or function_call_id_by_upstream_index.get(upstream_index)
|
||||||
|
or f"call_{upstream_index}"
|
||||||
|
)
|
||||||
|
function_call_id_by_upstream_index[upstream_index] = call_id
|
||||||
|
name = str(fn.get("name") or function_call_name_by_id.get(call_id) or "tool")
|
||||||
|
function_call_name_by_id[call_id] = name
|
||||||
|
arguments_delta = str(fn.get("arguments") or "")
|
||||||
|
accumulated_arguments = (
|
||||||
|
function_call_arguments_by_id.get(call_id, "") + arguments_delta
|
||||||
|
)
|
||||||
|
function_call_arguments_by_id[call_id] = accumulated_arguments
|
||||||
|
for event in _ensure_function_call_item(call_id):
|
||||||
|
yield event
|
||||||
|
if arguments_delta:
|
||||||
yield _sse_data(
|
yield _sse_data(
|
||||||
{
|
{
|
||||||
"type": "response.function_call.delta",
|
"type": "response.function_call_arguments.delta",
|
||||||
"response_id": response_id,
|
"response_id": response_id,
|
||||||
"item_id": call_id,
|
"item_id": call_id,
|
||||||
"name": str(fn.get("name") or "tool"),
|
"output_index": function_call_index_by_id[call_id] + 1,
|
||||||
"arguments": str(fn.get("arguments") or "{}"),
|
"delta": arguments_delta,
|
||||||
}
|
}
|
||||||
)
|
)
|
||||||
except asyncio.CancelledError:
|
except asyncio.CancelledError:
|
||||||
if not completed_sent:
|
if not completed_sent:
|
||||||
|
for event in _finish_output_item_frames():
|
||||||
|
yield event
|
||||||
yield _completed_frame()
|
yield _completed_frame()
|
||||||
yield "data: [DONE]\n\n"
|
yield "data: [DONE]\n\n"
|
||||||
completed_sent = True
|
completed_sent = True
|
||||||
return
|
return
|
||||||
except Exception:
|
except Exception:
|
||||||
if not completed_sent:
|
if not completed_sent:
|
||||||
|
for event in _finish_output_item_frames():
|
||||||
|
yield event
|
||||||
yield _completed_frame()
|
yield _completed_frame()
|
||||||
yield "data: [DONE]\n\n"
|
yield "data: [DONE]\n\n"
|
||||||
completed_sent = True
|
completed_sent = True
|
||||||
return
|
return
|
||||||
|
|
||||||
if not completed_sent:
|
if not completed_sent:
|
||||||
|
for event in _finish_output_item_frames():
|
||||||
|
yield event
|
||||||
yield _completed_frame()
|
yield _completed_frame()
|
||||||
yield "data: [DONE]\n\n"
|
yield "data: [DONE]\n\n"
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@app.post("/responses", dependencies=[Depends(auth_guard)])
|
||||||
@app.post("/v1/responses", dependencies=[Depends(auth_guard)])
|
@app.post("/v1/responses", dependencies=[Depends(auth_guard)])
|
||||||
async def v1_responses(req: ResponsesRequest, request: Request):
|
async def v1_responses(req: ResponsesRequest, request: Request):
|
||||||
chat_req = _responses_to_chat_request(req)
|
chat_req = _responses_to_chat_request(req)
|
||||||
@@ -1388,7 +1731,13 @@ async def v1_messages(req: AnthropicMessagesRequest, request: Request):
|
|||||||
)
|
)
|
||||||
|
|
||||||
# ------------------------------------------------------------- session reuse
|
# ------------------------------------------------------------- session reuse
|
||||||
|
try:
|
||||||
tool_config = _anthropic_tool_config(req)
|
tool_config = _anthropic_tool_config(req)
|
||||||
|
except HTTPException as exc:
|
||||||
|
detail = exc.detail if isinstance(exc.detail, dict) else {}
|
||||||
|
error = detail.get("error") if isinstance(detail.get("error"), dict) else {}
|
||||||
|
message = error.get("message") or str(detail) or "invalid tool configuration"
|
||||||
|
return _anthropic_error(exc.status_code, "invalid_request_error", message)
|
||||||
has_tooling_context = _anthropic_has_tooling_context(req)
|
has_tooling_context = _anthropic_has_tooling_context(req)
|
||||||
|
|
||||||
ask_mode = _resolve_ask_mode(req.model, has_tooling_context)
|
ask_mode = _resolve_ask_mode(req.model, has_tooling_context)
|
||||||
@@ -1749,7 +2098,11 @@ async def v1_messages(req: AnthropicMessagesRequest, request: Request):
|
|||||||
if not saw_tool_event:
|
if not saw_tool_event:
|
||||||
forced_tool_name = _anthropic_forced_tool_name(req.tool_choice)
|
forced_tool_name = _anthropic_forced_tool_name(req.tool_choice)
|
||||||
if forced_tool_name:
|
if forced_tool_name:
|
||||||
fallback_event = _forced_tool_event_from_text(text, forced_tool_name)
|
fallback_event = _forced_tool_event_from_text(
|
||||||
|
text,
|
||||||
|
forced_tool_name,
|
||||||
|
single_arg_name=_tool_code_single_arg_name(req.tools, forced_tool_name),
|
||||||
|
)
|
||||||
if fallback_event is not None:
|
if fallback_event is not None:
|
||||||
content_blocks = []
|
content_blocks = []
|
||||||
tool_id = "toolu_fallback_0"
|
tool_id = "toolu_fallback_0"
|
||||||
|
|||||||
@@ -187,6 +187,17 @@ class ConfigParsingTests(unittest.TestCase):
|
|||||||
settings_without_accounts = load_settings()
|
settings_without_accounts = load_settings()
|
||||||
|
|
||||||
self.assertEqual(settings_without_accounts.instance_count, 1)
|
self.assertEqual(settings_without_accounts.instance_count, 1)
|
||||||
|
def test_load_settings_parses_tool_allowlist_csv(self) -> None:
|
||||||
|
with patch.dict(os.environ, {"TOOL_ALLOWLIST": " lookup , write_file ,,search_docs "}, clear=True):
|
||||||
|
settings = load_settings()
|
||||||
|
|
||||||
|
self.assertEqual(settings.tool_allowlist, ["lookup", "write_file", "search_docs"])
|
||||||
|
|
||||||
|
def test_load_settings_empty_tool_allowlist(self) -> None:
|
||||||
|
with patch.dict(os.environ, {"TOOL_ALLOWLIST": " , , "}, clear=True):
|
||||||
|
settings = load_settings()
|
||||||
|
|
||||||
|
self.assertEqual(settings.tool_allowlist, [])
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
|
|||||||
@@ -263,6 +263,120 @@ class ToolCallBridgeTests(unittest.IsolatedAsyncioTestCase):
|
|||||||
{"query": "gateway"},
|
{"query": "gateway"},
|
||||||
)
|
)
|
||||||
|
|
||||||
|
async def test_openai_non_stream_fallbacks_to_tool_code_structured_tool_call_for_forced_tool(self) -> None:
|
||||||
|
fake_client = _FakeClient(
|
||||||
|
stream_events=[],
|
||||||
|
complete_result={
|
||||||
|
"text": "```tool_code\nlookup(query=\"gateway\")\n```",
|
||||||
|
"toolEvents": [],
|
||||||
|
"sessionId": "sess-fallback-tool-code-openai",
|
||||||
|
},
|
||||||
|
)
|
||||||
|
req = ChatCompletionsRequest(
|
||||||
|
model="org_auto",
|
||||||
|
messages=[{"role": "user", "content": "hi"}],
|
||||||
|
stream=False,
|
||||||
|
tools=[{"type": "function", "function": {"name": "lookup", "parameters": {}}}],
|
||||||
|
tool_choice={"type": "function", "function": {"name": "lookup"}},
|
||||||
|
)
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch.object(main, "pool", _FakePool(_FakeInstance(fake_client))),
|
||||||
|
patch.object(main, "chat_guard", _FakeGuard()),
|
||||||
|
patch.object(main, "_ensure_instance_logged_in", AsyncMock(return_value={"id": "u"})),
|
||||||
|
patch.object(main.stats_collector, "record_chat", AsyncMock(return_value=None)),
|
||||||
|
):
|
||||||
|
response = await main.v1_chat_completions(req, _make_request("/v1/chat/completions"))
|
||||||
|
|
||||||
|
payload = json.loads(response.body)
|
||||||
|
message = payload["choices"][0]["message"]
|
||||||
|
self.assertEqual(payload["choices"][0]["finish_reason"], "tool_calls")
|
||||||
|
self.assertEqual(message["content"], "")
|
||||||
|
self.assertEqual(message["tool_calls"][0]["function"]["name"], "lookup")
|
||||||
|
self.assertEqual(
|
||||||
|
json.loads(message["tool_calls"][0]["function"]["arguments"]),
|
||||||
|
{"query": "gateway"},
|
||||||
|
)
|
||||||
|
|
||||||
|
async def test_openai_non_stream_fallbacks_to_tool_code_structured_tool_call_for_forced_tool_with_positional_arg(self) -> None:
|
||||||
|
fake_client = _FakeClient(
|
||||||
|
stream_events=[],
|
||||||
|
complete_result={
|
||||||
|
"text": "```tool_code\nlookup(\"gateway\")\n```",
|
||||||
|
"toolEvents": [],
|
||||||
|
"sessionId": "sess-fallback-tool-code-openai-positional",
|
||||||
|
},
|
||||||
|
)
|
||||||
|
req = ChatCompletionsRequest(
|
||||||
|
model="org_auto",
|
||||||
|
messages=[{"role": "user", "content": "hi"}],
|
||||||
|
stream=False,
|
||||||
|
tools=[
|
||||||
|
{
|
||||||
|
"type": "function",
|
||||||
|
"function": {
|
||||||
|
"name": "lookup",
|
||||||
|
"parameters": {
|
||||||
|
"type": "object",
|
||||||
|
"properties": {"query": {"type": "string"}},
|
||||||
|
"required": ["query"],
|
||||||
|
},
|
||||||
|
},
|
||||||
|
}
|
||||||
|
],
|
||||||
|
tool_choice={"type": "function", "function": {"name": "lookup"}},
|
||||||
|
)
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch.object(main, "pool", _FakePool(_FakeInstance(fake_client))),
|
||||||
|
patch.object(main, "chat_guard", _FakeGuard()),
|
||||||
|
patch.object(main, "_ensure_instance_logged_in", AsyncMock(return_value={"id": "u"})),
|
||||||
|
patch.object(main.stats_collector, "record_chat", AsyncMock(return_value=None)),
|
||||||
|
):
|
||||||
|
response = await main.v1_chat_completions(req, _make_request("/v1/chat/completions"))
|
||||||
|
|
||||||
|
payload = json.loads(response.body)
|
||||||
|
message = payload["choices"][0]["message"]
|
||||||
|
self.assertEqual(payload["choices"][0]["finish_reason"], "tool_calls")
|
||||||
|
self.assertEqual(message["content"], "")
|
||||||
|
self.assertEqual(message["tool_calls"][0]["function"]["name"], "lookup")
|
||||||
|
self.assertEqual(
|
||||||
|
json.loads(message["tool_calls"][0]["function"]["arguments"]),
|
||||||
|
{"query": "gateway"},
|
||||||
|
)
|
||||||
|
|
||||||
|
async def test_openai_stream_fallbacks_to_tool_code_structured_tool_call_for_forced_tool(self) -> None:
|
||||||
|
fake_client = _FakeClient(
|
||||||
|
stream_events=[
|
||||||
|
{"type": "text", "text": "```tool_code\n"},
|
||||||
|
{"type": "text", "text": 'lookup(query=\"gateway\")\n'},
|
||||||
|
{"type": "text", "text": "```"},
|
||||||
|
],
|
||||||
|
complete_result={},
|
||||||
|
)
|
||||||
|
req = ChatCompletionsRequest(
|
||||||
|
model="org_auto",
|
||||||
|
messages=[{"role": "user", "content": "hi"}],
|
||||||
|
stream=True,
|
||||||
|
tools=[{"type": "function", "function": {"name": "lookup", "parameters": {}}}],
|
||||||
|
tool_choice={"type": "function", "function": {"name": "lookup"}},
|
||||||
|
)
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch.object(main, "pool", _FakePool(_FakeInstance(fake_client))),
|
||||||
|
patch.object(main, "chat_guard", _FakeGuard()),
|
||||||
|
patch.object(main, "_ensure_instance_logged_in", AsyncMock(return_value={"id": "u"})),
|
||||||
|
patch.object(main.stats_collector, "record_chat", AsyncMock(return_value=None)),
|
||||||
|
):
|
||||||
|
response = await main.v1_chat_completions(req, _make_request("/v1/chat/completions"))
|
||||||
|
body = await _collect_stream(response)
|
||||||
|
|
||||||
|
self.assertIn('"tool_calls"', body)
|
||||||
|
self.assertIn('"name": "lookup"', body)
|
||||||
|
self.assertIn('{"query": "gateway"}', body)
|
||||||
|
self.assertIn('"finish_reason": "tool_calls"', body)
|
||||||
|
self.assertIn('data: [DONE]', body)
|
||||||
|
|
||||||
async def test_openai_stream_bridges_tool_and_text_events(self) -> None:
|
async def test_openai_stream_bridges_tool_and_text_events(self) -> None:
|
||||||
fake_client = _FakeClient(
|
fake_client = _FakeClient(
|
||||||
stream_events=[
|
stream_events=[
|
||||||
@@ -300,6 +414,7 @@ class ToolCallBridgeTests(unittest.IsolatedAsyncioTestCase):
|
|||||||
self.assertIn('"usage"', body)
|
self.assertIn('"usage"', body)
|
||||||
self.assertIn("data: [DONE]", body)
|
self.assertIn("data: [DONE]", body)
|
||||||
|
|
||||||
|
|
||||||
async def test_anthropic_non_stream_bridges_tool_blocks(self) -> None:
|
async def test_anthropic_non_stream_bridges_tool_blocks(self) -> None:
|
||||||
fake_client = _FakeClient(
|
fake_client = _FakeClient(
|
||||||
stream_events=[],
|
stream_events=[],
|
||||||
@@ -605,6 +720,57 @@ class ToolCallBridgeTests(unittest.IsolatedAsyncioTestCase):
|
|||||||
self.assertEqual(spy_client.last_complete_args[2], "agent")
|
self.assertEqual(spy_client.last_complete_args[2], "agent")
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
async def test_openai_non_stream_filters_tools_by_allowlist(self) -> None:
|
||||||
|
spy_client = _SpyClient(stream_events=[], complete_result={"text": "ok", "toolEvents": []})
|
||||||
|
req = ChatCompletionsRequest(
|
||||||
|
model="org_auto",
|
||||||
|
messages=[{"role": "user", "content": "hi"}],
|
||||||
|
stream=False,
|
||||||
|
tools=[
|
||||||
|
{"type": "function", "function": {"name": "lookup", "parameters": {}}},
|
||||||
|
{"type": "function", "function": {"name": "write_file", "parameters": {}}},
|
||||||
|
],
|
||||||
|
tool_choice={"type": "function", "function": {"name": "lookup"}},
|
||||||
|
)
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch.object(main, "pool", _FakePool(_FakeInstance(spy_client))),
|
||||||
|
patch.object(main, "chat_guard", _FakeGuard()),
|
||||||
|
patch.object(main, "_ensure_instance_logged_in", AsyncMock(return_value={"id": "u"})),
|
||||||
|
patch.object(main.stats_collector, "record_chat", AsyncMock(return_value=None)),
|
||||||
|
_SettingsPatch(tool_forward_enabled=True, tool_allowlist=["lookup"]),
|
||||||
|
):
|
||||||
|
await main.v1_chat_completions(req, _make_request("/v1/chat/completions"))
|
||||||
|
|
||||||
|
cfg = spy_client.last_complete_kwargs["tool_config"]
|
||||||
|
self.assertEqual([tool["function"]["name"] for tool in cfg["tools"]], ["lookup"])
|
||||||
|
self.assertEqual(cfg["tool_choice"], req.tool_choice)
|
||||||
|
|
||||||
|
async def test_openai_non_stream_rejects_forced_tool_outside_allowlist(self) -> None:
|
||||||
|
spy_client = _SpyClient(stream_events=[], complete_result={"text": "ok", "toolEvents": []})
|
||||||
|
req = ChatCompletionsRequest(
|
||||||
|
model="org_auto",
|
||||||
|
messages=[{"role": "user", "content": "hi"}],
|
||||||
|
stream=False,
|
||||||
|
tools=[{"type": "function", "function": {"name": "lookup", "parameters": {}}}],
|
||||||
|
tool_choice={"type": "function", "function": {"name": "write_file"}},
|
||||||
|
)
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch.object(main, "pool", _FakePool(_FakeInstance(spy_client))),
|
||||||
|
patch.object(main, "chat_guard", _FakeGuard()),
|
||||||
|
patch.object(main, "_ensure_instance_logged_in", AsyncMock(return_value={"id": "u"})),
|
||||||
|
patch.object(main.stats_collector, "record_chat", AsyncMock(return_value=None)),
|
||||||
|
_SettingsPatch(tool_forward_enabled=True, tool_allowlist=["lookup"]),
|
||||||
|
):
|
||||||
|
with self.assertRaises(main.HTTPException) as cm:
|
||||||
|
await main.v1_chat_completions(req, _make_request("/v1/chat/completions"))
|
||||||
|
|
||||||
|
self.assertEqual(cm.exception.status_code, 400)
|
||||||
|
self.assertEqual(cm.exception.detail["error"]["type"], "invalid_request_error")
|
||||||
|
self.assertIn("write_file", cm.exception.detail["error"]["message"])
|
||||||
|
|
||||||
async def test_openai_tooling_context_disables_session_reuse_cache(self) -> None:
|
async def test_openai_tooling_context_disables_session_reuse_cache(self) -> None:
|
||||||
fake_cache = _FakeSessionCache()
|
fake_cache = _FakeSessionCache()
|
||||||
fake_client = _FakeClient(
|
fake_client = _FakeClient(
|
||||||
@@ -757,6 +923,74 @@ class ToolCallBridgeTests(unittest.IsolatedAsyncioTestCase):
|
|||||||
self.assertEqual(len(cfg["tools"]), 1)
|
self.assertEqual(len(cfg["tools"]), 1)
|
||||||
self.assertEqual(spy_client.last_complete_args[2], "agent")
|
self.assertEqual(spy_client.last_complete_args[2], "agent")
|
||||||
|
|
||||||
|
|
||||||
|
async def test_anthropic_non_stream_filters_tools_by_allowlist(self) -> None:
|
||||||
|
spy_client = _SpyClient(stream_events=[], complete_result={"text": "ok", "toolEvents": []})
|
||||||
|
req = AnthropicMessagesRequest(
|
||||||
|
model="claude-3-5-sonnet-20241022",
|
||||||
|
max_tokens=128,
|
||||||
|
messages=[{"role": "user", "content": "hi"}],
|
||||||
|
stream=False,
|
||||||
|
tools=[
|
||||||
|
{"name": "lookup", "input_schema": {"type": "object", "properties": {}}},
|
||||||
|
{"name": "write_file", "input_schema": {"type": "object", "properties": {}}},
|
||||||
|
],
|
||||||
|
tool_choice={"type": "tool", "name": "lookup"},
|
||||||
|
)
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch.object(main, "pool", _FakePool(_FakeInstance(spy_client))),
|
||||||
|
patch.object(main, "chat_guard", _FakeGuard()),
|
||||||
|
patch.object(main, "_ensure_instance_logged_in", AsyncMock(return_value={"id": "u"})),
|
||||||
|
patch.object(main.stats_collector, "record_chat", AsyncMock(return_value=None)),
|
||||||
|
patch.object(main.settings, "api_keys", ["test-key"]),
|
||||||
|
_SettingsPatch(tool_forward_enabled=True, tool_allowlist=["lookup"]),
|
||||||
|
):
|
||||||
|
await main.v1_messages(
|
||||||
|
req,
|
||||||
|
_make_request(
|
||||||
|
"/v1/messages",
|
||||||
|
headers={"x-api-key": "test-key", "anthropic-version": "2023-06-01"},
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
cfg = spy_client.last_complete_kwargs["tool_config"]
|
||||||
|
self.assertEqual([tool["name"] for tool in cfg["tools"]], ["lookup"])
|
||||||
|
self.assertEqual(cfg["tool_choice"], req.tool_choice)
|
||||||
|
|
||||||
|
async def test_anthropic_non_stream_rejects_forced_tool_outside_allowlist(self) -> None:
|
||||||
|
spy_client = _SpyClient(stream_events=[], complete_result={"text": "ok", "toolEvents": []})
|
||||||
|
req = AnthropicMessagesRequest(
|
||||||
|
model="claude-3-5-sonnet-20241022",
|
||||||
|
max_tokens=128,
|
||||||
|
messages=[{"role": "user", "content": "hi"}],
|
||||||
|
stream=False,
|
||||||
|
tools=[{"name": "lookup", "input_schema": {"type": "object", "properties": {}}}],
|
||||||
|
tool_choice={"type": "tool", "name": "write_file"},
|
||||||
|
)
|
||||||
|
|
||||||
|
with (
|
||||||
|
patch.object(main, "pool", _FakePool(_FakeInstance(spy_client))),
|
||||||
|
patch.object(main, "chat_guard", _FakeGuard()),
|
||||||
|
patch.object(main, "_ensure_instance_logged_in", AsyncMock(return_value={"id": "u"})),
|
||||||
|
patch.object(main.stats_collector, "record_chat", AsyncMock(return_value=None)),
|
||||||
|
patch.object(main.settings, "api_keys", ["test-key"]),
|
||||||
|
_SettingsPatch(tool_forward_enabled=True, tool_allowlist=["lookup"]),
|
||||||
|
):
|
||||||
|
response = await main.v1_messages(
|
||||||
|
req,
|
||||||
|
_make_request(
|
||||||
|
"/v1/messages",
|
||||||
|
headers={"x-api-key": "test-key", "anthropic-version": "2023-06-01"},
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
self.assertEqual(response.status_code, 400)
|
||||||
|
payload = json.loads(response.body)
|
||||||
|
self.assertEqual(payload["type"], "error")
|
||||||
|
self.assertEqual(payload["error"]["type"], "invalid_request_error")
|
||||||
|
self.assertIn("write_file", payload["error"]["message"])
|
||||||
|
|
||||||
async def test_anthropic_tooling_context_disables_session_reuse_cache(self) -> None:
|
async def test_anthropic_tooling_context_disables_session_reuse_cache(self) -> None:
|
||||||
fake_cache = _FakeSessionCache()
|
fake_cache = _FakeSessionCache()
|
||||||
fake_client = _FakeClient(
|
fake_client = _FakeClient(
|
||||||
@@ -833,6 +1067,54 @@ class ToolCallBridgeTests(unittest.IsolatedAsyncioTestCase):
|
|||||||
messages_dump = [m.model_dump() for m in chat_req.messages]
|
messages_dump = [m.model_dump() for m in chat_req.messages]
|
||||||
self.assertEqual(messages_dump, [{"role": "user", "content": "hello from responses", "name": None, "tool_call_id": None, "tool_calls": None}])
|
self.assertEqual(messages_dump, [{"role": "user", "content": "hello from responses", "name": None, "tool_call_id": None, "tool_calls": None}])
|
||||||
|
|
||||||
|
async def test_responses_non_stream_maps_chat_tool_calls_to_function_call_output(self) -> None:
|
||||||
|
req = ResponsesRequest(
|
||||||
|
model="org_auto",
|
||||||
|
input="tool please",
|
||||||
|
stream=False,
|
||||||
|
)
|
||||||
|
chat_payload = {
|
||||||
|
"id": "chatcmpl-tools1",
|
||||||
|
"created": 234,
|
||||||
|
"model": "org_auto",
|
||||||
|
"choices": [
|
||||||
|
{
|
||||||
|
"index": 0,
|
||||||
|
"finish_reason": "tool_calls",
|
||||||
|
"message": {
|
||||||
|
"role": "assistant",
|
||||||
|
"content": "",
|
||||||
|
"tool_calls": [
|
||||||
|
{
|
||||||
|
"id": "call_1",
|
||||||
|
"type": "function",
|
||||||
|
"function": {
|
||||||
|
"name": "lookup",
|
||||||
|
"arguments": "{\"q\":\"gateway\"}",
|
||||||
|
},
|
||||||
|
}
|
||||||
|
],
|
||||||
|
},
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"usage": {"prompt_tokens": 8, "completion_tokens": 3, "total_tokens": 11},
|
||||||
|
}
|
||||||
|
|
||||||
|
mock_chat = AsyncMock(return_value=JSONResponse(content=chat_payload))
|
||||||
|
with patch.object(main, "v1_chat_completions", mock_chat):
|
||||||
|
response = await main.v1_responses(req, _make_request("/v1/responses"))
|
||||||
|
|
||||||
|
payload = json.loads(response.body)
|
||||||
|
self.assertEqual(payload["status"], "completed")
|
||||||
|
self.assertEqual(payload["output_text"], "")
|
||||||
|
self.assertEqual(payload["usage"], {"input_tokens": 8, "output_tokens": 3, "total_tokens": 11})
|
||||||
|
self.assertEqual(len(payload["output"]), 1)
|
||||||
|
self.assertEqual(payload["output"][0]["type"], "function_call")
|
||||||
|
self.assertEqual(payload["output"][0]["call_id"], "call_1")
|
||||||
|
self.assertEqual(payload["output"][0]["id"], "call_1")
|
||||||
|
self.assertEqual(payload["output"][0]["name"], "lookup")
|
||||||
|
self.assertEqual(payload["output"][0]["arguments"], "{\"q\":\"gateway\"}")
|
||||||
|
|
||||||
async def test_responses_forwards_input_tools_and_tool_choice_to_chat_request(self) -> None:
|
async def test_responses_forwards_input_tools_and_tool_choice_to_chat_request(self) -> None:
|
||||||
req = ResponsesRequest(
|
req = ResponsesRequest(
|
||||||
model="org_auto",
|
model="org_auto",
|
||||||
@@ -883,17 +1165,70 @@ class ToolCallBridgeTests(unittest.IsolatedAsyncioTestCase):
|
|||||||
response = await main.v1_responses(req, _make_request("/v1/responses"))
|
response = await main.v1_responses(req, _make_request("/v1/responses"))
|
||||||
body = await _collect_stream(response)
|
body = await _collect_stream(response)
|
||||||
|
|
||||||
self.assertIn('"type": "response.created"', body)
|
self.assertIn('"type": "response.output_item.added"', body)
|
||||||
self.assertIn('"type": "response.output_text.delta"', body)
|
self.assertIn('"type": "response.output_text.delta"', body)
|
||||||
self.assertIn('"delta": "hello"', body)
|
self.assertIn('"delta": "hello"', body)
|
||||||
self.assertIn('"type": "response.function_call.delta"', body)
|
self.assertIn('"type": "response.function_call_arguments.delta"', body)
|
||||||
self.assertIn('"item_id": "call_1"', body)
|
self.assertIn('"item_id": "call_1"', body)
|
||||||
|
self.assertIn('"output_index": 1', body)
|
||||||
|
self.assertIn('"delta": "{\\"q\\": \\"x\\"}"', body)
|
||||||
|
self.assertIn('"type": "response.function_call_arguments.done"', body)
|
||||||
|
self.assertIn('"arguments": "{\\"q\\": \\"x\\"}"', body)
|
||||||
|
self.assertIn('"type": "response.output_item.done"', body)
|
||||||
|
self.assertIn('"type": "function_call"', body)
|
||||||
self.assertIn('"name": "lookup"', body)
|
self.assertIn('"name": "lookup"', body)
|
||||||
|
self.assertIn('"arguments": "{\\"q\\": \\"x\\"}"', body)
|
||||||
self.assertIn('"type": "response.completed"', body)
|
self.assertIn('"type": "response.completed"', body)
|
||||||
self.assertIn('"input_tokens": 3', body)
|
self.assertIn('"input_tokens": 3', body)
|
||||||
self.assertIn('"output_tokens": 2', body)
|
self.assertIn('"output_tokens": 2', body)
|
||||||
self.assertIn('data: [DONE]', body)
|
self.assertIn('data: [DONE]', body)
|
||||||
|
|
||||||
|
async def test_responses_stream_accumulates_fragmented_tool_arguments(self) -> None:
|
||||||
|
async def _chat_sse():
|
||||||
|
yield b'data: {"choices": [{"delta": {"tool_calls": [{"id": "call_1", "function": {"name": "lookup", "arguments": "{\\"q\\":"}}]}}]}\n\n'
|
||||||
|
yield b'data: {"choices": [{"delta": {"tool_calls": [{"id": "call_1", "function": {"name": "lookup", "arguments": " \\\"x\\\"}"}}]}}]}\n\n'
|
||||||
|
yield b"data: [DONE]\n\n"
|
||||||
|
|
||||||
|
req = ResponsesRequest(model="org_auto", input="hi", stream=True)
|
||||||
|
mock_chat = AsyncMock(
|
||||||
|
return_value=StreamingResponse(_chat_sse(), media_type="text/event-stream")
|
||||||
|
)
|
||||||
|
|
||||||
|
with patch.object(main, "v1_chat_completions", mock_chat):
|
||||||
|
response = await main.v1_responses(req, _make_request("/v1/responses"))
|
||||||
|
body = await _collect_stream(response)
|
||||||
|
|
||||||
|
self.assertIn('"type": "response.function_call_arguments.delta"', body)
|
||||||
|
self.assertIn('"delta": "{\\"q\\":"', body)
|
||||||
|
self.assertIn('"delta": " \\\"x\\\"}"', body)
|
||||||
|
self.assertIn('"type": "response.function_call_arguments.done"', body)
|
||||||
|
self.assertIn('"arguments": "{\\"q\\": \\\"x\\\"}"', body)
|
||||||
|
self.assertIn('"type": "response.output_item.done"', body)
|
||||||
|
self.assertIn('"arguments": "{\\"q\\": \\\"x\\\"}"', body)
|
||||||
|
self.assertIn('data: [DONE]', body)
|
||||||
|
|
||||||
|
async def test_responses_stream_accumulates_fragmented_tool_arguments_without_repeated_id_or_name(self) -> None:
|
||||||
|
async def _chat_sse():
|
||||||
|
yield b'data: {"choices": [{"delta": {"tool_calls": [{"index": 0, "id": "call_1", "function": {"name": "lookup", "arguments": "{\\"q\\":"}}]}}]}\n\n'
|
||||||
|
yield b'data: {"choices": [{"delta": {"tool_calls": [{"index": 0, "function": {"arguments": " \\\"x\\\"}"}}]}}]}\n\n'
|
||||||
|
yield b"data: [DONE]\n\n"
|
||||||
|
|
||||||
|
req = ResponsesRequest(model="org_auto", input="hi", stream=True)
|
||||||
|
mock_chat = AsyncMock(
|
||||||
|
return_value=StreamingResponse(_chat_sse(), media_type="text/event-stream")
|
||||||
|
)
|
||||||
|
|
||||||
|
with patch.object(main, "v1_chat_completions", mock_chat):
|
||||||
|
response = await main.v1_responses(req, _make_request("/v1/responses"))
|
||||||
|
body = await _collect_stream(response)
|
||||||
|
|
||||||
|
self.assertEqual(body.count('"item_id": "call_1"'), 3)
|
||||||
|
self.assertIn('"name": "lookup"', body)
|
||||||
|
self.assertIn('"delta": "{\\"q\\":"', body)
|
||||||
|
self.assertIn('"delta": " \\\"x\\\"}"', body)
|
||||||
|
self.assertIn('"arguments": "{\\"q\\": \\\"x\\\"}"', body)
|
||||||
|
self.assertIn('data: [DONE]', body)
|
||||||
|
|
||||||
async def test_responses_stream_emits_completed_when_upstream_closes_without_done(self) -> None:
|
async def test_responses_stream_emits_completed_when_upstream_closes_without_done(self) -> None:
|
||||||
async def _chat_sse_without_done():
|
async def _chat_sse_without_done():
|
||||||
yield b'data: {"choices": [{"delta": {"content": "partial"}}]}\n\n'
|
yield b'data: {"choices": [{"delta": {"content": "partial"}}]}\n\n'
|
||||||
@@ -954,7 +1289,31 @@ class ToolCallBridgeTests(unittest.IsolatedAsyncioTestCase):
|
|||||||
self.assertIn('data: [DONE]', body)
|
self.assertIn('data: [DONE]', body)
|
||||||
|
|
||||||
|
|
||||||
async def test_responses_non_stream_returns_502_on_invalid_upstream_json(self) -> None:
|
async def test_responses_alias_matches_v1_responses_behavior(self) -> None:
|
||||||
|
req = ResponsesRequest(model="org_auto", input="hello", stream=False)
|
||||||
|
chat_payload = {
|
||||||
|
"id": "chatcmpl-alias1",
|
||||||
|
"created": 123,
|
||||||
|
"model": "org_auto",
|
||||||
|
"choices": [
|
||||||
|
{
|
||||||
|
"index": 0,
|
||||||
|
"finish_reason": "stop",
|
||||||
|
"message": {"role": "assistant", "content": "done"},
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"usage": {"prompt_tokens": 4, "completion_tokens": 2, "total_tokens": 6},
|
||||||
|
}
|
||||||
|
|
||||||
|
mock_chat = AsyncMock(return_value=JSONResponse(content=chat_payload))
|
||||||
|
with patch.object(main, "v1_chat_completions", mock_chat):
|
||||||
|
response = await main.v1_responses(req, _make_request("/responses"))
|
||||||
|
|
||||||
|
payload = json.loads(response.body)
|
||||||
|
self.assertEqual(payload["id"], "resp_alias1")
|
||||||
|
self.assertEqual(payload["status"], "completed")
|
||||||
|
mock_chat.assert_awaited_once()
|
||||||
|
|
||||||
req = ResponsesRequest(model="org_auto", input="hi", stream=False)
|
req = ResponsesRequest(model="org_auto", input="hi", stream=False)
|
||||||
mock_chat = AsyncMock(return_value=Response(content="not-json", media_type="text/plain"))
|
mock_chat = AsyncMock(return_value=Response(content="not-json", media_type="text/plain"))
|
||||||
|
|
||||||
|
|||||||
Reference in New Issue
Block a user