feat: harden agent tooling compatibility

2026-04-30 02:39:38 +08:00
parent 70803e5c76
commit 8c2df92ce7
14 changed files with 991 additions and 52 deletions
--- a/README.zh-CN.md
+++ b/README.zh-CN.md
@@ -11,7 +11,7 @@

 ## 当前版本

-当前桌面端版本线：`v1.2.2`
+当前桌面端版本线：`v1.3.0`

 GitHub Actions 会在 Release 中产出：

@@ -75,7 +75,10 @@ GitHub Actions 会在 Release 中产出：
 | `/` | GET | 健康检查 |
 | `/health` | GET | 健康检查 |
 | `/v1/models` | GET | 获取 Lingma 可用模型列表 |
+| `/capabilities` / `/v1/capabilities` | GET | 能力探测，给第三方 Agent 识别协议、工具、图片能力 |
+| `/api/v1/models` / `/api/tags` / `/props` | GET | LM Studio / Ollama / llama.cpp / vLLM 风格探测兼容 |
 | `/v1/chat/completions` | POST | OpenAI Chat Completions 兼容接口 |
+| `/api/v1/chat/completions` | POST | OpenAI Chat Completions 别名 |
 | `/v1/messages` | POST | Anthropic Messages 兼容接口 |

 ## 我们自己增强的能力
@@ -84,7 +87,12 @@ GitHub Actions 会在 Release 中产出：

 - **Function Calling / Tools 兼容**：同时兼容 OpenAI `tools/tool_choice` 和 Anthropic `tools/tool_choice`。
 - **工具结果接力**：支持多轮 Agent 工具调用，把工具结果继续回灌给 Lingma 生成最终回答。
+- **工具稳定性增强**：代理层自动生成工具路由表，给 `read_file` / `search_files` / `terminal` / `web_search` 注入专门示例；当模型说“无法访问 / 请手动运行 / 请粘贴文件”时自动重试工具调用。
+- **工具别名映射**：兼容常见模型输出的 `Bash` -> `terminal`、`Read` -> `read_file`、`Grep` -> `search_files`、`Edit` -> `patch`。
 - **图片输入**：兼容 OpenAI `image_url` 和 Anthropic base64 image block。
+- **本地图片路径兼容**：OpenAI `image_url.url` 支持 data URL、HTTP URL、`file://`、绝对路径和 `~/` 路径。
+- **图片自动压缩**：大图会自动缩放并转 JPEG，避免 Lingma 被超大 base64 卡死。
+- **日志图片脱敏**：桌面端请求详情会把图片 base64 标记为图片载荷，不再把巨大字符串撑爆 UI。
 - **更完整的参数兼容**：接收 `temperature`、`top_p`、`stop`、`max_tokens`、`response_format`、`reasoning_effort` 等客户端常用字段。
 - **完整请求 / 响应观测**：桌面端可以查看完整请求体、响应体、状态码、耗时和错误日志，便于排查 Claude Code / Cline 里的 400、500 问题。
 - **跨平台桌面 App**：提供启动、停止、重启、模型探测、设置、日志、主题、窗口生命周期等完整桌面能力。
@@ -196,7 +204,7 @@ CLI 也可以手动指定：

 ```bash
 lingma-ipc-proxy --transport websocket --ws-url ws://127.0.0.1:36510 --port 8095
-lingma-ipc-proxy --transport pipe --pipe-name '\\.\pipe\lingma-ipc'
+lingma-ipc-proxy --transport pipe --pipe '\\.\pipe\lingma-ipc'
 ```

 ## 快速开始
@@ -251,7 +259,7 @@ export ANTHROPIC_API_KEY="any"
 然后在 Claude Code 中选择模型：

 ```text
-/model Qwen3-Coder
+/model MiniMax-M2.7
 ```

 ### Cline
@@ -260,7 +268,7 @@ export ANTHROPIC_API_KEY="any"

 - Base URL：`http://127.0.0.1:8095/v1`
 - API Key：`any`
- Model ID：`Qwen3-Coder`
+- Model ID：`MiniMax-M2.7`

 ### Continue

@@ -270,7 +278,7 @@ export ANTHROPIC_API_KEY="any"
    {
      "title": "Lingma Proxy",
      "provider": "openai",
-      "model": "Qwen3-Coder",
+      "model": "MiniMax-M2.7",
      "apiKey": "any",
      "apiBase": "http://127.0.0.1:8095/v1"
    }
@@ -287,14 +295,26 @@ export ANTHROPIC_API_KEY="any"
 | 模型 | 说明 |
 | --- | --- |
 | `Auto` | Lingma 自动路由模型，桌面端使用通用自动图标 |
-| `Qwen3-Coder` | 代码和工具调用优先推荐 |
+| `Qwen3-Coder` | 代码专项备选 |
 | `Qwen3-Max` | 通用能力较强 |
 | `Qwen3-Thinking` | 推理类模型 |
 | `Qwen3.6-Plus` | 通用模型 |
-| `Kimi-K2.6` | 长文本模型 |
-| `MiniMax-M2.7` | 通用模型 |
+| `Kimi-K2.6` | 多模态和长上下文模型 |
+| `MiniMax-M2.7` | 第三方 Agent 默认推荐 |

-需要工具调用时，优先使用 `Qwen3-Coder`。
+### 模型参数来源和推荐
+
+代理不会凭空写死 Lingma 没公开的模型参数。下面的上下文长度和能力只在有官方或模型卡来源时写入；没有权威来源的模型只标注“本地实测”。
+
+| 模型 | 推荐场景 | 参数 / 能力依据 |
+| --- | --- | --- |
+| `MiniMax-M2.7` | 默认推荐给 OpenClaw / Hermes / Claude Code / Cline 这类第三方 Agent | NVIDIA 的 [MiniMax M2.7 模型卡](https://developer.nvidia.com/blog/minimax-m2-7-advances-scalable-agentic-workflows-on-nvidia-platforms-for-complex-ai-applications/) 标注 200K input context、MoE 语言模型和 agentic 场景；本地代理压测 read/search/terminal/web/patch/vision 全部通过，平均延迟最低。 |
+| `Kimi-K2.6` | 多模态、长上下文、复杂 Agent 工作流 | Kimi [官方 API 文档](https://platform.kimi.ai/docs/guide/kimi-k2-6-quickstart) 标注原生 text/image/video、多步工具调用和 256K 上下文。 |
+| `Qwen3-Coder` | 代码专项和工具协议备选 | Qwen [官方博客](https://qwenlm.github.io/blog/qwen3-coder/) 标注 256K 原生上下文、可扩展到 1M，以及 agentic coding / function calling 协议。 |
+| `Qwen3.6-Plus` | 通用 / 视觉备选 | Lingma 暴露且本地实测可用，但本仓库没有找到 Lingma 专属的官方上下文长度来源。 |
+| `Qwen3-Max` | 快速通用 / 视觉备选 | 简单工具和视觉测试表现好，但强制 read/patch 场景在本代理里不如 MiniMax / Kimi 稳。 |
+
+当客户端请求没有携带 `model` 字段时，代理默认使用：`MiniMax-M2.7`。

 ## 配置文件

@@ -360,7 +380,14 @@ Lingma 插件本身没有公开标准 OpenAI / Anthropic Tools 协议，所以
 5. 重新编码成 OpenAI `tool_calls` 或 Anthropic `tool_use`。
 6. 将工具执行结果回灌给 Lingma，继续生成最终回答。

-该方案依赖模型配合，目前 `Qwen3-Coder` 最稳定。
+当前版本对工具调用做了这些增强：
+
+- 根据客户端传入的工具名自动生成“工具路由表”。
+- 对 `read_file`、`search_files`、`terminal`、`web_search` 注入专门示例。
+- 当模型回答“无法访问文件 / 无法联网 / 请手动运行 / 请粘贴内容”时，代理会自动追加强制工具调用提示并重试一次。
+- 自动归一化常见工具名别名：`Bash`、`Shell`、`Read`、`Grep`、`Edit`、`Fetch` 等。
+
+本地压测结果：`MiniMax-M2.7`、`Kimi-K2.6`、`Qwen3.6-Plus`、`Qwen3-Coder` 均通过 read/search/terminal/web/patch/vision 烟测；其中 `MiniMax-M2.7` 平均延迟最低，所以作为默认推荐。

 ## 请求和日志观测

@@ -415,8 +442,8 @@ Lingma IPC Proxy
 发布方式：

 ```bash
-git tag v1.2.2
-git push origin v1.2.2
+git tag v1.3.0
+git push origin v1.3.0
 ```

 也可以在 GitHub Actions 页面手动运行 `Release` workflow，并输入 tag。