Release v1.4.9 remote image routing
This commit is contained in:
@@ -2,6 +2,15 @@
|
||||
|
||||
## Unreleased
|
||||
|
||||
## v1.4.9 - 2026-05-07
|
||||
|
||||
- Added Remote-mode image routing: image requests now use the proven Lingma IPC image pipeline instead of sending local/data URLs directly to the remote chat endpoint.
|
||||
- Added mixed image + tool handling: the proxy extracts image context through IPC, then returns to Remote API native tool calling so clients still receive proper `tool_calls` / `tool_use`.
|
||||
- Fixed multi-turn image follow-ups by reusing the most recent user image from request history when the latest user turn says things like "continue based on the previous image".
|
||||
- Improved Remote API tool compatibility by forwarding structured messages, tool definitions, tool choice, and native remote tool-call deltas instead of prompt-emulating tools in Remote mode.
|
||||
- Added regression tests for remote structured tools, image routing, image-context injection, and previous-turn image reuse.
|
||||
- Verified the production desktop app launch path from `/Applications/Lingma Proxy.app`, including pure image, multi-turn image, and image + forced tool-call requests.
|
||||
|
||||
## v1.4.8 - 2026-05-06
|
||||
|
||||
- Fixed Remote API base URL auto-detection so Lingma OSS/static asset hosts are rejected and cannot be used as API endpoints.
|
||||
|
||||
@@ -13,7 +13,7 @@ The proxy now supports two backend modes:
|
||||
|
||||
## Current Version
|
||||
|
||||
The current desktop line is `v1.4.8`.
|
||||
The current desktop line is `v1.4.9`.
|
||||
|
||||
See [CHANGELOG.md](./CHANGELOG.md) for release history.
|
||||
|
||||
@@ -90,6 +90,7 @@ Compared with the original protocol proof of concept, this repository focuses on
|
||||
- **Anthropic streaming tool-call hardening** so streaming clients such as Claude Code receive final `tool_use` events instead of premature refusal text when tools are present.
|
||||
- **Image input** for OpenAI `image_url` and Anthropic image blocks.
|
||||
- **Local and remote image normalization** for data URLs, HTTP URLs, `file://` URLs, and absolute local paths, with automatic JPEG downscaling for large images.
|
||||
- **Remote-mode image fallback** so image requests use the proven Lingma IPC image pipeline; image + tool requests extract image context through IPC and then return to Remote API native tool calling.
|
||||
- **Request log image redaction** so large base64 payloads are visible as image markers instead of breaking the desktop log view.
|
||||
- **More request parameter compatibility** so stricter clients can connect without custom patches.
|
||||
- **Full request and response recording** in the desktop app for debugging 400/500 errors.
|
||||
@@ -130,9 +131,12 @@ flowchart LR
|
||||
Service --> Session["Session Manager"]
|
||||
Service --> Tools["Tool Emulation"]
|
||||
Service --> Models["Model Discovery"]
|
||||
Service --> Images["Image Router"]
|
||||
Service --> Backend{"Backend Mode"}
|
||||
Backend --> Transport["IPC Plugin Transport"]
|
||||
Backend --> Remote["Remote API Client"]
|
||||
Images -->|"image requests"| Transport
|
||||
Images -->|"image + tools: extract context"| Remote
|
||||
Transport --> Pipe["Windows Named Pipe"]
|
||||
Transport --> WS["macOS / Windows WebSocket"]
|
||||
Pipe --> Lingma["Tongyi Lingma IDE Plugin"]
|
||||
@@ -221,6 +225,7 @@ Notes:
|
||||
- If your Lingma plugin uses a dedicated domain, remote mode first uses `--remote-base-url`, `LINGMA_REMOTE_BASE_URL`, or the JSON config field. If those are empty, it scans Lingma's local logs on macOS, Windows, and Linux for endpoint hints such as `endpoint config:` and marketplace service URLs.
|
||||
- The desktop Settings page shows the resolved remote domain and detection source without exposing tokens.
|
||||
- `/v1/models` in remote mode returns remote API model keys, which may not match the IPC plugin display IDs such as `MiniMax-M2.7` or `Kimi-K2.6`.
|
||||
- Image requests in remote mode are routed through the IPC image pipeline because the direct remote chat endpoint ignores local `file://` and data URL image payloads. If a request also contains tools, Lingma Proxy first extracts image context through IPC and then sends the tool-capable turn through Remote API native tool calling.
|
||||
- Local validation passed `/health`, `/v1/models`, OpenAI streaming/non-streaming chat, and Claude Code Anthropic + Bash tool use. Claude Code full tool runs are much slower than simple OpenAI requests because the client sends a large context and performs a second tool-result turn.
|
||||
- This mode is inspired by the remote API and credential-signing research in [ZipperCode/lingma2api](https://github.com/ZipperCode/lingma2api), integrated here as a switchable backend under the existing OpenAI / Anthropic / desktop app architecture.
|
||||
|
||||
|
||||
@@ -16,7 +16,7 @@
|
||||
|
||||
## 当前版本
|
||||
|
||||
当前桌面端版本线:`v1.4.8`
|
||||
当前桌面端版本线:`v1.4.9`
|
||||
|
||||
版本更新记录见 [CHANGELOG.md](./CHANGELOG.md)。
|
||||
|
||||
@@ -53,6 +53,7 @@ GitHub Actions 会在 Release 中产出:
|
||||
| Function Calling / Tools | 支持,使用工具调用模拟实现 |
|
||||
| 多轮 Agent 工具循环 | 支持 |
|
||||
| 图片输入 | 支持 base64、data URL、HTTP URL |
|
||||
| 远端模式图片兜底 | 有图请求使用 IPC 图片链路;图片 + 工具请求先提取图片上下文,再回到 Remote API 原生工具调用 |
|
||||
| 请求 / 响应完整日志 | 桌面端支持完整查看和复制 |
|
||||
| 后端模式切换 | 支持 IPC 插件模式 / 远端 API 模式 |
|
||||
| macOS WebSocket 自动探测 | 支持 |
|
||||
@@ -178,9 +179,12 @@ flowchart LR
|
||||
Service --> Tooling["工具调用模拟"]
|
||||
Service --> Model["模型探测"]
|
||||
Service --> Recorder["请求 / 日志记录"]
|
||||
Service --> Images["图片路由"]
|
||||
Service --> Backend{"后端模式"}
|
||||
Backend --> Transport["IPC 插件传输层"]
|
||||
Backend --> Remote["远端 API 客户端"]
|
||||
Images -->|"有图请求"| Transport
|
||||
Images -->|"图片 + 工具:提取图片上下文"| Remote
|
||||
Transport --> Pipe["Windows Named Pipe"]
|
||||
Transport --> WS["WebSocket"]
|
||||
Pipe --> Lingma["通义灵码 IDE 插件"]
|
||||
@@ -287,6 +291,7 @@ lingma-proxy \
|
||||
- 如果 Lingma 插件配置过专属域名,远端模式会优先使用 `--remote-base-url`、`LINGMA_REMOTE_BASE_URL` 或配置文件;这些为空时,会扫描 macOS、Windows、Linux 上 Lingma 本地日志里的 `endpoint config:`、Marketplace service URL 等线索。
|
||||
- 桌面端设置页会展示当前解析到的远端域名和来源,但不会展示 token / key 明文。
|
||||
- 远端模式的 `/v1/models` 返回的是远端接口模型 key,不一定等同于 IPC 插件模式里看到的 `MiniMax-M2.7`、`Kimi-K2.6` 等展示名。
|
||||
- 远端模式下的图片请求会自动走 IPC 图片链路,因为直连远端聊天接口不会直接消费本地 `file://` 和 data URL 图片。若请求同时带工具,代理会先通过 IPC 提取图片上下文,再把不含图片但包含上下文的请求交给 Remote API 原生工具调用。
|
||||
- 当前本机实测:`/health`、`/v1/models`、OpenAI 流式 / 非流式、Claude Code Anthropic + Bash 工具调用均可用;Claude Code 完整工具链耗时明显高于简单 OpenAI 请求。
|
||||
- 该模式参考了 [ZipperCode/lingma2api](https://github.com/ZipperCode/lingma2api) 对 Lingma 远端接口、签名和登录态结构的探索,本仓库将其作为可切换后端集成到现有 OpenAI / Anthropic / 桌面 App 架构中。
|
||||
|
||||
|
||||
@@ -252,7 +252,7 @@ onUnmounted(() => {
|
||||
<span class="status-dot" :class="{ running: status.running }"></span>
|
||||
<div>
|
||||
<strong>{{ status.running ? 'Proxy Running' : 'Proxy Stopped' }}</strong>
|
||||
<small>v1.4.8</small>
|
||||
<small>v1.4.9</small>
|
||||
</div>
|
||||
</div>
|
||||
</aside>
|
||||
|
||||
@@ -11,6 +11,6 @@
|
||||
"email": "lutc5@asiainfo.com"
|
||||
},
|
||||
"info": {
|
||||
"productVersion": "1.4.8"
|
||||
"productVersion": "1.4.9"
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1208,7 +1208,7 @@ func (s *Server) handleOpenAIStream(w http.ResponseWriter, r *http.Request, req
|
||||
}
|
||||
|
||||
func shouldAggregateToolStream(req service.ChatRequest) bool {
|
||||
return len(req.Tools) > 0 && truthyEnv("LINGMA_AGGREGATE_TOOL_STREAM")
|
||||
return len(req.Tools) > 0
|
||||
}
|
||||
|
||||
type toolStreamFilter struct {
|
||||
@@ -1450,20 +1450,18 @@ func normalizeAnthropicRequest(req anthropicRequest) (service.ChatRequest, error
|
||||
case "user":
|
||||
text, toolResults := extractAnthropicUserContent(message.Content)
|
||||
images := extractAnthropicImages(message.Content)
|
||||
for _, tr := range toolResults {
|
||||
prompt := toolemulation.ActionOutputPrompt(tr.ToolUseID, tr.Content)
|
||||
if prompt != "" {
|
||||
messages = append(messages, service.ChatMessage{Role: "user", Text: prompt})
|
||||
}
|
||||
}
|
||||
if text != "" || len(images) > 0 {
|
||||
messages = append(messages, service.ChatMessage{Role: role, Text: text, Images: images})
|
||||
}
|
||||
for _, tr := range toolResults {
|
||||
if strings.TrimSpace(tr.Content) != "" {
|
||||
messages = append(messages, service.ChatMessage{Role: "tool", Text: tr.Content, ToolCallID: tr.ToolUseID})
|
||||
}
|
||||
}
|
||||
case "assistant":
|
||||
text, calls := extractAnthropicAssistantContent(message.Content)
|
||||
projected := toolemulation.AssistantToolCallsToText(text, calls)
|
||||
if projected != "" {
|
||||
messages = append(messages, service.ChatMessage{Role: role, Text: projected})
|
||||
if text != "" || len(calls) > 0 {
|
||||
messages = append(messages, service.ChatMessage{Role: role, Text: text, ToolCalls: calls})
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -1510,19 +1508,15 @@ func normalizeOpenAIRequest(req openAIChatRequest) (service.ChatRequest, error)
|
||||
case "assistant":
|
||||
text := strings.TrimSpace(extractText(message.Content))
|
||||
calls := extractOpenAIToolCalls(message.ToolCalls)
|
||||
projected := toolemulation.AssistantToolCallsToText(text, calls)
|
||||
if projected != "" {
|
||||
messages = append(messages, service.ChatMessage{Role: role, Text: projected})
|
||||
if text != "" || len(calls) > 0 {
|
||||
messages = append(messages, service.ChatMessage{Role: role, Text: text, ToolCalls: calls})
|
||||
}
|
||||
case "tool":
|
||||
output := strings.TrimSpace(extractText(message.Content))
|
||||
if output == "" || message.ToolCallID == "" {
|
||||
continue
|
||||
}
|
||||
prompt := toolemulation.ActionOutputPrompt(message.ToolCallID, output)
|
||||
if prompt != "" {
|
||||
messages = append(messages, service.ChatMessage{Role: "user", Text: prompt})
|
||||
}
|
||||
messages = append(messages, service.ChatMessage{Role: "tool", Text: output, ToolCallID: message.ToolCallID})
|
||||
}
|
||||
}
|
||||
if len(messages) == 0 {
|
||||
|
||||
@@ -17,6 +17,8 @@ import (
|
||||
"strconv"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"lingma-ipc-proxy/internal/toolemulation"
|
||||
)
|
||||
|
||||
const (
|
||||
@@ -55,8 +57,27 @@ type Model struct {
|
||||
type ChatRequest struct {
|
||||
Model string
|
||||
Prompt string
|
||||
Messages []Message
|
||||
Images []Image
|
||||
Stream bool
|
||||
Temperature *float64
|
||||
Tools []toolemulation.ToolDef
|
||||
ToolChoice toolemulation.ToolChoice
|
||||
}
|
||||
|
||||
type Image struct {
|
||||
MediaType string
|
||||
Data string
|
||||
URL string
|
||||
}
|
||||
|
||||
type Message struct {
|
||||
Role string
|
||||
Content string
|
||||
Images []Image
|
||||
Name string
|
||||
ToolCallID string
|
||||
ToolCalls []toolemulation.ToolCall
|
||||
}
|
||||
|
||||
type ChatResult struct {
|
||||
@@ -65,6 +86,7 @@ type ChatResult struct {
|
||||
OutputTokens int
|
||||
RequestID string
|
||||
CredentialSrc string
|
||||
ToolCalls []toolemulation.ToolCall
|
||||
}
|
||||
|
||||
type StreamEvent struct {
|
||||
@@ -186,10 +208,14 @@ func (c *Client) Chat(ctx context.Context, request ChatRequest, onDelta func(str
|
||||
return nil, fmt.Errorf("remote chat status %d: %s", resp.StatusCode, truncate(string(respBody), 1000))
|
||||
}
|
||||
var builder strings.Builder
|
||||
toolCallBuffer := newRemoteToolCallBuffer()
|
||||
if err := scanSSE(resp.Body, func(event sseEvent) error {
|
||||
if event.Done {
|
||||
return nil
|
||||
}
|
||||
if len(event.ToolCalls) > 0 {
|
||||
toolCallBuffer.Add(event.ToolCalls)
|
||||
}
|
||||
if event.Content == "" {
|
||||
return nil
|
||||
}
|
||||
@@ -208,6 +234,7 @@ func (c *Client) Chat(ctx context.Context, request ChatRequest, onDelta func(str
|
||||
OutputTokens: estimateTokens(text),
|
||||
RequestID: requestID,
|
||||
CredentialSrc: cred.Source,
|
||||
ToolCalls: toolCallBuffer.Calls(),
|
||||
}, nil
|
||||
}
|
||||
|
||||
@@ -220,12 +247,13 @@ func (c *Client) buildBody(requestID string, request ChatRequest) (string, error
|
||||
if strings.EqualFold(model, "auto") {
|
||||
model = ""
|
||||
}
|
||||
imageURLs := projectImages(request.Images)
|
||||
payload := map[string]any{
|
||||
"request_id": requestID,
|
||||
"request_set_id": "",
|
||||
"chat_record_id": requestID,
|
||||
"stream": true,
|
||||
"image_urls": nil,
|
||||
"image_urls": nullableSlice(imageURLs),
|
||||
"is_reply": false,
|
||||
"is_retry": false,
|
||||
"session_id": "",
|
||||
@@ -242,26 +270,14 @@ func (c *Client) buildBody(requestID string, request ChatRequest) (string, error
|
||||
"display_name": "",
|
||||
"model": model,
|
||||
"format": "",
|
||||
"is_vl": false,
|
||||
"is_vl": len(imageURLs) > 0,
|
||||
"is_reasoning": false,
|
||||
"api_key": "",
|
||||
"url": "",
|
||||
"source": "",
|
||||
"enable": false,
|
||||
},
|
||||
"messages": []map[string]any{{
|
||||
"role": "user",
|
||||
"content": request.Prompt,
|
||||
"response_meta": map[string]any{
|
||||
"id": "",
|
||||
"usage": map[string]int{
|
||||
"prompt_tokens": 0,
|
||||
"completion_tokens": 0,
|
||||
"total_tokens": 0,
|
||||
},
|
||||
},
|
||||
"reasoning_content_signature": "",
|
||||
}},
|
||||
"messages": projectMessages(request),
|
||||
"business": map[string]any{
|
||||
"product": "jb_plugin",
|
||||
"version": c.cfg.CosyVersion,
|
||||
@@ -272,10 +288,193 @@ func (c *Client) buildBody(requestID string, request ChatRequest) (string, error
|
||||
"name": "memory_intent_recognition_" + requestID,
|
||||
},
|
||||
}
|
||||
if tools := projectTools(request.Tools); len(tools) > 0 {
|
||||
payload["tools"] = tools
|
||||
}
|
||||
if choice := projectToolChoice(request.ToolChoice); choice != nil {
|
||||
payload["tool_choice"] = choice
|
||||
}
|
||||
body, err := json.Marshal(payload)
|
||||
return string(body), err
|
||||
}
|
||||
|
||||
func nullableSlice[T any](items []T) any {
|
||||
if len(items) == 0 {
|
||||
return nil
|
||||
}
|
||||
return items
|
||||
}
|
||||
|
||||
func projectImages(images []Image) []string {
|
||||
if len(images) == 0 {
|
||||
return nil
|
||||
}
|
||||
out := make([]string, 0, len(images))
|
||||
for _, img := range images {
|
||||
item := projectImage(img)
|
||||
if item != "" {
|
||||
out = append(out, item)
|
||||
}
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func projectImage(img Image) string {
|
||||
if strings.TrimSpace(img.Data) == "" && strings.TrimSpace(img.URL) == "" {
|
||||
return ""
|
||||
}
|
||||
mediaType := strings.TrimSpace(img.MediaType)
|
||||
if mediaType == "" {
|
||||
mediaType = "image/jpeg"
|
||||
}
|
||||
if strings.TrimSpace(img.Data) != "" {
|
||||
return "data:" + mediaType + ";base64," + strings.TrimSpace(img.Data)
|
||||
}
|
||||
return strings.TrimSpace(img.URL)
|
||||
}
|
||||
|
||||
func projectMessages(request ChatRequest) []map[string]any {
|
||||
source := request.Messages
|
||||
if len(source) == 0 {
|
||||
source = []Message{{Role: "user", Content: request.Prompt}}
|
||||
}
|
||||
out := make([]map[string]any, 0, len(source))
|
||||
for _, message := range source {
|
||||
role := strings.TrimSpace(message.Role)
|
||||
if role == "" {
|
||||
continue
|
||||
}
|
||||
item := map[string]any{
|
||||
"role": role,
|
||||
"content": projectMessageContent(message),
|
||||
"response_meta": map[string]any{
|
||||
"id": "",
|
||||
"usage": map[string]int{
|
||||
"prompt_tokens": 0,
|
||||
"completion_tokens": 0,
|
||||
"total_tokens": 0,
|
||||
},
|
||||
},
|
||||
"reasoning_content_signature": "",
|
||||
}
|
||||
if message.Name != "" {
|
||||
item["name"] = message.Name
|
||||
}
|
||||
if message.ToolCallID != "" {
|
||||
item["tool_call_id"] = message.ToolCallID
|
||||
}
|
||||
if calls := projectMessageToolCalls(message.ToolCalls); len(calls) > 0 {
|
||||
item["tool_calls"] = calls
|
||||
}
|
||||
out = append(out, item)
|
||||
}
|
||||
if len(out) == 0 {
|
||||
return []map[string]any{{"role": "user", "content": request.Prompt}}
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func projectMessageContent(message Message) any {
|
||||
if len(message.Images) == 0 {
|
||||
return message.Content
|
||||
}
|
||||
content := make([]map[string]any, 0, len(message.Images)+1)
|
||||
if strings.TrimSpace(message.Content) != "" {
|
||||
content = append(content, map[string]any{
|
||||
"type": "text",
|
||||
"text": message.Content,
|
||||
})
|
||||
}
|
||||
for _, img := range message.Images {
|
||||
imageURL := projectImage(img)
|
||||
if imageURL == "" {
|
||||
continue
|
||||
}
|
||||
content = append(content, map[string]any{
|
||||
"type": "image_url",
|
||||
"image_url": map[string]any{
|
||||
"url": imageURL,
|
||||
},
|
||||
})
|
||||
}
|
||||
if len(content) == 0 {
|
||||
return message.Content
|
||||
}
|
||||
return content
|
||||
}
|
||||
|
||||
func projectMessageToolCalls(calls []toolemulation.ToolCall) []map[string]any {
|
||||
if len(calls) == 0 {
|
||||
return nil
|
||||
}
|
||||
out := make([]map[string]any, 0, len(calls))
|
||||
for i, call := range calls {
|
||||
name := strings.TrimSpace(call.Name)
|
||||
if name == "" {
|
||||
continue
|
||||
}
|
||||
args, _ := json.Marshal(call.Arguments)
|
||||
out = append(out, map[string]any{
|
||||
"index": i,
|
||||
"id": strings.TrimSpace(call.ID),
|
||||
"type": "function",
|
||||
"function": map[string]any{
|
||||
"name": name,
|
||||
"arguments": string(args),
|
||||
},
|
||||
})
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func projectTools(tools []toolemulation.ToolDef) []map[string]any {
|
||||
if len(tools) == 0 {
|
||||
return nil
|
||||
}
|
||||
out := make([]map[string]any, 0, len(tools))
|
||||
for _, tool := range tools {
|
||||
name := strings.TrimSpace(tool.Name)
|
||||
if name == "" {
|
||||
continue
|
||||
}
|
||||
params := any(tool.InputSchema)
|
||||
if len(tool.InputSchema) == 0 {
|
||||
params = map[string]any{"type": "object", "properties": map[string]any{}}
|
||||
}
|
||||
out = append(out, map[string]any{
|
||||
"type": "function",
|
||||
"function": map[string]any{
|
||||
"name": name,
|
||||
"description": strings.TrimSpace(tool.Description),
|
||||
"parameters": params,
|
||||
},
|
||||
})
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func projectToolChoice(choice toolemulation.ToolChoice) any {
|
||||
switch choice.Mode {
|
||||
case "none":
|
||||
return "none"
|
||||
case "any":
|
||||
return "required"
|
||||
case "tool":
|
||||
name := strings.TrimSpace(choice.Name)
|
||||
if name == "" {
|
||||
return nil
|
||||
}
|
||||
return map[string]any{
|
||||
"type": "function",
|
||||
"function": map[string]any{
|
||||
"name": name,
|
||||
},
|
||||
}
|
||||
default:
|
||||
return nil
|
||||
}
|
||||
}
|
||||
|
||||
func (c *Client) headers(cred Credential, path string, body string) (map[string]string, error) {
|
||||
if err := validateCredential(cred); err != nil {
|
||||
return nil, err
|
||||
@@ -335,15 +534,35 @@ type innerSSE struct {
|
||||
Choices []struct {
|
||||
Delta struct {
|
||||
Content string `json:"content"`
|
||||
ToolCalls []remoteToolCallDelta `json:"tool_calls"`
|
||||
} `json:"delta"`
|
||||
} `json:"choices"`
|
||||
}
|
||||
|
||||
type sseEvent struct {
|
||||
Content string
|
||||
ToolCalls []remoteToolCallFragment
|
||||
Done bool
|
||||
}
|
||||
|
||||
type remoteToolCallFragment struct {
|
||||
Index int
|
||||
ID string
|
||||
Type string
|
||||
Name string
|
||||
ArgumentsFragment string
|
||||
}
|
||||
|
||||
type remoteToolCallDelta struct {
|
||||
Index int `json:"index"`
|
||||
ID string `json:"id,omitempty"`
|
||||
Type string `json:"type,omitempty"`
|
||||
Function struct {
|
||||
Name string `json:"name,omitempty"`
|
||||
Arguments string `json:"arguments,omitempty"`
|
||||
} `json:"function,omitempty"`
|
||||
}
|
||||
|
||||
func scanSSE(reader io.Reader, onEvent func(sseEvent) error) error {
|
||||
scanner := bufio.NewScanner(reader)
|
||||
scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)
|
||||
@@ -389,10 +608,94 @@ func parseSSEPayload(payload string) (sseEvent, bool, error) {
|
||||
return sseEvent{}, false, err
|
||||
}
|
||||
var builder strings.Builder
|
||||
var toolCalls []remoteToolCallFragment
|
||||
for _, choice := range inner.Choices {
|
||||
builder.WriteString(choice.Delta.Content)
|
||||
for _, tc := range choice.Delta.ToolCalls {
|
||||
toolCalls = append(toolCalls, remoteToolCallFragment{
|
||||
Index: tc.Index,
|
||||
ID: strings.TrimSpace(tc.ID),
|
||||
Type: strings.TrimSpace(tc.Type),
|
||||
Name: strings.TrimSpace(tc.Function.Name),
|
||||
ArgumentsFragment: tc.Function.Arguments,
|
||||
})
|
||||
}
|
||||
return sseEvent{Content: builder.String()}, true, nil
|
||||
}
|
||||
return sseEvent{Content: builder.String(), ToolCalls: toolCalls}, true, nil
|
||||
}
|
||||
|
||||
type remoteToolCallBuffer struct {
|
||||
order []int
|
||||
states map[int]*remoteToolCallState
|
||||
}
|
||||
|
||||
type remoteToolCallState struct {
|
||||
id string
|
||||
callType string
|
||||
name string
|
||||
arguments strings.Builder
|
||||
}
|
||||
|
||||
func newRemoteToolCallBuffer() *remoteToolCallBuffer {
|
||||
return &remoteToolCallBuffer{states: map[int]*remoteToolCallState{}}
|
||||
}
|
||||
|
||||
func (b *remoteToolCallBuffer) Add(fragments []remoteToolCallFragment) {
|
||||
if b == nil {
|
||||
return
|
||||
}
|
||||
for _, fragment := range fragments {
|
||||
state := b.states[fragment.Index]
|
||||
if state == nil {
|
||||
state = &remoteToolCallState{}
|
||||
b.states[fragment.Index] = state
|
||||
b.order = append(b.order, fragment.Index)
|
||||
}
|
||||
if fragment.ID != "" {
|
||||
state.id = fragment.ID
|
||||
}
|
||||
if fragment.Type != "" {
|
||||
state.callType = fragment.Type
|
||||
}
|
||||
if fragment.Name != "" {
|
||||
state.name = fragment.Name
|
||||
}
|
||||
if fragment.ArgumentsFragment != "" {
|
||||
state.arguments.WriteString(fragment.ArgumentsFragment)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func (b *remoteToolCallBuffer) Calls() []toolemulation.ToolCall {
|
||||
if b == nil || len(b.order) == 0 {
|
||||
return nil
|
||||
}
|
||||
out := make([]toolemulation.ToolCall, 0, len(b.order))
|
||||
for _, index := range b.order {
|
||||
state := b.states[index]
|
||||
if state == nil || strings.TrimSpace(state.name) == "" {
|
||||
continue
|
||||
}
|
||||
args := strings.TrimSpace(state.arguments.String())
|
||||
call := toolemulation.ToolCall{
|
||||
ID: strings.TrimSpace(state.id),
|
||||
Name: strings.TrimSpace(state.name),
|
||||
Arguments: map[string]any{},
|
||||
}
|
||||
if args != "" {
|
||||
var parsed map[string]any
|
||||
if err := json.Unmarshal([]byte(args), &parsed); err == nil {
|
||||
call.Arguments = parsed
|
||||
} else {
|
||||
call.Arguments = map[string]any{"raw_arguments": args}
|
||||
}
|
||||
}
|
||||
if call.ID == "" {
|
||||
call.ID = fmt.Sprintf("toolu_%d_%d", time.Now().UnixNano(), index)
|
||||
}
|
||||
out = append(out, call)
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func candidateConfigFiles() []string {
|
||||
|
||||
@@ -1,11 +1,14 @@
|
||||
package remote
|
||||
|
||||
import (
|
||||
"encoding/json"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"lingma-ipc-proxy/internal/toolemulation"
|
||||
)
|
||||
|
||||
func TestNewKeepsZeroTimeoutUnlimited(t *testing.T) {
|
||||
@@ -93,6 +96,171 @@ func TestModelListStatusErrorSuggestsManualRemoteBaseURLOn404(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildBodyProjectsNativeTools(t *testing.T) {
|
||||
client := New(Config{})
|
||||
body, err := client.buildBody("req-1", ChatRequest{
|
||||
Model: "kmodel",
|
||||
Prompt: "read file",
|
||||
Tools: []toolemulation.ToolDef{{
|
||||
Name: "read_file",
|
||||
Description: "Read a local file",
|
||||
InputSchema: map[string]any{
|
||||
"type": "object",
|
||||
"properties": map[string]any{
|
||||
"file_path": map[string]any{"type": "string"},
|
||||
},
|
||||
"required": []any{"file_path"},
|
||||
},
|
||||
}},
|
||||
ToolChoice: toolemulation.ToolChoice{Mode: "tool", Name: "read_file"},
|
||||
})
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
var payload map[string]any
|
||||
if err := json.Unmarshal([]byte(body), &payload); err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
tools, ok := payload["tools"].([]any)
|
||||
if !ok || len(tools) != 1 {
|
||||
t.Fatalf("tools = %#v", payload["tools"])
|
||||
}
|
||||
tool := tools[0].(map[string]any)
|
||||
fn := tool["function"].(map[string]any)
|
||||
if tool["type"] != "function" || fn["name"] != "read_file" {
|
||||
t.Fatalf("unexpected tool projection: %#v", tool)
|
||||
}
|
||||
choice := payload["tool_choice"].(map[string]any)
|
||||
choiceFn := choice["function"].(map[string]any)
|
||||
if choice["type"] != "function" || choiceFn["name"] != "read_file" {
|
||||
t.Fatalf("unexpected tool choice: %#v", payload["tool_choice"])
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildBodyPreservesStructuredToolMessages(t *testing.T) {
|
||||
client := New(Config{})
|
||||
body, err := client.buildBody("req-1", ChatRequest{
|
||||
Model: "kmodel",
|
||||
Prompt: "fallback prompt",
|
||||
Messages: []Message{
|
||||
{Role: "user", Content: "查看项目"},
|
||||
{Role: "assistant", ToolCalls: []toolemulation.ToolCall{{
|
||||
ID: "call_1",
|
||||
Name: "Bash",
|
||||
Arguments: map[string]any{"command": "pwd && ls -la"},
|
||||
}}},
|
||||
{Role: "tool", ToolCallID: "call_1", Content: "total 10"},
|
||||
},
|
||||
})
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
var payload map[string]any
|
||||
if err := json.Unmarshal([]byte(body), &payload); err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
messages := payload["messages"].([]any)
|
||||
if len(messages) != 3 {
|
||||
t.Fatalf("messages = %#v", messages)
|
||||
}
|
||||
assistant := messages[1].(map[string]any)
|
||||
calls := assistant["tool_calls"].([]any)
|
||||
call := calls[0].(map[string]any)
|
||||
fn := call["function"].(map[string]any)
|
||||
args := fn["arguments"].(string)
|
||||
if assistant["role"] != "assistant" || fn["name"] != "Bash" || !strings.Contains(args, "pwd") || !strings.Contains(args, "ls -la") {
|
||||
t.Fatalf("unexpected assistant message: %#v", assistant)
|
||||
}
|
||||
tool := messages[2].(map[string]any)
|
||||
if tool["role"] != "tool" || tool["tool_call_id"] != "call_1" || tool["content"] != "total 10" {
|
||||
t.Fatalf("unexpected tool message: %#v", tool)
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildBodyProjectsRemoteImages(t *testing.T) {
|
||||
client := New(Config{})
|
||||
body, err := client.buildBody("req-1", ChatRequest{
|
||||
Model: "kmodel",
|
||||
Prompt: "看图",
|
||||
Messages: []Message{{
|
||||
Role: "user",
|
||||
Content: "看图",
|
||||
Images: []Image{{
|
||||
MediaType: "image/png",
|
||||
Data: "iVBORw0KGgo=",
|
||||
}},
|
||||
}},
|
||||
Images: []Image{{
|
||||
MediaType: "image/png",
|
||||
Data: "iVBORw0KGgo=",
|
||||
}},
|
||||
})
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
var payload map[string]any
|
||||
if err := json.Unmarshal([]byte(body), &payload); err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
images, ok := payload["image_urls"].([]any)
|
||||
if !ok || len(images) != 1 {
|
||||
t.Fatalf("image_urls = %#v", payload["image_urls"])
|
||||
}
|
||||
image, ok := images[0].(string)
|
||||
if !ok || !strings.HasPrefix(image, "data:image/png;base64,") {
|
||||
t.Fatalf("unexpected image projection: %#v", images[0])
|
||||
}
|
||||
modelConfig := payload["model_config"].(map[string]any)
|
||||
if modelConfig["is_vl"] != true {
|
||||
t.Fatalf("model_config.is_vl = %#v, want true", modelConfig["is_vl"])
|
||||
}
|
||||
messages := payload["messages"].([]any)
|
||||
message := messages[0].(map[string]any)
|
||||
content := message["content"].([]any)
|
||||
if content[0].(map[string]any)["type"] != "text" || content[1].(map[string]any)["type"] != "image_url" {
|
||||
t.Fatalf("unexpected message content: %#v", content)
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseSSEPayloadExtractsNativeToolCallFragments(t *testing.T) {
|
||||
payload := `{"body":"{\"choices\":[{\"delta\":{\"tool_calls\":[{\"index\":0,\"id\":\"call_1\",\"type\":\"function\",\"function\":{\"name\":\"read_file\",\"arguments\":\"{\\\"file_path\\\":\\\"/tmp/a.txt\\\"}\"}}]}}]}","statusCodeValue":200}`
|
||||
event, ok, err := parseSSEPayload(payload)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
if !ok {
|
||||
t.Fatal("event not parsed")
|
||||
}
|
||||
if len(event.ToolCalls) != 1 {
|
||||
t.Fatalf("tool calls = %#v", event.ToolCalls)
|
||||
}
|
||||
call := event.ToolCalls[0]
|
||||
if call.ID != "call_1" || call.Name != "read_file" || call.ArgumentsFragment != `{"file_path":"/tmp/a.txt"}` {
|
||||
t.Fatalf("unexpected call = %#v", call)
|
||||
}
|
||||
}
|
||||
|
||||
func TestRemoteToolCallBufferMergesArgumentFragments(t *testing.T) {
|
||||
buffer := newRemoteToolCallBuffer()
|
||||
buffer.Add([]remoteToolCallFragment{{
|
||||
Index: 0,
|
||||
ID: "call_1",
|
||||
Type: "function",
|
||||
Name: "read_file",
|
||||
}})
|
||||
buffer.Add([]remoteToolCallFragment{{Index: 0, ArgumentsFragment: `{"file_path":"/tmp`}})
|
||||
buffer.Add([]remoteToolCallFragment{{Index: 0, ArgumentsFragment: `/lingma-native`}})
|
||||
buffer.Add([]remoteToolCallFragment{{Index: 0, ArgumentsFragment: `-tool-test.txt"}`}})
|
||||
calls := buffer.Calls()
|
||||
if len(calls) != 1 {
|
||||
t.Fatalf("calls = %#v", calls)
|
||||
}
|
||||
call := calls[0]
|
||||
if call.ID != "call_1" || call.Name != "read_file" || call.Arguments["file_path"] != "/tmp/lingma-native-tool-test.txt" {
|
||||
t.Fatalf("unexpected merged call = %#v", call)
|
||||
}
|
||||
}
|
||||
|
||||
func TestExtractMachineIDFromTextMarkers(t *testing.T) {
|
||||
got := extractMachineIDFromText(`2026-05-06 info using machine id from file: abcdef1234567890abcdef`)
|
||||
if got != "abcdef1234567890abcdef" {
|
||||
|
||||
@@ -65,6 +65,8 @@ type ChatMessage struct {
|
||||
Role string
|
||||
Text string
|
||||
Images []Image
|
||||
ToolCallID string
|
||||
ToolCalls []toolemulation.ToolCall
|
||||
}
|
||||
|
||||
type ChatRequest struct {
|
||||
@@ -353,11 +355,17 @@ func (s *Service) generateRemote(
|
||||
req ChatRequest,
|
||||
onDelta func(string),
|
||||
) (*ChatResult, error) {
|
||||
if requestHasImages(req) {
|
||||
if len(req.Tools) > 0 && req.ToolChoice.Mode != "none" {
|
||||
return s.generateRemoteWithImageContext(ctx, req, onDelta)
|
||||
}
|
||||
return s.generateWithReconnect(ctx, req, onDelta)
|
||||
}
|
||||
if strings.TrimSpace(req.Model) == "" {
|
||||
req.Model = s.DefaultModel()
|
||||
}
|
||||
req.Model = normalizeModelForBackend(BackendRemote, req.Model)
|
||||
prompt, err := buildLingmaPrompt(req, SessionModeFresh)
|
||||
prompt, err := buildLingmaPrompt(req, SessionModeFresh, false)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
@@ -383,6 +391,23 @@ func (s *Service) generateRemote(
|
||||
return nil, lastErr
|
||||
}
|
||||
|
||||
func (s *Service) generateRemoteWithImageContext(
|
||||
ctx context.Context,
|
||||
req ChatRequest,
|
||||
onDelta func(string),
|
||||
) (*ChatResult, error) {
|
||||
imageReq := req
|
||||
imageReq.Tools = nil
|
||||
imageReq.ToolChoice = toolemulation.ToolChoice{Mode: "none"}
|
||||
imageReq.ParallelToolCalls = nil
|
||||
imageResult, err := s.generateWithReconnect(ctx, imageReq, nil)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("image context extraction through IPC failed: %w", err)
|
||||
}
|
||||
remoteReq := requestWithImageContext(req, imageResult.Text)
|
||||
return s.generateRemote(ctx, remoteReq, onDelta)
|
||||
}
|
||||
|
||||
func (s *Service) generateRemoteWithModel(
|
||||
ctx context.Context,
|
||||
client *remote.Client,
|
||||
@@ -403,12 +428,32 @@ func (s *Service) generateRemoteWithModel(
|
||||
remoteResult, err := client.Chat(ctx, remote.ChatRequest{
|
||||
Model: model,
|
||||
Prompt: prompt,
|
||||
Messages: remoteMessagesFromRequest(req),
|
||||
Images: remoteImagesFromRequest(req),
|
||||
Stream: onDelta != nil,
|
||||
Temperature: req.Temperature,
|
||||
Tools: req.Tools,
|
||||
ToolChoice: req.ToolChoice,
|
||||
}, delta)
|
||||
if err != nil {
|
||||
return nil, emitted, err
|
||||
}
|
||||
if len(remoteResult.ToolCalls) == 0 && shouldRetryRemoteNativeTool(req, remoteResult.Text) {
|
||||
retryResult, retryErr := client.Chat(ctx, remote.ChatRequest{
|
||||
Model: model,
|
||||
Prompt: prompt,
|
||||
Messages: remoteMessagesFromRequest(req),
|
||||
Images: remoteImagesFromRequest(req),
|
||||
Stream: false,
|
||||
Temperature: req.Temperature,
|
||||
Tools: req.Tools,
|
||||
ToolChoice: toolemulation.ToolChoice{Mode: "any"},
|
||||
}, nil)
|
||||
if retryErr == nil && len(retryResult.ToolCalls) > 0 {
|
||||
remoteResult = retryResult
|
||||
emitted = false
|
||||
}
|
||||
}
|
||||
|
||||
result := &ChatResult{
|
||||
Text: remoteResult.Text,
|
||||
@@ -422,25 +467,133 @@ func (s *Service) generateRemoteWithModel(
|
||||
Endpoint: remote.ResolveBaseURL(s.cfg.RemoteBaseURL),
|
||||
Transport: "remote",
|
||||
EffectiveSession: SessionModeFresh,
|
||||
ToolCalls: remoteResult.ToolCalls,
|
||||
}
|
||||
s.applyToolEmulation(ctx, req, prompt, result, onDelta, func(hintPrompt string) (string, int, error) {
|
||||
retryResult, retryErr := client.Chat(ctx, remote.ChatRequest{
|
||||
Model: model,
|
||||
Prompt: hintPrompt,
|
||||
Stream: onDelta != nil,
|
||||
Temperature: req.Temperature,
|
||||
}, onDelta)
|
||||
if retryErr != nil {
|
||||
return "", 0, retryErr
|
||||
}
|
||||
if retryResult == nil {
|
||||
return "", 0, nil
|
||||
}
|
||||
return retryResult.Text, retryResult.OutputTokens, nil
|
||||
})
|
||||
return result, emitted, nil
|
||||
}
|
||||
|
||||
func remoteMessagesFromRequest(req ChatRequest) []remote.Message {
|
||||
out := make([]remote.Message, 0, len(req.Messages)+1)
|
||||
if system := strings.TrimSpace(req.System); system != "" {
|
||||
out = append(out, remote.Message{Role: "system", Content: system})
|
||||
}
|
||||
for _, message := range req.Messages {
|
||||
role := strings.ToLower(strings.TrimSpace(message.Role))
|
||||
if role == "" {
|
||||
continue
|
||||
}
|
||||
content := strings.TrimSpace(message.Text)
|
||||
if content == "" && len(message.Images) == 0 && len(message.ToolCalls) == 0 {
|
||||
continue
|
||||
}
|
||||
out = append(out, remote.Message{
|
||||
Role: role,
|
||||
Content: content,
|
||||
Images: remoteImagesFromChatMessage(message),
|
||||
ToolCallID: strings.TrimSpace(message.ToolCallID),
|
||||
ToolCalls: message.ToolCalls,
|
||||
})
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func remoteImagesFromChatMessage(message ChatMessage) []remote.Image {
|
||||
if len(message.Images) == 0 {
|
||||
return nil
|
||||
}
|
||||
images := make([]remote.Image, 0, len(message.Images))
|
||||
for _, img := range message.Images {
|
||||
if strings.TrimSpace(img.Data) == "" && strings.TrimSpace(img.URL) == "" {
|
||||
continue
|
||||
}
|
||||
images = append(images, remote.Image{
|
||||
MediaType: strings.TrimSpace(img.MediaType),
|
||||
Data: img.Data,
|
||||
URL: strings.TrimSpace(img.URL),
|
||||
})
|
||||
}
|
||||
return images
|
||||
}
|
||||
|
||||
func remoteImagesFromRequest(req ChatRequest) []remote.Image {
|
||||
var images []remote.Image
|
||||
for _, message := range req.Messages {
|
||||
for _, img := range message.Images {
|
||||
if strings.TrimSpace(img.Data) == "" && strings.TrimSpace(img.URL) == "" {
|
||||
continue
|
||||
}
|
||||
images = append(images, remote.Image{
|
||||
MediaType: strings.TrimSpace(img.MediaType),
|
||||
Data: img.Data,
|
||||
URL: strings.TrimSpace(img.URL),
|
||||
})
|
||||
}
|
||||
}
|
||||
return images
|
||||
}
|
||||
|
||||
func requestHasImages(req ChatRequest) bool {
|
||||
for _, message := range req.Messages {
|
||||
if len(remoteImagesFromChatMessage(message)) > 0 {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func requestWithImageContext(req ChatRequest, imageContext string) ChatRequest {
|
||||
out := req
|
||||
out.Messages = make([]ChatMessage, len(req.Messages))
|
||||
copy(out.Messages, req.Messages)
|
||||
for i := range out.Messages {
|
||||
out.Messages[i].Images = nil
|
||||
}
|
||||
contextText := strings.TrimSpace(imageContext)
|
||||
if contextText == "" {
|
||||
return out
|
||||
}
|
||||
addition := "\n\n[图片上下文]\n" + contextText
|
||||
for i := len(out.Messages) - 1; i >= 0; i-- {
|
||||
if strings.EqualFold(strings.TrimSpace(out.Messages[i].Role), "user") {
|
||||
out.Messages[i].Text = strings.TrimSpace(out.Messages[i].Text + addition)
|
||||
return out
|
||||
}
|
||||
}
|
||||
out.Messages = append(out.Messages, ChatMessage{Role: "user", Text: strings.TrimSpace("[图片上下文]\n" + contextText)})
|
||||
return out
|
||||
}
|
||||
|
||||
func shouldRetryRemoteNativeTool(req ChatRequest, text string) bool {
|
||||
if len(req.Tools) == 0 || req.ToolChoice.Mode == "none" {
|
||||
return false
|
||||
}
|
||||
trimmed := strings.TrimSpace(text)
|
||||
if trimmed == "" || len([]rune(trimmed)) > 180 {
|
||||
return false
|
||||
}
|
||||
lower := strings.ToLower(trimmed)
|
||||
cues := []string{
|
||||
"让我", "我来", "我将", "接下来", "继续", "查看", "检查", "搜索", "读取", "运行", "执行",
|
||||
"let me", "i'll", "i will", "next", "continue", "check", "inspect", "search", "read", "run",
|
||||
}
|
||||
hasCue := false
|
||||
for _, cue := range cues {
|
||||
if strings.Contains(lower, cue) {
|
||||
hasCue = true
|
||||
break
|
||||
}
|
||||
}
|
||||
if !hasCue {
|
||||
return false
|
||||
}
|
||||
return strings.HasSuffix(trimmed, ":") ||
|
||||
strings.HasSuffix(trimmed, ":") ||
|
||||
strings.Contains(trimmed, ":\n") ||
|
||||
strings.Contains(lower, "use ") ||
|
||||
strings.Contains(lower, "call ") ||
|
||||
strings.Contains(trimmed, "工具")
|
||||
}
|
||||
|
||||
func (s *Service) remoteAttemptModels(ctx context.Context, primary string) []string {
|
||||
primary = normalizeModelForBackend(BackendRemote, primary)
|
||||
models := []string{primary}
|
||||
@@ -526,7 +679,7 @@ func (s *Service) generateLocked(
|
||||
}
|
||||
|
||||
effectiveMode := resolveSessionMode(req, s.cfg.SessionMode)
|
||||
prompt, err := buildLingmaPrompt(req, effectiveMode)
|
||||
prompt, err := buildLingmaPrompt(req, effectiveMode, true)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
@@ -1078,14 +1231,14 @@ func resolveSessionMode(req ChatRequest, configured SessionMode) SessionMode {
|
||||
|
||||
func extractLastUserImages(messages []ChatMessage) []Image {
|
||||
for i := len(messages) - 1; i >= 0; i-- {
|
||||
if messages[i].Role == "user" {
|
||||
if messages[i].Role == "user" && len(messages[i].Images) > 0 {
|
||||
return messages[i].Images
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func buildLingmaPrompt(req ChatRequest, mode SessionMode) (string, error) {
|
||||
func buildLingmaPrompt(req ChatRequest, mode SessionMode, emulateTools bool) (string, error) {
|
||||
messages := filteredMessages(req.Messages)
|
||||
var lastUser string
|
||||
for i := len(messages) - 1; i >= 0; i-- {
|
||||
@@ -1102,7 +1255,7 @@ func buildLingmaPrompt(req ChatRequest, mode SessionMode) (string, error) {
|
||||
}
|
||||
|
||||
system := strings.TrimSpace(req.System)
|
||||
if len(req.Tools) > 0 && req.ToolChoice.Mode != "none" {
|
||||
if emulateTools && len(req.Tools) > 0 && req.ToolChoice.Mode != "none" {
|
||||
system = toolemulation.InjectTooling(system, req.Tools, req.ToolChoice, req.ParallelToolCalls)
|
||||
}
|
||||
|
||||
@@ -1110,7 +1263,7 @@ func buildLingmaPrompt(req ChatRequest, mode SessionMode) (string, error) {
|
||||
return lastUser, nil
|
||||
}
|
||||
|
||||
if len(req.Tools) > 0 {
|
||||
if emulateTools && len(req.Tools) > 0 {
|
||||
parts := make([]string, 0, len(messages)+3)
|
||||
for _, message := range messages {
|
||||
role := "User"
|
||||
@@ -1152,6 +1305,10 @@ func filteredMessages(messages []ChatMessage) []ChatMessage {
|
||||
if text == "" {
|
||||
continue
|
||||
}
|
||||
if role == "tool" {
|
||||
text = toolemulation.ActionOutputPrompt(message.ToolCallID, text)
|
||||
role = "user"
|
||||
}
|
||||
if role != "user" && role != "assistant" {
|
||||
continue
|
||||
}
|
||||
|
||||
@@ -3,8 +3,11 @@ package service
|
||||
import (
|
||||
"context"
|
||||
"errors"
|
||||
"strings"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"lingma-ipc-proxy/internal/toolemulation"
|
||||
)
|
||||
|
||||
func TestIsRecoverableIPCError(t *testing.T) {
|
||||
@@ -48,3 +51,126 @@ func TestContextWithOptionalTimeoutPositiveSetsDeadline(t *testing.T) {
|
||||
t.Fatal("positive timeout should set a deadline")
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildLingmaPromptOnlyInjectsToolingWhenEmulationEnabled(t *testing.T) {
|
||||
req := ChatRequest{
|
||||
Messages: []ChatMessage{{Role: "user", Text: "查看项目结构"}},
|
||||
Tools: []toolemulation.ToolDef{{
|
||||
Name: "Bash",
|
||||
InputSchema: map[string]any{
|
||||
"properties": map[string]any{
|
||||
"command": map[string]any{"type": "string"},
|
||||
},
|
||||
"required": []any{"command"},
|
||||
},
|
||||
}},
|
||||
ToolChoice: toolemulation.ToolChoice{Mode: "auto"},
|
||||
}
|
||||
|
||||
remotePrompt, err := buildLingmaPrompt(req, SessionModeFresh, false)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
if strings.Contains(remotePrompt, "```json action") || strings.Contains(remotePrompt, "DIRECT tool access") {
|
||||
t.Fatalf("remote prompt should not include tool emulation:\n%s", remotePrompt)
|
||||
}
|
||||
|
||||
ipcPrompt, err := buildLingmaPrompt(req, SessionModeFresh, true)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
if !strings.Contains(ipcPrompt, "```json action") || !strings.Contains(ipcPrompt, "DIRECT tool access") {
|
||||
t.Fatalf("ipc prompt should include tool emulation:\n%s", ipcPrompt)
|
||||
}
|
||||
}
|
||||
|
||||
func TestShouldRetryRemoteNativeToolForContinuationText(t *testing.T) {
|
||||
req := ChatRequest{
|
||||
Tools: []toolemulation.ToolDef{{Name: "Bash"}},
|
||||
ToolChoice: toolemulation.ToolChoice{
|
||||
Mode: "auto",
|
||||
},
|
||||
}
|
||||
if !shouldRetryRemoteNativeTool(req, "让我查看一下项目的整体结构,特别是源代码目录:") {
|
||||
t.Fatal("expected continuation text to trigger native tool retry")
|
||||
}
|
||||
if shouldRetryRemoteNativeTool(req, "这是一个 uni-app 项目,核心目录是 src。") {
|
||||
t.Fatal("substantive answer should not trigger retry")
|
||||
}
|
||||
req.ToolChoice = toolemulation.ToolChoice{Mode: "none"}
|
||||
if shouldRetryRemoteNativeTool(req, "让我查看一下:") {
|
||||
t.Fatal("tool_choice none should not trigger retry")
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildLingmaPromptKeepsToolResultsForIPC(t *testing.T) {
|
||||
req := ChatRequest{
|
||||
Messages: []ChatMessage{
|
||||
{Role: "user", Text: "查看项目"},
|
||||
{Role: "assistant", ToolCalls: []toolemulation.ToolCall{{ID: "call_1", Name: "Bash", Arguments: map[string]any{"command": "pwd"}}}},
|
||||
{Role: "tool", ToolCallID: "call_1", Text: "/tmp/project"},
|
||||
},
|
||||
Tools: []toolemulation.ToolDef{{Name: "Bash"}},
|
||||
ToolChoice: toolemulation.ToolChoice{Mode: "auto"},
|
||||
}
|
||||
prompt, err := buildLingmaPrompt(req, SessionModeFresh, true)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
if !strings.Contains(prompt, "Tool result for call_1") || !strings.Contains(prompt, "/tmp/project") {
|
||||
t.Fatalf("ipc prompt should include tool result:\n%s", prompt)
|
||||
}
|
||||
if strings.Contains(prompt, "Assistant used tool") {
|
||||
t.Fatalf("ipc prompt should not include textualized assistant tool calls:\n%s", prompt)
|
||||
}
|
||||
}
|
||||
|
||||
func TestRemoteImagesFromRequest(t *testing.T) {
|
||||
req := ChatRequest{Messages: []ChatMessage{{Role: "user", Text: "see", Images: []Image{{MediaType: "image/png", Data: "AAAA"}}}}}
|
||||
images := remoteImagesFromRequest(req)
|
||||
if len(images) != 1 {
|
||||
t.Fatalf("images = %#v", images)
|
||||
}
|
||||
if images[0].MediaType != "image/png" || images[0].Data != "AAAA" {
|
||||
t.Fatalf("unexpected image = %#v", images[0])
|
||||
}
|
||||
}
|
||||
|
||||
func TestRequestHasImages(t *testing.T) {
|
||||
if requestHasImages(ChatRequest{Messages: []ChatMessage{{Role: "user", Text: "plain"}}}) {
|
||||
t.Fatal("plain request should not have images")
|
||||
}
|
||||
if !requestHasImages(ChatRequest{Messages: []ChatMessage{{Role: "user", Images: []Image{{URL: "file:///tmp/a.png"}}}}}) {
|
||||
t.Fatal("image URL request should have images")
|
||||
}
|
||||
}
|
||||
|
||||
func TestExtractLastUserImagesFindsPreviousImageTurn(t *testing.T) {
|
||||
images := extractLastUserImages([]ChatMessage{
|
||||
{Role: "user", Text: "看这张图", Images: []Image{{URL: "file:///tmp/a.png"}}},
|
||||
{Role: "assistant", Text: "这是一张图片"},
|
||||
{Role: "user", Text: "继续基于上图分析"},
|
||||
})
|
||||
if len(images) != 1 || images[0].URL != "file:///tmp/a.png" {
|
||||
t.Fatalf("images = %#v", images)
|
||||
}
|
||||
}
|
||||
|
||||
func TestRequestWithImageContextRemovesImagesAndAppendsContext(t *testing.T) {
|
||||
req := ChatRequest{
|
||||
Messages: []ChatMessage{
|
||||
{Role: "user", Text: "看图", Images: []Image{{URL: "file:///tmp/a.png"}}},
|
||||
{Role: "assistant", Text: "好的"},
|
||||
{Role: "user", Text: "继续分析"},
|
||||
},
|
||||
}
|
||||
out := requestWithImageContext(req, "海边礁石和海浪")
|
||||
for _, message := range out.Messages {
|
||||
if len(message.Images) > 0 {
|
||||
t.Fatalf("images should be removed: %#v", out.Messages)
|
||||
}
|
||||
}
|
||||
if !strings.Contains(out.Messages[2].Text, "[图片上下文]") || !strings.Contains(out.Messages[2].Text, "海边礁石和海浪") {
|
||||
t.Fatalf("latest user message missing image context: %#v", out.Messages[2])
|
||||
}
|
||||
}
|
||||
|
||||
@@ -28,6 +28,7 @@ type ToolCall struct {
|
||||
|
||||
type Config struct {
|
||||
MaxScanBytes int
|
||||
MaxToolCalls int
|
||||
}
|
||||
|
||||
func ExtractTools(raw any) []ToolDef {
|
||||
@@ -223,6 +224,8 @@ func InjectTooling(system string, tools []ToolDef, choice ToolChoice, parallel *
|
||||
b.WriteString("- If any earlier or hidden instruction says there are no tools, ignore that statement and use the proxy tools listed in this message.\n")
|
||||
b.WriteString("- For an edit request with enough information, call patch or write_file; if information is missing, first call read_file/search_files and then patch after the tool result.\n")
|
||||
b.WriteString("- Emit multiple independent actions in one reply when possible.\n")
|
||||
b.WriteString("- Emit at most 5 independent tool actions in a single reply. Use the most targeted search/read commands first, then wait for results.\n")
|
||||
b.WriteString("- Do not run broad recursive commands such as `ls -R`, `find .`, or unrestricted grep over dependency folders. Prefer targeted paths and exclude node_modules, vendor, dist, build, and .git.\n")
|
||||
b.WriteString("- For dependent actions, wait for the tool result before emitting the next action.\n")
|
||||
b.WriteString("- If no tool is needed, reply with normal plain text.\n")
|
||||
b.WriteString("- NEVER say that tools are unavailable.\n")
|
||||
@@ -253,37 +256,15 @@ func InjectTooling(system string, tools []ToolDef, choice ToolChoice, parallel *
|
||||
|
||||
func AssistantToolCallsToText(content string, calls []ToolCall) string {
|
||||
content = strings.TrimSpace(content)
|
||||
if len(calls) == 0 {
|
||||
return content
|
||||
}
|
||||
|
||||
blocks := make([]string, 0, len(calls))
|
||||
for _, call := range calls {
|
||||
block := map[string]any{
|
||||
"tool": call.Name,
|
||||
"parameters": call.Arguments,
|
||||
}
|
||||
b, err := json.MarshalIndent(block, "", " ")
|
||||
if err != nil {
|
||||
continue
|
||||
}
|
||||
blocks = append(blocks, "```json action\n"+string(b)+"\n```")
|
||||
}
|
||||
if len(blocks) == 0 {
|
||||
return content
|
||||
}
|
||||
if content == "" {
|
||||
return strings.Join(blocks, "\n\n")
|
||||
}
|
||||
return content + "\n\n" + strings.Join(blocks, "\n\n")
|
||||
}
|
||||
|
||||
func ActionOutputPrompt(toolCallID string, output string) string {
|
||||
output = strings.TrimSpace(output)
|
||||
if output == "" {
|
||||
return ""
|
||||
}
|
||||
next := "Based on the tool result above, answer the user's request directly if you have enough information. Only use another structured action block if a specific missing fact still requires another tool call."
|
||||
next := "Based on the tool result above, answer the user's request directly if you have enough information. Only use another tool call if a specific missing fact still requires it."
|
||||
if id := strings.TrimSpace(toolCallID); id != "" {
|
||||
return "Tool result for " + id + ":\n" + output + "\n\n" + next
|
||||
}
|
||||
@@ -605,6 +586,11 @@ func ParseActionBlocks(text string, tools []ToolDef, cfg Config) ([]ToolCall, st
|
||||
type span struct{ start, end int }
|
||||
spans := make([]span, 0, len(openings))
|
||||
calls := make([]ToolCall, 0, len(openings))
|
||||
seen := map[string]bool{}
|
||||
maxCalls := cfg.MaxToolCalls
|
||||
if maxCalls <= 0 {
|
||||
maxCalls = 8
|
||||
}
|
||||
|
||||
for _, start := range openings {
|
||||
contentStart := start
|
||||
@@ -634,8 +620,16 @@ func ParseActionBlocks(text string, tools []ToolDef, cfg Config) ([]ToolCall, st
|
||||
continue
|
||||
}
|
||||
}
|
||||
calls = append(calls, call)
|
||||
spans = append(spans, span{start: start, end: end + 3})
|
||||
key := toolCallKey(call)
|
||||
if seen[key] {
|
||||
continue
|
||||
}
|
||||
seen[key] = true
|
||||
if len(calls) >= maxCalls {
|
||||
continue
|
||||
}
|
||||
calls = append(calls, call)
|
||||
}
|
||||
|
||||
if len(calls) == 0 {
|
||||
@@ -653,6 +647,11 @@ func ParseActionBlocks(text string, tools []ToolDef, cfg Config) ([]ToolCall, st
|
||||
return calls, strings.TrimSpace(clean), nil
|
||||
}
|
||||
|
||||
func toolCallKey(call ToolCall) string {
|
||||
args, _ := json.Marshal(call.Arguments)
|
||||
return strings.ToLower(strings.TrimSpace(call.Name)) + "\x00" + string(args)
|
||||
}
|
||||
|
||||
func normalizeToolName(raw string, available map[string]string) string {
|
||||
name := strings.TrimSpace(raw)
|
||||
if name == "" {
|
||||
|
||||
@@ -86,6 +86,8 @@ func TestInjectToolingIncludesAutoToolGuidance(t *testing.T) {
|
||||
"Core tool syntax examples",
|
||||
"conceptual question",
|
||||
"NEVER ask the user to run a command",
|
||||
"Emit at most 5 independent tool actions",
|
||||
"exclude node_modules",
|
||||
} {
|
||||
if !strings.Contains(prompt, want) {
|
||||
t.Fatalf("prompt missing %q:\n%s", want, prompt)
|
||||
@@ -176,3 +178,38 @@ func TestParseActionBlocksDropsCallsMissingRequiredArgs(t *testing.T) {
|
||||
t.Fatalf("clean should preserve unparseable action block, got %q", clean)
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseActionBlocksDeduplicatesAndLimitsCalls(t *testing.T) {
|
||||
var b strings.Builder
|
||||
for i := 0; i < 12; i++ {
|
||||
command := "pwd"
|
||||
if i%2 == 1 {
|
||||
command = "ls " + string(rune('a'+i))
|
||||
}
|
||||
b.WriteString("```json action\n")
|
||||
b.WriteString(`{"tool":"Bash","parameters":{"command":"` + command + `"}}`)
|
||||
b.WriteString("\n```\n")
|
||||
}
|
||||
|
||||
calls, clean, err := ParseActionBlocks(b.String(), []ToolDef{{
|
||||
Name: "Bash",
|
||||
InputSchema: map[string]any{
|
||||
"properties": map[string]any{
|
||||
"command": map[string]any{"type": "string"},
|
||||
},
|
||||
"required": []any{"command"},
|
||||
},
|
||||
}}, Config{MaxToolCalls: 3})
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
if clean != "" {
|
||||
t.Fatalf("clean = %q", clean)
|
||||
}
|
||||
if len(calls) != 3 {
|
||||
t.Fatalf("call count = %d, calls = %+v", len(calls), calls)
|
||||
}
|
||||
if calls[0].Arguments["command"] != "pwd" {
|
||||
t.Fatalf("first command = %+v", calls[0].Arguments)
|
||||
}
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user