Release v1.4.3

2026-04-30 18:20:04 +08:00
parent a2f777a1a8
commit a02fd51c19
24 changed files with 1909 additions and 1176 deletions
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -1,424 +1,316 @@
-# lingma-ipc-proxy 架构文档
+# lingma-ipc-proxy Architecture

-本文档描述 lingma-ipc-proxy 的系统架构、工作原理和核心流程。
+This document describes the current architecture of `lingma-ipc-proxy`, including both backend modes:
+
+- `ipc`: bridge to the local Lingma IDE plugin transport
+- `remote`: call Lingma remote HTTP APIs directly with detected credentials

 ---

-## 1. 整体架构
+## 1. System Overview

-```
-┌─────────────────────────────────────────────────────────────────────────┐
-│                              客户端层                                     │
-│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐ │
-│  │ Claude Code  │  │   OpenAI     │  │   Cline      │  │   Continue   │ │
-│  │  (Anthropic) │  │    SDK       │  │  (OpenAI)    │  │  (OpenAI)    │ │
-│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘ │
-└─────────┼─────────────────┼─────────────────┼─────────────────┼─────────┘
-          │                 │                 │                 │
-          └─────────────────┴────────┬────────┴─────────────────┘
-                                     │ HTTP
-                                     ▼
-┌─────────────────────────────────────────────────────────────────────────┐
-│                         lingma-ipc-proxy                                │
-│  ┌─────────────────────────────────────────────────────────────────┐    │
-│  │  internal/httpapi                                                │    │
-│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐ │    │
-│  │  │ /v1/models  │  │/v1/chat/comp│  │    /v1/messages         │ │    │
-│  │  │  (GET)      │  │  (POST)     │  │    (POST)               │ │    │
-│  │  └──────┬──────┘  └──────┬──────┘  └───────────┬─────────────┘ │    │
-│  │         └─────────────────┴──────────┬──────────┘               │    │
-│  │                                      │ normalizeRequest         │    │
-│  │                                      ▼                          │    │
-│  │  ┌─────────────────────────────────────────────────────────┐   │    │
-│  │  │              internal/service                            │   │    │
-│  │  │  ┌──────────┐  ┌──────────┐  ┌────────────────────────┐ │   │    │
-│  │  │  │ Session  │  │  Prompt  │  │    Stream/Event        │ │   │    │
-│  │  │  │ Manager  │  │ Builder  │  │    Handler             │ │   │    │
-│  │  │  └────┬─────┘  └────┬─────┘  └───────────┬────────────┘ │   │    │
-│  │  │       └─────────────┴──────────┬─────────┘              │   │    │
-│  │  │                              │ buildLingmaPrompt       │   │    │
-│  │  │                              ▼                          │   │    │
-│  │  │  ┌─────────────────────────────────────────────────┐   │   │    │
-│  │  │  │          internal/lingmaipc                      │   │   │    │
-│  │  │  │  ┌──────────────┐  ┌──────────────────────────┐ │   │   │    │
-│  │  │  │  │   WebSocket  │  │    Named Pipe (Win)      │ │   │   │    │
-│  │  │  │  │  Transport   │  │    Transport             │ │   │   │    │
-│  │  │  │  └──────┬───────┘  └───────────┬──────────────┘ │   │   │    │
-│  │  │  └─────────┼──────────────────────┼────────────────┘   │   │    │
-│  │  └────────────┼──────────────────────┼────────────────────┘   │    │
-│  │               │                      │                         │    │
-│  │  ┌────────────┼──────────────────────┼────────────────────┐   │    │
-│  │  │            ▼                      ▼                    │   │    │
-│  │  │  ┌─────────────────────────────────────────────────┐  │   │    │
-│  │  │  │      internal/toolemulation                      │  │   │    │
-│  │  │  │  ┌──────────────┐  ┌──────────────────────────┐ │  │   │    │
-│  │  │  │  │InjectTooling │  │   ParseActionBlocks      │ │  │   │    │
-│  │  │  │  │  (Prompt)    │  │   (Response)             │ │  │   │    │
-│  │  │  │  └──────────────┘  └──────────────────────────┘ │  │   │    │
-│  │  │  └─────────────────────────────────────────────────┘  │   │    │
-│  │  └───────────────────────────────────────────────────────┘   │    │
-│  └───────────────────────────────────────────────────────────────┘    │
-└─────────────────────────────────────────────────────────────────────────┘
-                                     │
-                                     │ WebSocket / Named Pipe
-                                     ▼
-┌─────────────────────────────────────────────────────────────────────────┐
-│                         Lingma 后端进程                                  │
-│              (VS Code 插件的本地 IPC 服务)                                │
-│                   ws://127.0.0.1:8899/ws                                │
-└─────────────────────────────────────────────────────────────────────────┘
-                                     │
-                                     │ HTTP API
-                                     ▼
-┌─────────────────────────────────────────────────────────────────────────┐
-│                         云端模型服务                                     │
-│              (Kimi-K2.6 / Qwen3-Max / MiniMax-M2.7 等)                  │
-└─────────────────────────────────────────────────────────────────────────┘
+```mermaid
+flowchart LR
+    A["Clients<br/>Claude Code / Hermes / Cline / Continue / OpenAI SDK / Anthropic SDK"]
+    B["internal/httpapi<br/>OpenAI + Anthropic compatible routes"]
+    C["internal/service<br/>request normalization / session policy / streaming / fallback"]
+    D["internal/toolemulation<br/>tool prompt injection + action block parsing"]
+    E["internal/lingmaipc<br/>WebSocket / Named Pipe"]
+    F["internal/remote<br/>credential detection / model list / chat / SSE"]
+    G["Lingma plugin local process"]
+    H["Lingma remote API"]
+    I["Desktop app<br/>Wails GUI / logs / token stats / persisted state"]
+
+    A --> B
+    I --> B
+    B --> C
+    C --> D
+    C --> E
+    C --> F
+    E --> G
+    F --> H
 ```

 ---

-## 2. 模块职责
+## 2. Runtime Modes

-### 2.1 internal/httpapi
+### 2.1 IPC mode

-HTTP API 适配层，负责将外部请求转换为内部 `service.ChatRequest`。
+`backend=ipc`

-| 端点 | 协议 | 功能 |
-|------|------|------|
-| `GET /v1/models` | OpenAI | 返回可用模型列表 |
-| `POST /v1/chat/completions` | OpenAI | 聊天补全（流式/非流式） |
-| `POST /v1/messages` | Anthropic | 消息接口（流式/非流式） |
+- Reads local plugin transport information
+- Connects through:
+  - WebSocket on macOS / Linux
+  - Named Pipe on Windows
+- Reuses Lingma plugin session semantics
+- Session/environment options in the desktop UI apply only here

-**核心函数：**
- `handleOpenAIChatCompletions()` - 处理 OpenAI 格式请求
- `handleAnthropicMessages()` - 处理 Anthropic 格式请求
- `normalizeOpenAIRequest()` / `normalizeAnthropicRequest()` - 归一化请求
+### 2.2 Remote API mode

-**关键设计：**
- 支持 CORS 预检请求 (`OPTIONS`)
- 单请求并发控制 (`tryAcquire()` / `release()`)
- 流式响应通过 `http.Flusher` 实现 SSE
+`backend=remote`

-### 2.2 internal/service
-
-业务逻辑层，负责会话管理和 Prompt 构建。
-
-**核心结构：**
-```go
-type Service struct {
-    cfg              Config
-    client           *lingmaipc.Client
-    stickySessionID  string
-    stickyModelID    string
-}
-```
-
-**核心函数：**
- `Generate()` - 非流式生成
- `GenerateStream()` - 流式生成（返回 `events` + `done` channel）
- `buildLingmaPrompt()` - 构建 Lingma 原生 Prompt
- `runPromptLocked()` - 发送 `session/prompt` RPC 并监听 `session/update` 通知
-
-**会话模式：**
-| 模式 | 行为 |
-|------|------|
-| `reuse` | 复用 sticky session，多轮对话保持上下文 |
-| `fresh` | 每个请求新建临时 session，完成后删除 |
-| `auto` | 单轮请求复用；带 system/history 的请求用 fresh |
-
-### 2.3 internal/lingmaipc
-
-IPC 通信层，负责与 Lingma 后端进程建立连接。
-
-**传输方式：**
-| 平台 | 默认传输 | 说明 |
-|------|----------|------|
-| Windows | Named Pipe | `\\.\pipe\lingma-*` |
-| macOS/Linux | WebSocket | `ws://127.0.0.1:{port}/ws` |
-
-**连接发现：**
- 读取 VS Code 插件缓存：`~/.config/Lingma/SharedClientCache/.info.json`
- 获取 WebSocket 端口号
- 自动重连机制
-
-**RPC 协议：**
- `session/new` - 创建会话
- `session/prompt` - 发送用户消息
- `session/update` - 接收流式响应通知
- `session/set_model` - 切换模型
- `chat/deleteSessionById` - 删除会话
-
-### 2.4 internal/toolemulation
-
-Tool 调用模拟层，将标准 `tools` 协议转换为 Prompt 层契约。
-
-**核心流程：**
-```
-Client tools ──→ ExtractAnthropicTools() ──→ []Tool
-                    │
-                    ▼
-              InjectTooling() ──→ System Prompt + Tool 说明
-                    │
-                    ▼
-              模型输出 action block
-                    │
-                    ▼
-              ParseActionBlocks() ──→ []ToolCall
-                    │
-                    ▼
-              编码为 Anthropic tool_use / OpenAI tool_calls
-```
-
-**Prompt 契约格式：**
-```
-```json action
-{"tool":"NAME","parameters":{"key":"value"}}
-```
-```
-
-**支持格式：**
- `{"tool":"X","parameters":{}}` ✅ 标准格式
- `{"tool":"X","arguments":{}}` ✅ 兼容格式
- `{"tool":"X","input":{}}` ✅ 兼容格式
- `{"tool":"X","arg1":"val"}` ✅ 顶层参数（部分模型）
+- Reads Lingma remote base URL
+- Loads credentials from:
+  - explicit `remote_auth_file`
+  - or detected Lingma cache under `~/.lingma`
+- Calls remote model list and chat endpoints directly
+- Supports timeout / 429 / 5xx fallback across available remote models
+- Does not use local plugin session environment knobs

 ---

-## 3. 核心流程
+## 3. Module Responsibilities

-### 3.1 普通聊天请求流程
+### 3.1 `cmd/lingma-ipc-proxy`
+
+Entry point and config loading.
+
+Responsibilities:
+
+- parse CLI flags
+- merge config file + environment + flags
+- choose backend mode
+- build `service.Config`
+- start `internal/httpapi.Server`
+
+Important config fields:
+
+- `backend`
+- `transport`
+- `websocket_url`
+- `pipe`
+- `remote_base_url`
+- `remote_auth_file`
+- `remote_version`
+- `remote_fallback_enabled`
+- `remote_fallback_models`
+
+### 3.2 `internal/httpapi`
+
+Compatibility layer for OpenAI and Anthropic style APIs.
+
+Primary routes:
+
+- `GET /v1/models`
+- `POST /v1/chat/completions`
+- `POST /v1/messages`
+- `GET /health`
+- `GET /props`
+
+Responsibilities:
+
+- normalize OpenAI / Anthropic requests into `service.ChatRequest`
+- convert service results back to OpenAI / Anthropic payloads
+- stream SSE responses
+- sanitize and record request / response payloads for debug UI
+
+### 3.3 `internal/service`
+
+Core orchestration layer.
+
+Responsibilities:
+
+- choose active backend
+- warm up backend connection / credentials
+- list models
+- generate non-streaming responses
+- generate streaming responses
+- apply session reuse policy in IPC mode
+- inject / parse tool emulation
+- normalize image inputs
+- apply remote fallback order
+
+Important behavior split:
+
+- IPC path uses `internal/lingmaipc`
+- Remote path uses `internal/remote`
+
+### 3.4 `internal/lingmaipc`
+
+Local transport client for Lingma plugin IPC.
+
+Responsibilities:
+
+- detect WebSocket / pipe endpoint
+- dial and reconnect
+- send RPC messages such as:
+  - `session/new`
+  - `session/prompt`
+  - `session/set_model`
+  - `chat/deleteSessionById`
+- consume `session/update` notifications
+
+### 3.5 `internal/remote`
+
+Remote HTTP client for Lingma cloud APIs.
+
+Responsibilities:
+
+- resolve base URL
+- load and validate credentials
+- derive machine / user identity for remote auth
+- list remote models
+- call remote chat endpoint
+- handle remote SSE streaming
+
+### 3.6 `internal/toolemulation`
+
+Prompt-based tool bridge for models that do not expose native tool calling in Lingma transport.
+
+Responsibilities:
+
+- extract tool definitions from OpenAI / Anthropic requests
+- append tool contract to prompt
+- parse JSON action blocks from model output
+- project tool calls back to:
+  - Anthropic `tool_use`
+  - OpenAI `tool_calls`
+
+---
+
+## 4. Request Flow
+
+### 4.1 Shared ingress flow

 ```mermaid
 sequenceDiagram
-    participant C as Client
-    participant H as HTTP API
-    participant S as Service
-    participant L as Lingma IPC
-    participant B as Lingma Backend
+    participant Client
+    participant HTTP as httpapi
+    participant Service as service

-    C->>H: POST /v1/messages
-    H->>H: normalizeAnthropicRequest()
-    H->>S: GenerateStream(req)
-    S->>S: ensureConnected()
-    S->>S: resolveSession()
-    S->>S: buildLingmaPrompt()
-    S->>L: Send("session/prompt", params)
-    L->>B: WebSocket RPC
-    B->>L: session/update (agent_message_chunk)
-    loop 流式响应
-        L->>S: notification (chunk)
-        S->>H: events <- StreamEvent{Delta}
-        H->>C: SSE: content_block_delta
-    end
-    B->>L: session/update (chat_finish)
-    L->>S: notification (finish)
-    S->>H: done <- StreamResult
-    H->>C: SSE: message_stop
+    Client->>HTTP: OpenAI/Anthropic request
+    HTTP->>HTTP: normalize request
+    HTTP->>Service: Generate / GenerateStream
 ```

-### 3.2 Tool 调用流程
+### 4.2 IPC backend flow

 ```mermaid
 sequenceDiagram
-    participant C as Client
-    participant H as HTTP API
-    participant T as ToolEmulation
-    participant S as Service
-    participant L as Lingma IPC
+    participant Service
+    participant Tool as toolemulation
+    participant IPC as lingmaipc
+    participant Plugin as Lingma plugin

-    C->>H: POST /v1/messages (with tools)
-    H->>T: ExtractAnthropicTools()
-    H->>S: GenerateStream(req)
-    S->>T: InjectTooling(system, tools)
-    S->>L: session/prompt (with tool prompt)
-    L->>S: response (with action blocks)
-    S->>T: ParseActionBlocks(text)
-    T->>S: []ToolCall
-    S->>H: ChatResult{Text, ToolCalls}
-    H->>C: SSE: tool_use blocks
-
-    C->>H: POST /v1/messages (tool_result)
-    H->>T: ActionOutputPrompt(toolUseID, content)
-    H->>S: GenerateStream(req)
-    S->>L: session/prompt (with tool result)
-    L->>S: response
-    S->>H: ChatResult
-    H->>C: SSE: final response
+    Service->>Tool: inject tool contract if needed
+    Service->>IPC: ensure connected
+    Service->>IPC: create/reuse session
+    Service->>IPC: session/prompt
+    IPC->>Plugin: RPC message
+    Plugin-->>IPC: session/update chunks
+    IPC-->>Service: stream events
+    Service-->>Service: parse tool blocks / image references / stop reason
 ```

-### 3.3 图片传输流程
+### 4.3 Remote backend flow

 ```mermaid
 sequenceDiagram
-    participant C as Client
-    participant H as HTTP API
-    participant S as Service
-    participant L as Lingma IPC
+    participant Service
+    participant Remote as remote client
+    participant API as Lingma remote API

-    C->>H: POST /v1/messages (with image)
-    H->>H: extractAnthropicImages()
-    H->>S: ChatRequest{Images: [...]}
-    S->>S: runPromptLocked()
-    Note over S: 1. 保存 base64 到 /tmp/lingma-img-*.ext
-    Note over S: 2. 构建 URI: lingma:///agent/file?path=...
-    S->>L: session/prompt
-    Note over L: prompt: [{type:"text"}, {type:"image", mimeType, uri, data}]
-    L->>S: response (model sees image)
-    S->>H: ChatResult
-    H->>C: SSE response
-```
-
-### 3.4 流式输出 SSE 事件序列
-
-**Anthropic 格式（流式）：**
-```
-event: message_start
-data: {"type":"message_start","message":{...}}
-
-event: content_block_start
-data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}
-
-event: content_block_delta
-data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"你"}}
-
-event: content_block_delta
-data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"好"}}
-
-... (更多 delta)
-
-event: content_block_stop
-data: {"type":"content_block_stop","index":0}
-
-[如有 tool_calls]
-event: content_block_start
-data: {"type":"content_block_start","index":1,"content_block":{"type":"tool_use","id":"...","name":"Bash","input":{"command":"ls /"}}}
-
-event: content_block_stop
-data: {"type":"content_block_stop","index":1}
-
-event: message_delta
-data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":5}}
-
-event: message_stop
-data: {"type":"message_stop"}
+    Service->>Remote: load credentials / ensure client
+    Service->>Remote: list models if needed
+    Service->>Remote: chat request
+    Remote->>API: HTTPS request
+    API-->>Remote: JSON or SSE response
+    Remote-->>Service: normalized result
+    Service-->>Service: fallback to next model when allowed
 ```

 ---

-## 4. 关键技术决策
+## 5. Remote Fallback Strategy

-### 4.1 为什么使用 Tool Emulation 而非原生 Tool Calling？
+Remote fallback is used only when all conditions are true:

-Lingma 后端模型（Kimi、Qwen 等）不原生支持 OpenAI/Anthropic 的 `tools` 协议。因此代理层需要将工具定义注入到 Prompt 中，通过结构化文本输出模拟工具调用。
+- `backend=remote`
+- `remote_fallback_enabled=true`
+- request has not emitted stream output yet
+- upstream error matches timeout / 429 / 5xx class

-**优点：**
- 不依赖上游模型能力
- 兼容任何纯聊天模型
- 可精确控制 Prompt 格式
+Current default order:

-**缺点：**
- 模型需要学习特定格式
- 解析可能有容错问题
- 增加了 Prompt 长度
+1. `kmodel`
+2. `mmodel`
+3. `dashscope_qwen3_coder`
+4. `dashscope_qmodel`
+5. `dashscope_qwen_max_latest`
+6. `dashscope_qwen_plus_20250428_thinking`

-### 4.2 为什么使用 WebSocket/Named Pipe 而非 HTTP？
-
-Lingma 插件使用本地 IPC 与后端通信，优势：
- 低延迟（本地通信）
- 双向实时通知（session/update）
- 认证信息由插件管理，代理无需处理
-
-### 4.3 图片传输的双保险策略
-
-```
-Prompt 数组 (Lingma 原生格式):
-[
-  {"type":"text","text":"..."},
-  {"type":"image","mimeType":"image/png","uri":"lingma:///agent/file?path=...","data":"base64..."}
-]
-```
-
- `uri`: Lingma 后端必须验证的本地文件路径
- `data`: base64 编码的图像数据（备用）
- `mimeType`: 图像类型标识
-
-### 4.4 单请求并发控制
-
-Lingma IPC 一次只能处理一个请求，因此代理使用 `tryAcquire()` 机制：
-
-```go
-if !s.tryAcquire() {
-    writeAnthropicError(w, 429, "rate_limit_error",
-        "Lingma IPC proxy handles one request at a time.")
-    return
-}
-defer s.release()
-```
+Before using that order, the service filters candidates against the actual `/v1/models` result from the remote backend so unavailable models are skipped.

 ---

-## 5. 配置说明
+## 6. Desktop App Architecture

-### 5.1 配置文件结构
+The Wails desktop app is a management UI around the local proxy process.

-```json
-{
-  "host": "127.0.0.1",
-  "port": 8095,
-  "transport": "websocket",
-  "mode": "agent",
-  "shell_type": "zsh",
-  "session_mode": "auto",
-  "timeout": 120,
-  "cwd": "/Users/tiancheng"
-}
-```
+Responsibilities:

-### 5.2 配置项说明
+- start / stop / restart proxy
+- show current backend and resolved endpoints
+- persist:
+  - request history
+  - logs
+  - token statistics
+- show detected IPC and remote credentials metadata
+- edit config and restart proxy on save

-| 配置项 | 类型 | 默认值 | 说明 |
-|--------|------|--------|------|
-| `host` | string | `127.0.0.1` | HTTP 监听地址 |
-| `port` | int | `8095` | HTTP 监听端口 |
-| `transport` | string | `auto` | IPC 传输方式：`auto`/`pipe`/`websocket` |
-| `mode` | string | `chat` | 模式：`chat`/`agent` |
-| `shell_type` | string | `powershell` | 终端类型 |
-| `session_mode` | string | `auto` | 会话模式：`reuse`/`fresh`/`auto` |
-| `timeout` | int | `120` | 请求超时（秒） |
-| `cwd` | string | `""` | 工作目录（传给 Lingma 后端） |
+Persisted local state:
+
+- config: `~/.config/lingma-ipc-proxy/config.json`
+- UI/runtime state: `~/.config/lingma-ipc-proxy/app-state.json`
+
+Production packaging rules:
+
+- packaged app should not auto-open inspector
+- local development can opt in with `LINGMA_DESKTOP_DEBUG=1`

 ---

-## 6. 扩展点
+## 7. Key Design Decisions

-### 6.1 添加新模型
+### 7.1 Why keep both IPC and remote modes?

-在 `service.go` 的模型映射中添加：
+Because the two modes solve different problems:

-```go
-func (s *Service) resolveInternalModelID(model string) string {
-    switch strings.ToLower(strings.TrimSpace(model)) {
-    case "kimi-k2.6":
-        return "kimi2.6"
-    case "qwen3-max":
-        return "qwen3max"
-    // 添加新模型映射
-    default:
-        return ""
-    }
-}
-```
+- IPC mode preserves plugin session semantics and local tool environment
+- Remote mode avoids plugin runtime coupling and is usually better for third-party agent clients

-### 6.2 添加新 Tool 格式支持
+### 7.2 Why keep tool emulation even with remote mode?

-在 `toolemulation.go` 的 `parseToolCallJSON()` 中扩展参数解析逻辑。
+Because Lingma-exposed models are not guaranteed to speak OpenAI/Anthropic native tool protocol consistently across all routes. The proxy must keep a stable external contract even when the upstream model capability is uneven.

-### 6.3 添加新 API 端点
+### 7.3 Why persist requests and token stats in the desktop app?

-在 `httpapi/server.go` 的 `NewServer()` 中注册新路由。
+Because the GUI is used as an operational console, not a transient preview. Users need model usage, logs, and recent traffic to survive app restarts.

 ---

-*文档版本: 2025-04-25*
-*对应代码版本: 当前 master*
+## 8. Known Boundaries
+
+- IPC mode still has stronger environment coupling with the local Lingma plugin
+- remote credential detection depends on local Lingma cache / auth file layout
+- image payloads are sanitized in persisted request logs to avoid oversized local state
+- request history may contain mixed models in remote mode when fallback is triggered or when different clients specify different models
+
+---
+
+## 9. Files to Read First
+
+If you are extending the system, start here:
+
+- `cmd/lingma-ipc-proxy/main.go`
+- `internal/httpapi/server.go`
+- `internal/service/service.go`
+- `internal/lingmaipc/*`
+- `internal/remote/*`
+- `desktop/app.go`
+- `desktop/main.go`
+
+---
+
+Document version: 2026-04-30
--- a/docs/architecture.zh-CN.md
+++ b/docs/architecture.zh-CN.md
@@ -1,424 +1,315 @@
 # lingma-ipc-proxy 架构文档

-本文档描述 lingma-ipc-proxy 的系统架构、工作原理和核心流程。
+本文档描述 `lingma-ipc-proxy` 的当前架构，覆盖两种后端模式：
+
+- `ipc`：桥接本地 Lingma IDE 插件传输层
+- `remote`：直接调用 Lingma 远端 HTTP API

 ---

-## 1. 整体架构
+## 1. 系统总览

-```
-┌─────────────────────────────────────────────────────────────────────────┐
-│                              客户端层                                     │
-│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐ │
-│  │ Claude Code  │  │   OpenAI     │  │   Cline      │  │   Continue   │ │
-│  │  (Anthropic) │  │    SDK       │  │  (OpenAI)    │  │  (OpenAI)    │ │
-│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘ │
-└─────────┼─────────────────┼─────────────────┼─────────────────┼─────────┘
-          │                 │                 │                 │
-          └─────────────────┴────────┬────────┴─────────────────┘
-                                     │ HTTP
-                                     ▼
-┌─────────────────────────────────────────────────────────────────────────┐
-│                         lingma-ipc-proxy                                │
-│  ┌─────────────────────────────────────────────────────────────────┐    │
-│  │  internal/httpapi                                                │    │
-│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐ │    │
-│  │  │ /v1/models  │  │/v1/chat/comp│  │    /v1/messages         │ │    │
-│  │  │  (GET)      │  │  (POST)     │  │    (POST)               │ │    │
-│  │  └──────┬──────┘  └──────┬──────┘  └───────────┬─────────────┘ │    │
-│  │         └─────────────────┴──────────┬──────────┘               │    │
-│  │                                      │ normalizeRequest         │    │
-│  │                                      ▼                          │    │
-│  │  ┌─────────────────────────────────────────────────────────┐   │    │
-│  │  │              internal/service                            │   │    │
-│  │  │  ┌──────────┐  ┌──────────┐  ┌────────────────────────┐ │   │    │
-│  │  │  │ Session  │  │  Prompt  │  │    Stream/Event        │ │   │    │
-│  │  │  │ Manager  │  │ Builder  │  │    Handler             │ │   │    │
-│  │  │  └────┬─────┘  └────┬─────┘  └───────────┬────────────┘ │   │    │
-│  │  │       └─────────────┴──────────┬─────────┘              │   │    │
-│  │  │                              │ buildLingmaPrompt       │   │    │
-│  │  │                              ▼                          │   │    │
-│  │  │  ┌─────────────────────────────────────────────────┐   │   │    │
-│  │  │  │          internal/lingmaipc                      │   │   │    │
-│  │  │  │  ┌──────────────┐  ┌──────────────────────────┐ │   │   │    │
-│  │  │  │  │   WebSocket  │  │    Named Pipe (Win)      │ │   │   │    │
-│  │  │  │  │  Transport   │  │    Transport             │ │   │   │    │
-│  │  │  │  └──────┬───────┘  └───────────┬──────────────┘ │   │   │    │
-│  │  │  └─────────┼──────────────────────┼────────────────┘   │   │    │
-│  │  └────────────┼──────────────────────┼────────────────────┘   │    │
-│  │               │                      │                         │    │
-│  │  ┌────────────┼──────────────────────┼────────────────────┐   │    │
-│  │  │            ▼                      ▼                    │   │    │
-│  │  │  ┌─────────────────────────────────────────────────┐  │   │    │
-│  │  │  │      internal/toolemulation                      │  │   │    │
-│  │  │  │  ┌──────────────┐  ┌──────────────────────────┐ │  │   │    │
-│  │  │  │  │InjectTooling │  │   ParseActionBlocks      │ │  │   │    │
-│  │  │  │  │  (Prompt)    │  │   (Response)             │ │  │   │    │
-│  │  │  │  └──────────────┘  └──────────────────────────┘ │  │   │    │
-│  │  │  └─────────────────────────────────────────────────┘  │   │    │
-│  │  └───────────────────────────────────────────────────────┘   │    │
-│  └───────────────────────────────────────────────────────────────┘    │
-└─────────────────────────────────────────────────────────────────────────┘
-                                     │
-                                     │ WebSocket / Named Pipe
-                                     ▼
-┌─────────────────────────────────────────────────────────────────────────┐
-│                         Lingma 后端进程                                  │
-│              (VS Code 插件的本地 IPC 服务)                                │
-│                   ws://127.0.0.1:8899/ws                                │
-└─────────────────────────────────────────────────────────────────────────┘
-                                     │
-                                     │ HTTP API
-                                     ▼
-┌─────────────────────────────────────────────────────────────────────────┐
-│                         云端模型服务                                     │
-│              (Kimi-K2.6 / Qwen3-Max / MiniMax-M2.7 等)                  │
-└─────────────────────────────────────────────────────────────────────────┘
+```mermaid
+flowchart LR
+    A["客户端<br/>Claude Code / Hermes / Cline / Continue / OpenAI SDK / Anthropic SDK"]
+    B["internal/httpapi<br/>OpenAI + Anthropic 兼容路由"]
+    C["internal/service<br/>请求归一化 / 会话策略 / 流式输出 / 兜底"]
+    D["internal/toolemulation<br/>工具提示词注入 + action block 解析"]
+    E["internal/lingmaipc<br/>WebSocket / 命名管道"]
+    F["internal/remote<br/>登录态探测 / 模型列表 / Chat / SSE"]
+    G["Lingma 插件本地进程"]
+    H["Lingma 远端 API"]
+    I["桌面端 GUI<br/>Wails / 日志 / Token 统计 / 持久化状态"]
+
+    A --> B
+    I --> B
+    B --> C
+    C --> D
+    C --> E
+    C --> F
+    E --> G
+    F --> H
 ```

 ---

-## 2. 模块职责
+## 2. 运行模式

-### 2.1 internal/httpapi
+### 2.1 IPC 模式

-HTTP API 适配层，负责将外部请求转换为内部 `service.ChatRequest`。
+`backend=ipc`

-| 端点 | 协议 | 功能 |
-|------|------|------|
-| `GET /v1/models` | OpenAI | 返回可用模型列表 |
-| `POST /v1/chat/completions` | OpenAI | 聊天补全（流式/非流式） |
-| `POST /v1/messages` | Anthropic | 消息接口（流式/非流式） |
+- 读取本地 Lingma 插件传输信息
+- 通过以下方式连接：
+  - macOS / Linux：WebSocket
+  - Windows：Named Pipe
+- 复用 Lingma 插件自身的 session 语义
+- 桌面端里“会话与环境”相关配置只在这里生效

-**核心函数：**
- `handleOpenAIChatCompletions()` - 处理 OpenAI 格式请求
- `handleAnthropicMessages()` - 处理 Anthropic 格式请求
- `normalizeOpenAIRequest()` / `normalizeAnthropicRequest()` - 归一化请求
+### 2.2 Remote API 模式

-**关键设计：**
- 支持 CORS 预检请求 (`OPTIONS`)
- 单请求并发控制 (`tryAcquire()` / `release()`)
- 流式响应通过 `http.Flusher` 实现 SSE
+`backend=remote`

-### 2.2 internal/service
-
-业务逻辑层，负责会话管理和 Prompt 构建。
-
-**核心结构：**
-```go
-type Service struct {
-    cfg              Config
-    client           *lingmaipc.Client
-    stickySessionID  string
-    stickyModelID    string
-}
-```
-
-**核心函数：**
- `Generate()` - 非流式生成
- `GenerateStream()` - 流式生成（返回 `events` + `done` channel）
- `buildLingmaPrompt()` - 构建 Lingma 原生 Prompt
- `runPromptLocked()` - 发送 `session/prompt` RPC 并监听 `session/update` 通知
-
-**会话模式：**
-| 模式 | 行为 |
-|------|------|
-| `reuse` | 复用 sticky session，多轮对话保持上下文 |
-| `fresh` | 每个请求新建临时 session，完成后删除 |
-| `auto` | 单轮请求复用；带 system/history 的请求用 fresh |
-
-### 2.3 internal/lingmaipc
-
-IPC 通信层，负责与 Lingma 后端进程建立连接。
-
-**传输方式：**
-| 平台 | 默认传输 | 说明 |
-|------|----------|------|
-| Windows | Named Pipe | `\\.\pipe\lingma-*` |
-| macOS/Linux | WebSocket | `ws://127.0.0.1:{port}/ws` |
-
-**连接发现：**
- 读取 VS Code 插件缓存：`~/.config/Lingma/SharedClientCache/.info.json`
- 获取 WebSocket 端口号
- 自动重连机制
-
-**RPC 协议：**
- `session/new` - 创建会话
- `session/prompt` - 发送用户消息
- `session/update` - 接收流式响应通知
- `session/set_model` - 切换模型
- `chat/deleteSessionById` - 删除会话
-
-### 2.4 internal/toolemulation
-
-Tool 调用模拟层，将标准 `tools` 协议转换为 Prompt 层契约。
-
-**核心流程：**
-```
-Client tools ──→ ExtractAnthropicTools() ──→ []Tool
-                    │
-                    ▼
-              InjectTooling() ──→ System Prompt + Tool 说明
-                    │
-                    ▼
-              模型输出 action block
-                    │
-                    ▼
-              ParseActionBlocks() ──→ []ToolCall
-                    │
-                    ▼
-              编码为 Anthropic tool_use / OpenAI tool_calls
-```
-
-**Prompt 契约格式：**
-```
-```json action
-{"tool":"NAME","parameters":{"key":"value"}}
-```
-```
-
-**支持格式：**
- `{"tool":"X","parameters":{}}` ✅ 标准格式
- `{"tool":"X","arguments":{}}` ✅ 兼容格式
- `{"tool":"X","input":{}}` ✅ 兼容格式
- `{"tool":"X","arg1":"val"}` ✅ 顶层参数（部分模型）
+- 解析远端域名
+- 加载认证信息：
+  - 显式指定的 `remote_auth_file`
+  - 或自动探测 `~/.lingma` 下的缓存
+- 直接请求远端模型列表和聊天接口
+- 支持远端超时 / 429 / 5xx 的模型兜底切换
+- 不依赖本地插件会话环境参数

 ---

-## 3. 核心流程
+## 3. 模块职责

-### 3.1 普通聊天请求流程
+### 3.1 `cmd/lingma-ipc-proxy`
+
+入口与配置装配层。
+
+职责：
+
+- 解析命令行参数
+- 合并配置文件 / 环境变量 / CLI flags
+- 选择后端模式
+- 构建 `service.Config`
+- 启动 `internal/httpapi.Server`
+
+关键配置字段：
+
+- `backend`
+- `transport`
+- `websocket_url`
+- `pipe`
+- `remote_base_url`
+- `remote_auth_file`
+- `remote_version`
+- `remote_fallback_enabled`
+- `remote_fallback_models`
+
+### 3.2 `internal/httpapi`
+
+OpenAI / Anthropic 兼容层。
+
+主要路由：
+
+- `GET /v1/models`
+- `POST /v1/chat/completions`
+- `POST /v1/messages`
+- `GET /health`
+- `GET /props`
+
+职责：
+
+- 把 OpenAI / Anthropic 请求归一化为 `service.ChatRequest`
+- 把 service 结果重新编码成 OpenAI / Anthropic 响应
+- 输出 SSE 流
+- 记录调试用请求 / 响应摘要
+
+### 3.3 `internal/service`
+
+核心编排层。
+
+职责：
+
+- 选择当前 backend
+- backend 预热
+- 拉取模型列表
+- 非流式生成
+- 流式生成
+- IPC 模式下的 session 复用策略
+- 工具模拟注入与解析
+- 图片输入归一化
+- 远端 fallback 顺序控制
+
+分支逻辑：
+
+- IPC 路径走 `internal/lingmaipc`
+- Remote 路径走 `internal/remote`
+
+### 3.4 `internal/lingmaipc`
+
+本地 Lingma 插件 IPC 客户端。
+
+职责：
+
+- 自动探测 WebSocket / pipe 端点
+- 建立连接与重连
+- 发送 RPC：
+  - `session/new`
+  - `session/prompt`
+  - `session/set_model`
+  - `chat/deleteSessionById`
+- 消费 `session/update` 通知
+
+### 3.5 `internal/remote`
+
+Lingma 远端 HTTP 客户端。
+
+职责：
+
+- 解析远端 base URL
+- 加载并校验登录态
+- 生成远端请求所需身份信息
+- 获取远端模型列表
+- 调用远端聊天接口
+- 处理远端 SSE 流式响应
+
+### 3.6 `internal/toolemulation`
+
+工具调用模拟层。
+
+职责：
+
+- 从 OpenAI / Anthropic 请求中提取工具定义
+- 将工具契约注入 prompt
+- 从模型文本里解析 JSON action block
+- 回投为：
+  - Anthropic `tool_use`
+  - OpenAI `tool_calls`
+
+---
+
+## 4. 请求主流程
+
+### 4.1 通用入口

 ```mermaid
 sequenceDiagram
-    participant C as Client
-    participant H as HTTP API
-    participant S as Service
-    participant L as Lingma IPC
-    participant B as Lingma Backend
+    participant Client as Client
+    participant HTTP as httpapi
+    participant Service as service

-    C->>H: POST /v1/messages
-    H->>H: normalizeAnthropicRequest()
-    H->>S: GenerateStream(req)
-    S->>S: ensureConnected()
-    S->>S: resolveSession()
-    S->>S: buildLingmaPrompt()
-    S->>L: Send("session/prompt", params)
-    L->>B: WebSocket RPC
-    B->>L: session/update (agent_message_chunk)
-    loop 流式响应
-        L->>S: notification (chunk)
-        S->>H: events <- StreamEvent{Delta}
-        H->>C: SSE: content_block_delta
-    end
-    B->>L: session/update (chat_finish)
-    L->>S: notification (finish)
-    S->>H: done <- StreamResult
-    H->>C: SSE: message_stop
+    Client->>HTTP: OpenAI / Anthropic 请求
+    HTTP->>HTTP: 归一化请求
+    HTTP->>Service: Generate / GenerateStream
 ```

-### 3.2 Tool 调用流程
+### 4.2 IPC 后端流程

 ```mermaid
 sequenceDiagram
-    participant C as Client
-    participant H as HTTP API
-    participant T as ToolEmulation
-    participant S as Service
-    participant L as Lingma IPC
+    participant Service as service
+    participant Tool as toolemulation
+    participant IPC as lingmaipc
+    participant Plugin as Lingma 插件

-    C->>H: POST /v1/messages (with tools)
-    H->>T: ExtractAnthropicTools()
-    H->>S: GenerateStream(req)
-    S->>T: InjectTooling(system, tools)
-    S->>L: session/prompt (with tool prompt)
-    L->>S: response (with action blocks)
-    S->>T: ParseActionBlocks(text)
-    T->>S: []ToolCall
-    S->>H: ChatResult{Text, ToolCalls}
-    H->>C: SSE: tool_use blocks
-
-    C->>H: POST /v1/messages (tool_result)
-    H->>T: ActionOutputPrompt(toolUseID, content)
-    H->>S: GenerateStream(req)
-    S->>L: session/prompt (with tool result)
-    L->>S: response
-    S->>H: ChatResult
-    H->>C: SSE: final response
+    Service->>Tool: 按需注入工具契约
+    Service->>IPC: ensure connected
+    Service->>IPC: 创建/复用 session
+    Service->>IPC: session/prompt
+    IPC->>Plugin: RPC
+    Plugin-->>IPC: session/update chunk
+    IPC-->>Service: 流式事件
+    Service-->>Service: 解析工具 block / 图片 / stop reason
 ```

-### 3.3 图片传输流程
+### 4.3 Remote 后端流程

 ```mermaid
 sequenceDiagram
-    participant C as Client
-    participant H as HTTP API
-    participant S as Service
-    participant L as Lingma IPC
+    participant Service as service
+    participant Remote as remote client
+    participant API as Lingma 远端 API

-    C->>H: POST /v1/messages (with image)
-    H->>H: extractAnthropicImages()
-    H->>S: ChatRequest{Images: [...]}
-    S->>S: runPromptLocked()
-    Note over S: 1. 保存 base64 到 /tmp/lingma-img-*.ext
-    Note over S: 2. 构建 URI: lingma:///agent/file?path=...
-    S->>L: session/prompt
-    Note over L: prompt: [{type:"text"}, {type:"image", mimeType, uri, data}]
-    L->>S: response (model sees image)
-    S->>H: ChatResult
-    H->>C: SSE response
-```
-
-### 3.4 流式输出 SSE 事件序列
-
-**Anthropic 格式（流式）：**
-```
-event: message_start
-data: {"type":"message_start","message":{...}}
-
-event: content_block_start
-data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}
-
-event: content_block_delta
-data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"你"}}
-
-event: content_block_delta
-data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"好"}}
-
-... (更多 delta)
-
-event: content_block_stop
-data: {"type":"content_block_stop","index":0}
-
-[如有 tool_calls]
-event: content_block_start
-data: {"type":"content_block_start","index":1,"content_block":{"type":"tool_use","id":"...","name":"Bash","input":{"command":"ls /"}}}
-
-event: content_block_stop
-data: {"type":"content_block_stop","index":1}
-
-event: message_delta
-data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":5}}
-
-event: message_stop
-data: {"type":"message_stop"}
+    Service->>Remote: 加载登录态 / 初始化 client
+    Service->>Remote: 需要时拉取模型列表
+    Service->>Remote: 发送 chat 请求
+    Remote->>API: HTTPS
+    API-->>Remote: JSON 或 SSE
+    Remote-->>Service: 归一化结果
+    Service-->>Service: 按条件执行 fallback
 ```

 ---

-## 4. 关键技术决策
+## 5. 远端兜底策略

-### 4.1 为什么使用 Tool Emulation 而非原生 Tool Calling？
+仅在以下条件同时满足时启用：

-Lingma 后端模型（Kimi、Qwen 等）不原生支持 OpenAI/Anthropic 的 `tools` 协议。因此代理层需要将工具定义注入到 Prompt 中，通过结构化文本输出模拟工具调用。
+- `backend=remote`
+- `remote_fallback_enabled=true`
+- 还没有向客户端输出任何流式 token
+- 上游错误属于 timeout / 429 / 5xx

-**优点：**
- 不依赖上游模型能力
- 兼容任何纯聊天模型
- 可精确控制 Prompt 格式
+当前默认顺序：

-**缺点：**
- 模型需要学习特定格式
- 解析可能有容错问题
- 增加了 Prompt 长度
+1. `kmodel`
+2. `mmodel`
+3. `dashscope_qwen3_coder`
+4. `dashscope_qmodel`
+5. `dashscope_qwen_max_latest`
+6. `dashscope_qwen_plus_20250428_thinking`

-### 4.2 为什么使用 WebSocket/Named Pipe 而非 HTTP？
-
-Lingma 插件使用本地 IPC 与后端通信，优势：
- 低延迟（本地通信）
- 双向实时通知（session/update）
- 认证信息由插件管理，代理无需处理
-
-### 4.3 图片传输的双保险策略
-
-```
-Prompt 数组 (Lingma 原生格式):
-[
-  {"type":"text","text":"..."},
-  {"type":"image","mimeType":"image/png","uri":"lingma:///agent/file?path=...","data":"base64..."}
-]
-```
-
- `uri`: Lingma 后端必须验证的本地文件路径
- `data`: base64 编码的图像数据（备用）
- `mimeType`: 图像类型标识
-
-### 4.4 单请求并发控制
-
-Lingma IPC 一次只能处理一个请求，因此代理使用 `tryAcquire()` 机制：
-
-```go
-if !s.tryAcquire() {
-    writeAnthropicError(w, 429, "rate_limit_error",
-        "Lingma IPC proxy handles one request at a time.")
-    return
-}
-defer s.release()
-```
+实际执行前，service 会先拿远端 `/v1/models` 的真实结果过滤一遍，只保留当前账号真的可用的模型。

 ---

-## 5. 配置说明
+## 6. 桌面端架构

-### 5.1 配置文件结构
+Wails 桌面端不是简单预览壳，而是本地代理的运维控制台。

-```json
-{
-  "host": "127.0.0.1",
-  "port": 8095,
-  "transport": "websocket",
-  "mode": "agent",
-  "shell_type": "zsh",
-  "session_mode": "auto",
-  "timeout": 120,
-  "cwd": "/Users/tiancheng"
-}
-```
+职责：

-### 5.2 配置项说明
+- 启动 / 停止 / 重启代理
+- 展示当前 backend、监听地址、探测结果
+- 持久化：
+  - 请求历史
+  - 日志
+  - Token 统计
+- 编辑配置并保存后按需重启

-| 配置项 | 类型 | 默认值 | 说明 |
-|--------|------|--------|------|
-| `host` | string | `127.0.0.1` | HTTP 监听地址 |
-| `port` | int | `8095` | HTTP 监听端口 |
-| `transport` | string | `auto` | IPC 传输方式：`auto`/`pipe`/`websocket` |
-| `mode` | string | `chat` | 模式：`chat`/`agent` |
-| `shell_type` | string | `powershell` | 终端类型 |
-| `session_mode` | string | `auto` | 会话模式：`reuse`/`fresh`/`auto` |
-| `timeout` | int | `120` | 请求超时（秒） |
-| `cwd` | string | `""` | 工作目录（传给 Lingma 后端） |
+本地持久化路径：
+
+- 配置：`~/.config/lingma-ipc-proxy/config.json`
+- GUI 运行状态：`~/.config/lingma-ipc-proxy/app-state.json`
+
+打包要求：
+
+- 生产包不自动打开 Inspector / 调试入口
+- 本地开发可通过 `LINGMA_DESKTOP_DEBUG=1` 显式开启

 ---

-## 6. 扩展点
+## 7. 关键设计决策

-### 6.1 添加新模型
+### 7.1 为什么同时保留 IPC 和 Remote？

-在 `service.go` 的模型映射中添加：
+因为两种模式解决的问题不同：

-```go
-func (s *Service) resolveInternalModelID(model string) string {
-    switch strings.ToLower(strings.TrimSpace(model)) {
-    case "kimi-k2.6":
-        return "kimi2.6"
-    case "qwen3-max":
-        return "qwen3max"
-    // 添加新模型映射
-    default:
-        return ""
-    }
-}
-```
+- IPC 模式更贴近插件本地上下文和 session 语义
+- Remote 模式更适合第三方 agent 客户端，减少对插件运行态的依赖

-### 6.2 添加新 Tool 格式支持
+### 7.2 为什么 Remote 也保留 Tool Emulation？

-在 `toolemulation.go` 的 `parseToolCallJSON()` 中扩展参数解析逻辑。
+因为 Lingma 暴露出来的模型能力并不保证始终稳定兼容 OpenAI / Anthropic 原生 tools 协议。代理层必须对外提供稳定契约，不能把上游模型差异直接泄露给客户端。

-### 6.3 添加新 API 端点
+### 7.3 为什么桌面端要持久化请求和 Token？

-在 `httpapi/server.go` 的 `NewServer()` 中注册新路由。
+因为这个 GUI 已经是运维面板，不是一次性调试页。重启后仍然需要保留最近请求、日志和 usage 统计，便于排障和观察模型表现。

 ---

-*文档版本: 2025-04-25*
-*对应代码版本: 当前 master*
+## 8. 当前边界
+
+- IPC 模式仍然受本地 Lingma 插件运行态影响
+- Remote 登录态探测依赖本地 Lingma 缓存结构
+- 图片类请求在本地持久化时会做裁剪/脱敏，避免状态文件过大
+- Remote 模式下如果启用了 fallback，最近一次“聊天模型”可能与客户端最初指定模型不同
+
+---
+
+## 9. 代码入口建议
+
+如果要继续扩展，优先看这些文件：
+
+- `cmd/lingma-ipc-proxy/main.go`
+- `internal/httpapi/server.go`
+- `internal/service/service.go`
+- `internal/lingmaipc/*`
+- `internal/remote/*`
+- `desktop/app.go`
+- `desktop/main.go`
+
+---
+
+文档版本：2026-04-30