diff --git a/CHANGELOG.md b/CHANGELOG.md new file mode 100644 index 0000000..a03e902 --- /dev/null +++ b/CHANGELOG.md @@ -0,0 +1,25 @@ +# Changelog + +## v1.4.2 - 2026-04-30 + +- Default backend changed to remote API mode for new CLI and desktop configurations. +- Default model changed to `kmodel` (`Kimi-K2.6` in Lingma remote model list). +- Removed the proxy-injected fake `Auto` model in remote mode so the model list only shows models returned by Lingma. +- Fixed Dashboard recent requests showing `MiniMax-M2.7` for model discovery and health/debug requests that do not contain a model field. +- Added request record model and payload size fields for the desktop app request table. +- Updated Dashboard transport display to show `Remote API` when remote backend is active. +- Updated Hermes local config to use Lingma Proxy with `kmodel` and remote model IDs. +- Updated README / README.zh-CN for remote-first mode, Kimi recommendation, package selection, protocol support, and debug/log endpoints. + +## v1.4.1 - 2026-04-30 + +- Improved remote enterprise endpoint detection from Lingma logs. +- Added support for showing detected remote base URL and credential source in desktop Settings. +- Added macOS DMG packaging in GitHub Actions. + +## v1.4.0 - 2026-04-30 + +- Added experimental remote API backend alongside the original IPC plugin backend. +- Added remote credential import from local Lingma login cache or explicit credential files. +- Added OpenAI / Anthropic compatible routing over the remote backend. +- Added request and log debug endpoints for troubleshooting. diff --git a/README.md b/README.md index 86221d4..b17759e 100644 --- a/README.md +++ b/README.md @@ -8,12 +8,14 @@ The project is designed for tools such as Claude Code, Cline, Continue, OpenCode The proxy now supports two backend modes: -- **IPC plugin mode (default)**: connects to the local Lingma IDE plugin over WebSocket / Named Pipe. This is the safest daily mode and keeps behavior closest to the IDE plugin. -- **Remote API mode (experimental)**: imports the local Lingma login cache or an explicit credential file and calls Lingma remote APIs directly. This can feel more like an official API and does not depend on an IDE IPC session, but it relies on non-public login and signing details that may change. +- **Remote API mode (default, experimental)**: imports the local Lingma login cache or an explicit credential file and calls Lingma remote APIs directly. This feels more like an official API, does not depend on an IDE IPC session, and is currently the recommended mode for Claude Code / Hermes style agents. +- **IPC plugin mode**: connects to the local Lingma IDE plugin over WebSocket / Named Pipe. This keeps behavior closest to the IDE plugin and is useful as a compatibility fallback. ## Current Version -The current desktop line is `v1.4.1`. +The current desktop line is `v1.4.2`. + +See [CHANGELOG.md](./CHANGELOG.md) for release history. Release builds are produced by GitHub Actions for: @@ -169,17 +171,7 @@ lingma-ipc-proxy --transport pipe --pipe '\\.\pipe\lingma-ipc' ## Backend Modes -### IPC Plugin Mode (Default) - -IPC mode talks to the local Lingma IDE plugin: - -```bash -lingma-ipc-proxy --backend ipc --transport auto --port 8095 -``` - -Use this when VS Code / the Lingma plugin is already running, when you want plugin session behavior, or when you want the model list exposed by the local plugin. - -### Remote API Mode (Experimental) +### Remote API Mode (Default, Experimental) Remote mode calls Lingma's remote API directly: @@ -231,6 +223,16 @@ Notes: - Local validation passed `/health`, `/v1/models`, OpenAI streaming/non-streaming chat, and Claude Code Anthropic + Bash tool use. Claude Code full tool runs are much slower than simple OpenAI requests because the client sends a large context and performs a second tool-result turn. - This mode is inspired by the remote API and credential-signing research in [ZipperCode/lingma2api](https://github.com/ZipperCode/lingma2api), integrated here as a switchable backend under the existing OpenAI / Anthropic / desktop app architecture. +### IPC Plugin Mode + +IPC mode talks to the local Lingma IDE plugin: + +```bash +lingma-ipc-proxy --backend ipc --transport auto --port 8095 +``` + +Use this when VS Code / the Lingma plugin is already running, when you want plugin session behavior, or when you want the model list exposed by the local plugin. + ## Quick Start ### Desktop App @@ -270,7 +272,7 @@ export ANTHROPIC_API_KEY="any" Then select a model in Claude Code: ```text -/model MiniMax-M2.7 +/model kmodel ``` ### Cline @@ -278,7 +280,7 @@ Then select a model in Claude Code: - Provider: `OpenAI Compatible` - Base URL: `http://127.0.0.1:8095/v1` - API Key: `any` -- Model ID: `MiniMax-M2.7` +- Model ID: `kmodel` ### Continue @@ -288,7 +290,7 @@ Then select a model in Claude Code: { "title": "Lingma Proxy", "provider": "openai", - "model": "MiniMax-M2.7", + "model": "kmodel", "apiKey": "any", "apiBase": "http://127.0.0.1:8095/v1" } @@ -316,13 +318,13 @@ The proxy only reports models actually exposed by your Lingma plugin. The table | Model | Best use | Context / capability basis | | --- | --- | --- | -| `MiniMax-M2.7` | Default recommendation for third-party agents | NVIDIA's [MiniMax M2.7 model card](https://developer.nvidia.com/blog/minimax-m2-7-advances-scalable-agentic-workflows-on-nvidia-platforms-for-complex-ai-applications/) describes a language MoE model with 200K input context and agentic use cases; local proxy testing passed read/search/terminal/web/patch/vision smoke tests. | -| `Kimi-K2.6` | Multimodal and long-context agent work | Kimi's [official API docs](https://platform.kimi.ai/docs/guide/kimi-k2-6-quickstart) describe native text/image/video input, a 256K context window, and multi-step tool invocation support. | +| `Kimi-K2.6` (`kmodel` in remote mode) | Default recommendation for remote API mode and third-party agents | Kimi's [official API docs](https://platform.kimi.ai/docs/guide/kimi-k2-6-quickstart) describe native text/image/video input, a 256K context window, and multi-step tool invocation support. Local Claude Code testing showed cleaner native tool execution in remote mode. | +| `MiniMax-M2.7` (`mmodel` in remote mode) | Fast fallback | NVIDIA's [MiniMax M2.7 model card](https://developer.nvidia.com/blog/minimax-m2-7-advances-scalable-agentic-workflows-on-nvidia-platforms-for-complex-ai-applications/) describes a language MoE model with 200K input context and agentic use cases; local proxy testing passed read/search/terminal/web/patch/vision smoke tests and was fast in previous runs. | | `Qwen3-Coder` | Code-specialized fallback | Qwen's [official blog](https://qwenlm.github.io/blog/qwen3-coder/) describes 256K native context, up to 1M with extrapolation, and agentic coding/tool protocols. | | `Qwen3.6-Plus` | General/vision fallback | Exposed by Lingma and passed local smoke tests, but this repository does not have an official Lingma-specific context-length source for it. | | `Qwen3-Max` | Fast general/vision model | Exposed by Lingma and strong in simple tests, but less stable on forced edit/read tool calls in this proxy. | -Default model when the client omits `model`: `MiniMax-M2.7`. +Default model when the client omits `model`: `kmodel` (`Kimi-K2.6` in the remote model list). ## Configuration @@ -393,7 +395,7 @@ Current proxy hardening includes: - common tool alias normalization such as `Bash` -> `terminal`, `Read` -> `read_file`, `Grep` -> `search_files`, and `Edit` -> `patch` - Anthropic `stream=true` requests with tools are resolved internally before streaming the final `tool_use` blocks, which avoids sending premature "please run this command yourself" text to clients such as Claude Code. -In local smoke tests after this hardening, `MiniMax-M2.7`, `Kimi-K2.6`, `Qwen3.6-Plus`, and `Qwen3-Coder` all completed read/search/terminal/web/patch/vision checks, with `MiniMax-M2.7` having the lowest average latency in the tested set. +In local smoke tests after this hardening, `MiniMax-M2.7`, `Kimi-K2.6`, `Qwen3.6-Plus`, and `Qwen3-Coder` all completed read/search/terminal/web/patch/vision checks. Remote API mode with `kmodel` is now the default because it avoids Lingma IDE IPC session limits and behaved better with Claude Code and Hermes-style local tools. ## Request And Log Inspection diff --git a/README.zh-CN.md b/README.zh-CN.md index bab1d81..288d1e5 100644 --- a/README.zh-CN.md +++ b/README.zh-CN.md @@ -11,12 +11,14 @@ 代理后端支持两种模式: -- **IPC 插件模式(默认)**:连接本机 Lingma IDE 插件的 WebSocket / Named Pipe。优点是更接近 IDE 插件上下文,适合日常稳定使用。 -- **远端 API 模式(实验)**:读取 Lingma 本地登录缓存或显式凭据,直接调用 Lingma 远端接口。优点是不依赖 IDE 插件窗口和 IPC 会话,体验更像官方 API;缺点是依赖本地登录态字段和非公开接口,未来可能失效。 +- **远端 API 模式(默认,实验)**:读取 Lingma 本地登录缓存或显式凭据,直接调用 Lingma 远端接口。优点是不依赖 IDE 插件窗口和 IPC 会话,体验更像官方 API;目前更推荐给 Claude Code / Hermes 这类本地 Agent。 +- **IPC 插件模式**:连接本机 Lingma IDE 插件的 WebSocket / Named Pipe。优点是更接近 IDE 插件上下文,适合作为兼容性兜底。 ## 当前版本 -当前桌面端版本线:`v1.4.1` +当前桌面端版本线:`v1.4.2` + +版本更新记录见 [CHANGELOG.md](./CHANGELOG.md)。 GitHub Actions 会在 Release 中产出: @@ -235,17 +237,7 @@ lingma-ipc-proxy --transport pipe --pipe '\\.\pipe\lingma-ipc' ## 后端模式 -### IPC 插件模式(默认) - -IPC 模式通过本机 Lingma IDE 插件通信: - -```bash -lingma-ipc-proxy --backend ipc --transport auto --port 8095 -``` - -适合已经打开 VS Code / Lingma 插件、希望使用插件当前会话环境、并优先使用插件探测模型列表的场景。 - -### 远端 API 模式(实验) +### 远端 API 模式(默认,实验) 远端模式直接调用 Lingma 远端接口: @@ -297,6 +289,16 @@ lingma-ipc-proxy \ - 当前本机实测:`/health`、`/v1/models`、OpenAI 流式 / 非流式、Claude Code Anthropic + Bash 工具调用均可用;Claude Code 完整工具链耗时明显高于简单 OpenAI 请求。 - 该模式参考了 [ZipperCode/lingma2api](https://github.com/ZipperCode/lingma2api) 对 Lingma 远端接口、签名和登录态结构的探索,本仓库将其作为可切换后端集成到现有 OpenAI / Anthropic / 桌面 App 架构中。 +### IPC 插件模式 + +IPC 模式通过本机 Lingma IDE 插件通信: + +```bash +lingma-ipc-proxy --backend ipc --transport auto --port 8095 +``` + +适合已经打开 VS Code / Lingma 插件、希望使用插件当前会话环境、并优先使用插件探测模型列表的场景。 + ## 快速开始 ### 前置条件 @@ -349,7 +351,7 @@ export ANTHROPIC_API_KEY="any" 然后在 Claude Code 中选择模型: ```text -/model MiniMax-M2.7 +/model kmodel ``` ### Cline @@ -358,7 +360,7 @@ export ANTHROPIC_API_KEY="any" - Base URL:`http://127.0.0.1:8095/v1` - API Key:`any` -- Model ID:`MiniMax-M2.7` +- Model ID:`kmodel` ### Continue @@ -368,7 +370,7 @@ export ANTHROPIC_API_KEY="any" { "title": "Lingma Proxy", "provider": "openai", - "model": "MiniMax-M2.7", + "model": "kmodel", "apiKey": "any", "apiBase": "http://127.0.0.1:8095/v1" } @@ -390,7 +392,7 @@ export ANTHROPIC_API_KEY="any" | `Qwen3-Thinking` | 推理类模型 | | `Qwen3.6-Plus` | 通用模型 | | `Kimi-K2.6` | 多模态和长上下文模型 | -| `MiniMax-M2.7` | 第三方 Agent 默认推荐 | +| `MiniMax-M2.7` | 速度优先备选 | ### 模型参数来源和推荐 @@ -398,13 +400,13 @@ export ANTHROPIC_API_KEY="any" | 模型 | 推荐场景 | 参数 / 能力依据 | | --- | --- | --- | -| `MiniMax-M2.7` | 默认推荐给 OpenClaw / Hermes / Claude Code / Cline 这类第三方 Agent | NVIDIA 的 [MiniMax M2.7 模型卡](https://developer.nvidia.com/blog/minimax-m2-7-advances-scalable-agentic-workflows-on-nvidia-platforms-for-complex-ai-applications/) 标注 200K input context、MoE 语言模型和 agentic 场景;本地代理压测 read/search/terminal/web/patch/vision 全部通过,平均延迟最低。 | -| `Kimi-K2.6` | 多模态、长上下文、复杂 Agent 工作流 | Kimi [官方 API 文档](https://platform.kimi.ai/docs/guide/kimi-k2-6-quickstart) 标注原生 text/image/video、多步工具调用和 256K 上下文。 | +| `Kimi-K2.6`(远端模式 ID 为 `kmodel`) | 远端 API 模式和第三方 Agent 默认推荐 | Kimi [官方 API 文档](https://platform.kimi.ai/docs/guide/kimi-k2-6-quickstart) 标注原生 text/image/video、多步工具调用和 256K 上下文。本地 Claude Code 远端模式测试里工具执行更自然。 | +| `MiniMax-M2.7`(远端模式 ID 为 `mmodel`) | 速度优先备选 | NVIDIA 的 [MiniMax M2.7 模型卡](https://developer.nvidia.com/blog/minimax-m2-7-advances-scalable-agentic-workflows-on-nvidia-platforms-for-complex-ai-applications/) 标注 200K input context、MoE 语言模型和 agentic 场景;此前本地代理压测 read/search/terminal/web/patch/vision 全部通过,响应速度较快。 | | `Qwen3-Coder` | 代码专项和工具协议备选 | Qwen [官方博客](https://qwenlm.github.io/blog/qwen3-coder/) 标注 256K 原生上下文、可扩展到 1M,以及 agentic coding / function calling 协议。 | | `Qwen3.6-Plus` | 通用 / 视觉备选 | Lingma 暴露且本地实测可用,但本仓库没有找到 Lingma 专属的官方上下文长度来源。 | | `Qwen3-Max` | 快速通用 / 视觉备选 | 简单工具和视觉测试表现好,但强制 read/patch 场景在本代理里不如 MiniMax / Kimi 稳。 | -当客户端请求没有携带 `model` 字段时,代理默认使用:`MiniMax-M2.7`。 +当客户端请求没有携带 `model` 字段时,代理默认使用:`kmodel`(远端模型列表里的 Kimi-K2.6)。 ## 配置文件 @@ -482,7 +484,7 @@ Lingma 插件本身没有公开标准 OpenAI / Anthropic Tools 协议,所以 - 自动归一化常见工具名别名:`Bash`、`Shell`、`Read`、`Grep`、`Edit`、`Fetch` 等。 - Anthropic `stream=true` 且请求包含 tools 时,会先内部完成生成和重试,再流式输出最终 `tool_use` 事件,避免 Claude Code 这类客户端先收到普通拒绝文本。 -本地压测结果:`MiniMax-M2.7`、`Kimi-K2.6`、`Qwen3.6-Plus`、`Qwen3-Coder` 均通过 read/search/terminal/web/patch/vision 烟测;其中 `MiniMax-M2.7` 平均延迟最低,所以作为默认推荐。 +本地压测结果:`MiniMax-M2.7`、`Kimi-K2.6`、`Qwen3.6-Plus`、`Qwen3-Coder` 均通过 read/search/terminal/web/patch/vision 烟测。当前默认推荐远端 API 模式的 `kmodel`,因为它不受 Lingma IDE IPC 会话限制,在 Claude Code 和 Hermes 这类本地 Agent 场景更自然。 ## 请求和日志观测 diff --git a/cmd/lingma-ipc-proxy/main.go b/cmd/lingma-ipc-proxy/main.go index 552a217..622eece 100644 --- a/cmd/lingma-ipc-proxy/main.go +++ b/cmd/lingma-ipc-proxy/main.go @@ -91,11 +91,11 @@ func loadConfig() (service.Config, string) { cfg := service.Config{ Host: "127.0.0.1", Port: 8095, - Backend: service.BackendIPC, + Backend: service.BackendRemote, Transport: lingmaipc.TransportAuto, Cwd: currentDir(), Mode: "agent", - Model: "MiniMax-M2.7", + Model: "kmodel", ShellType: defaultShellType(), SessionMode: service.SessionModeAuto, Timeout: 120 * time.Second, @@ -303,7 +303,9 @@ func parseSessionMode(value string) service.SessionMode { func parseBackend(value string) service.BackendMode { mode := service.BackendMode(strings.ToLower(strings.TrimSpace(value))) switch mode { - case "", service.BackendIPC: + case "": + return service.BackendRemote + case service.BackendIPC: return service.BackendIPC case service.BackendRemote: return service.BackendRemote diff --git a/config.example.json b/config.example.json index a140393..8da5745 100644 --- a/config.example.json +++ b/config.example.json @@ -1,9 +1,11 @@ { "host": "127.0.0.1", "port": 8095, + "backend": "remote", "transport": "auto", "mode": "chat", - "session_mode": "reuse", + "model": "kmodel", + "session_mode": "auto", "timeout": 120, "cwd": "C:/Workspace/Personal/lingma-ipc-proxy", "shell_type": "powershell", diff --git a/desktop/app.go b/desktop/app.go index 6badec5..d7286e7 100644 --- a/desktop/app.go +++ b/desktop/app.go @@ -28,8 +28,10 @@ type RequestRecord struct { Time string `json:"time"` Method string `json:"method"` Path string `json:"path"` + Model string `json:"model,omitempty"` StatusCode int `json:"statusCode"` Duration string `json:"duration"` + Size string `json:"size,omitempty"` ReqBody string `json:"reqBody,omitempty"` RespBody string `json:"respBody,omitempty"` } @@ -385,8 +387,10 @@ func (a *App) StartProxy() error { Time: time.Now().Format("15:04:05"), Method: method, Path: path, + Model: extractRequestModel(reqBody), StatusCode: statusCode, Duration: duration.Round(time.Millisecond).String(), + Size: formatPayloadSize(len(reqBody) + len(respBody)), ReqBody: reqBody, RespBody: respBody, }) @@ -581,15 +585,44 @@ func (a *App) fetchModels(addr string) ([]ModelInfo, error) { return models, nil } +func extractRequestModel(reqBody string) string { + if strings.TrimSpace(reqBody) == "" { + return "" + } + var payload map[string]any + if err := json.Unmarshal([]byte(reqBody), &payload); err != nil { + return "" + } + if model, ok := payload["model"].(string); ok { + return strings.TrimSpace(model) + } + return "" +} + +func formatPayloadSize(bytes int) string { + if bytes <= 0 { + return "-" + } + const kb = 1024 + const mb = 1024 * kb + if bytes >= mb { + return fmt.Sprintf("%.1f MB", float64(bytes)/float64(mb)) + } + if bytes >= kb { + return fmt.Sprintf("%.1f KB", float64(bytes)/float64(kb)) + } + return fmt.Sprintf("%d B", bytes) +} + func defaultConfig() service.Config { cfg := service.Config{ Host: "127.0.0.1", Port: 8095, - Backend: service.BackendIPC, + Backend: service.BackendRemote, Transport: lingmaipc.TransportAuto, Cwd: defaultCwd(), Mode: "agent", - Model: "MiniMax-M2.7", + Model: "kmodel", ShellType: defaultShellType(), SessionMode: service.SessionModeAuto, Timeout: 120 * time.Second, diff --git a/desktop/frontend/src/App.vue b/desktop/frontend/src/App.vue index fea40aa..c999252 100644 --- a/desktop/frontend/src/App.vue +++ b/desktop/frontend/src/App.vue @@ -222,7 +222,7 @@ onUnmounted(() => {
推荐 OpenClaw / Hermes / Claude Code / Cline 优先选择 MiniMax-M2.7。
+远端 API 模式推荐 Kimi-K2.6;MiniMax-M2.7 可作为速度优先备选。