lingma-openai-gateway

Author	SHA1	Message	Date
GitHub Actions	15cd5e8770	fix: close forced tool-choice with structured fallback Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-20 07:18:01 +08:00
GitHub Actions	63583712a8	fix: fallback agent payload source to numeric value Keep Lingma chat/ask payload source as numeric 1 for agent mode A/B validation against remote upstream timeout behavior. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-20 06:36:07 +08:00
GitHub Actions	c67a9c3d61	fix: align agent payload semantics with VSCode tool flow Force OpenAI tooling-context requests into agent mode and align Lingma ask payload fields for agent requests so server-side tool path matches VSCode semantics. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-19 23:19:52 +08:00
GitHub Actions	e208025f35	fix: emit Lingma tool approve/invoke roundtrip Forward tool/call/sync and tool/invoke events to Lingma with auto-approve and invokeResult so tool calls can complete end-to-end. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-19 21:35:05 +08:00
GitHub Actions	3498b81fa2	fix: enable anthropic agent mode for tooling requests Use agent ask_mode for Anthropic messages with tooling context so tool/write flows are executed, and add regression coverage plus docs/env updates for TOOL_FORWARD_ENABLED. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-19 20:15:14 +08:00
GitHub Actions	e600bae27c	fix: harden tooling session reuse and event routing Ensure session reuse is disabled for tooling contexts, include tool config in cache keys, and stabilize tool event merge/routing with expanded bridge tests. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-19 19:29:30 +08:00
GitHub Actions	5aa7fbfae5	fix: align Lingma tool event lifecycle handling Handle tool/invokeResult and richer tool/call/sync payloads in the client, and document/retest the verified VSCode monitoring workflow for tool events. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-19 09:49:01 +08:00
GitHub Actions	1c7b86e2c0	feat: bridge Lingma tool events to OpenAI/Anthropic responses Add structured tool event propagation from Lingma stream/finish metadata and map it to OpenAI tool_calls and Anthropic tool_use/tool_result in both streaming and non-streaming responses. Add focused bridge tests and update docs/design notes to match current behavior. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-18 22:34:43 +08:00
GitHub Actions	b3fd8800f7	fix: align Anthropic endpoints for Claude Code compatibility Add /v1/messages/count_tokens and switch /v1/models to Anthropic-style key auth so Claude Code probes succeed consistently. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-18 20:05:24 +08:00
GitHub Actions	0b08dc6573	feat: Anthropic Messages API compat (/v1/messages) Add a wire-compatible Anthropic endpoint alongside the existing OpenAI one so Claude Code / anthropic-sdk / Cursor Agent can hit Lingma directly. - app/anthropic_schema.py (new): request model + content-block flattener + internal-messages adapter + affinity key helper. Handles text / image / tool_use / tool_result blocks; unknown types degrade gracefully. - app/auth.py: add require_anthropic_key (x-api-key, Bearer fallback) and AnthropicAuthError so auth failures render in Anthropic's error envelope instead of FastAPI's {detail:...} wrapper. - app/main.py: POST /v1/messages. Shares LingmaPool / SessionCache / InFlightGuard / StatsCollector with the OpenAI path — same api_key + same conversation prefix hits the same upstream sessionId across both protocols (KV cache carries over). Streaming emits the named Anthropic event sequence (message_start / content_block_start / content_block_delta / content_block_stop / message_delta / message_stop). No claude-* model mapping table: resolve_model's default fallback handles it. - README.md / DESIGN.md: document the new endpoint, add decision 5.12, iteration history M5, and a 4.3b streaming flow diagram. - Bump FastAPI app version to 0.4.0. Made-with: Cursor	2026-04-18 15:40:43 +08:00
GitHub Actions	2febc37c2c	prod hardening: admin/metrics authz split, subprocess lifecycle, parallel pool start, HEALTHCHECK - authz: new ADMIN_TOKEN gates /internal/*; METRICS_PUBLIC=false by default, so /metrics returns 503 when neither METRICS_TOKEN nor API_KEYS is set (previously leaked pool topology). Startup logs loudly if API_KEYS is empty or admin falls back to chat keys. - lingma_client: keep a Popen handle instead of orphaning Lingma with start_new_session, drain stderr to logger at DEBUG, SIGTERM -> 5s grace -> SIGKILL on shutdown. Fixes the zombie-process leak on container reload. - pool: asyncio.gather to start N instances concurrently; N=2 pool shaves ~startup_timeout seconds off boot. - Dockerfile: HEALTHCHECK hits /healthz and greps for pool_ready>0 so Docker / compose orchestrators see "stuck on login" as unhealthy. Made-with: Cursor	2026-04-18 10:22:13 +08:00
GitHub Actions	4e08d1af36	feat: session bundle import/export to skip Playwright auto-login Adds a lightweight way to pre-seed a Lingma workDir with an existing logged-in session: - New module session_bundle.py packs/unpacks only the four cache files that make up a Lingma login (id, user, quota, config.json). Everything else (db, logs, index, diagnosis) stays local so bundles stay tiny and never leak session-specific artefacts. - Safety: path-traversal/symlink members are rejected; size is capped; refuses to export from a workDir that isn't actually logged in; sensitive cache/user is chmod'd 0600 on restore. - LingmaAccount gains optional session_bundle_b64 / session_bundle_file; LINGMA_SESSION_BUNDLE[_FILE] env provide the singleton fallback. Credentials become optional when a bundle is supplied. - LingmaPool.start() restores the bundle into each instance workDir only if it isn't already logged in, so persistent volumes aren't clobbered and a corrupt bundle falls back to Playwright gracefully. - POST /internal/session/export returns the bundle as base64; ?instance= selects a specific pool instance. Requires an authed, already-logged-in instance to prevent exporting empties. - README + .env.example document the end-to-end flow. Made-with: Cursor	2026-04-18 09:39:58 +08:00
GitHub Actions	ba865f3be0	feat: expose /internal/models/raw for authoritative model metadata Lets callers see Lingma's raw config/queryModels response, so the official per-key displayName/description is discoverable without reverse-engineering the VSIX. Falls back to the pool's pick() unless a specific instance is requested. Made-with: Cursor	2026-04-18 09:29:11 +08:00
GitHub Actions	dfdb7087dc	perf: session reuse for multi-turn latency - Add SessionCache (LRU + TTL, per-API-key scoped) mapping conversation-prefix hash -> upstream Lingma sessionId. - Hash only user/system/developer turns so client-side assistant reformatting doesn't invalidate the key. - On cache hit: reuse sessionId, send only the latest user message with isReply=true, and stick the request to the instance that originally served it. - LingmaGatewayClient.chat_complete/chat_stream accept session_id/is_reply and report the real finish.sessionId via out_meta so we persist what Lingma actually allocated. - Invalidate cache on non-stream failure; skip writes on cancelled/partial streams. - Expose cache stats in /internal/stats and /metrics. - Configurable via SESSION_REUSE_ENABLED / SESSION_CACHE_MAX_ENTRIES / SESSION_CACHE_TTL_SEC (documented in README + .env.example). Made-with: Cursor	2026-04-18 08:10:39 +08:00
GitHub Actions	d209d8ac0b	perf: stop blocking on chat/ask RPC timeout (fixes ~30s TTFB) Lingma streams answers via chat/answer + chat/finish notifications and never sends a JSON-RPC response for chat/ask. The old code awaited rpc.request("chat/ask") and swallowed the TimeoutError, so every chat was forced to wait the full rpc_timeout (default 30s) before draining the stream queue - even though the first token was already present in the queue within ~2s. Effect: - non-stream TTFB dropped from ~30s to actual upstream latency (~2-3s). - stream first-chunk dropped from ~30s to upstream first-token latency. - consume_stream idle timeout decoupled from rpc_timeout so shortening rpc_timeout no longer starves long completions. Switch chat/ask to rpc.notify (fire-and-forget) and rely entirely on the existing chat/answer + chat/finish handlers for result delivery. Made-with: Cursor	2026-04-18 07:54:45 +08:00
GitHub Actions	707acc9005	feat: M1+M2 gateway hardening and multi-instance pool Behavior hardening (M1): - Fix `_chat_streams` memory leak: pop_stream on completion, error, and client disconnect. - Add WebSocket reconnect with state machine (stopped/starting/ready/reconnecting/failed/closed) and exponential backoff, so a Lingma restart no longer requires restarting the gateway. - Lazy initialization: startup failure is non-fatal, first real request triggers retry, `/healthz` reflects readiness. - Migrate FastAPI on_event to lifespan. - Structured JSON logging with request_id ContextVar; `x-request-id` propagated to responses. - SSE now sets `Cache-Control: no-cache`, `X-Accel-Buffering: no` to defeat proxy buffering. - OpenAI schema compatibility: `content` accepts str | list[parts] | None, added `developer`/`function` roles, `tools/tool_choice/stream_options/user/max_tokens` fields, and `stream_options.include_usage` emits final usage chunk. - `require_bearer` uses `hmac.compare_digest`; `/metrics` now requires Bearer when `METRICS_TOKEN` or `API_KEYS` are set. - Python 3.10/3.11 `TimeoutError` vs `asyncio.TimeoutError` unified. - Error responses no longer leak `auto_login.status()` details. Backpressure (M2 / A2): - New `InFlightGuard` with per-request ticket, queue + rejection accounting, `BackpressureRejected` raises 429 + `Retry-After` once `GATEWAY_QUEUE_TIMEOUT_SEC` elapses. - Streaming ticket ownership transfers to the generator so CancelledError from client disconnect still releases the slot. - `/internal/stats.concurrency` and `/metrics` expose in_flight/queued/accepted_total/rejected_total/max_in_flight. Multi-instance pool (M2 / A1 + B3): - New `LingmaPool` with N processes, each with its own workDir, socket port (dynamic when N>1), and `AutoLoginManager`. - Account parser supports CSV (`u1:p1,u2:p2`) and JSON formats via `LINGMA_ACCOUNTS`; falls back to `LINGMA_USERNAME/LINGMA_PASSWORD` for backwards compatibility (N=1 keeps legacy paths/ports). - Routing: sticky affinity by `user` / system-prompt hash, then least-in-flight, finally round-robin fallback for unhealthy pool. - `/healthz` reports per-instance state and ready count. - `/internal/stats.pool` and `/metrics` expose per-instance `gateway_pool_instance_in_flight{name}` / `gateway_pool_instance_ready{name}`. - `/internal/auto-login/start?instance=inst-N` targets a specific instance; `/internal/auto-login/status` lists all instances. Compat notes: - `.env.example` adds `METRICS_TOKEN`, `LOG_LEVEL`, `GATEWAY_MAX_IN_FLIGHT`, `GATEWAY_QUEUE_TIMEOUT_SEC`, `LINGMA_ACCOUNTS`, `LINGMA_INSTANCE_COUNT`. - `.gitignore` cleaned up data/ duplication. - Existing single-instance deployments keep working without config change. Made-with: Cursor	2026-04-18 07:40:32 +08:00
root	c1e261aa14	refactor: move runtime state under project data directory	2026-04-17 15:57:51 +08:00
root	e41ee8bcc8	fix: treat 200 login API response as provisional success	2026-04-17 15:43:57 +08:00
root	d12668201f	fix: capture login API via response event listener	2026-04-17 15:40:25 +08:00
root	0c9fdd53c9	fix: verify /users/ajax/login success in auto login flow	2026-04-17 15:31:02 +08:00
root	5f0c1866a6	fix: harden auto-login selectors and poll auth status	2026-04-17 14:40:28 +08:00
root	b621c4aca7	feat: bootstrap Lingma from latest marketplace VSIX	2026-04-17 10:44:37 +08:00
root	5526779e98	chore: initialize clean history without secrets	2026-04-17 09:56:08 +08:00
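
Note on dfdb7087dc (session reuse): the commit message describes an LRU + TTL cache, scoped per API key, that maps a conversation-prefix hash to an upstream Lingma sessionId and hashes only user/system/developer turns. The sketch below illustrates that idea under stated assumptions; it is not the repository's code. The names `conversation_key`, `KEYED_ROLES`, the `clock` parameter, and the default sizes are hypothetical, and the JSON-based hashing scheme is a guess at one reasonable way to build the key.

```python
import hashlib
import json
import time
from collections import OrderedDict

# Roles that feed the key, per the commit message. Assistant turns are
# skipped so client-side reformatting of replies does not change the key.
KEYED_ROLES = {"user", "system", "developer"}


def conversation_key(api_key, messages):
    """Hash the keyed turns of `messages`, scoped to one API key (assumed scheme)."""
    turns = [[m["role"], m["content"]] for m in messages if m["role"] in KEYED_ROLES]
    payload = json.dumps([api_key, turns], ensure_ascii=False, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SessionCache:
    """Minimal LRU + TTL map from conversation key to upstream session id."""

    def __init__(self, max_entries=1024, ttl_sec=1800.0, clock=time.monotonic):
        self._entries = OrderedDict()  # key -> (session_id, expires_at)
        self._max = max_entries
        self._ttl = ttl_sec
        self._clock = clock

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        session_id, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return session_id

    def put(self, key, session_id):
        self._entries[key] = (session_id, self._clock() + self._ttl)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict least recently used

    def invalidate(self, key):
        self._entries.pop(key, None)
```

In this sketch a caller would store the key computed over the full request after a successful turn, and on the next request look up the key computed over `messages[:-1]`, so the stored entry matches once the assistant reply is filtered out. Whether the gateway splits the prefix this way is an assumption.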

23 Commits
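
Note on 707acc9005 (`require_bearer` uses `hmac.compare_digest`): a constant-time comparison avoids leaking how many leading characters of a key matched through response timing. The following is a minimal sketch of such a check, assuming a plain `Authorization: Bearer <token>` header and an in-memory list of keys. The function name `bearer_is_valid` and its signature are hypothetical and do not reflect the gateway's actual `require_bearer`.

```python
import hmac


def bearer_is_valid(authorization_header, api_keys):
    """Return True if the header carries a Bearer token equal to one of api_keys."""
    scheme, _, token = (authorization_header or "").partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return False
    token_bytes = token.encode("utf-8")
    matched = False
    for key in api_keys:
        # compare_digest runs in time independent of where the inputs differ;
        # every key is checked so there is no early exit on the first match.
        if hmac.compare_digest(token_bytes, key.encode("utf-8")):
            matched = True
    return matched
```

Usage would look like `bearer_is_valid(request_headers.get("authorization"), configured_keys)`, where both names stand in for whatever the calling framework provides.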