lingma-openai-gateway/CLAUDE.md
GitHub Actions 1c7b86e2c0 feat: bridge Lingma tool events to OpenAI/Anthropic responses
Add structured tool event propagation from Lingma stream/finish metadata and map it to OpenAI tool_calls and Anthropic tool_use/tool_result in both streaming and non-streaming responses. Add focused bridge tests and update docs/design notes to match current behavior.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-18 22:34:43 +08:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Primary docs to read first

  • README.md (runtime commands, env model, API examples)
  • DESIGN.md (architecture decisions, module boundaries, request lifecycle)
  • .env.example (authoritative env var reference)

No Cursor/Copilot rule files were found in this repo (.cursorrules, .cursor/rules/, .github/copilot-instructions.md).

Common development commands

Start locally

pip install -r requirements.txt
uvicorn app.main:app --reload --port 8317

Start with Docker Compose

cp .env.example .env
mkdir -p data secrets
docker compose up -d --build
docker compose logs -f

Run tests

# current focused suite
python3 -m unittest tests/test_tool_call_bridge.py

# discover all unittest tests under tests/
python3 -m unittest discover -s tests -p "test_*.py"

# run a single test method
python3 -m unittest tests.test_tool_call_bridge.ToolCallBridgeTests.test_openai_non_stream_bridges_tool_calls

Smoke-check running gateway

API_KEY=$(grep '^API_KEYS=' .env | cut -d= -f2 | cut -d, -f1)
curl -s http://127.0.0.1:8317/healthz
curl -s http://127.0.0.1:8317/v1/models -H "Authorization: Bearer $API_KEY"

Linting/type-checking status

  • There is currently no repo-configured lint or type-check command (no ruff/flake8/mypy config found).
  • Do not invent tooling commands; if linting is needed, add tooling in a dedicated change first.

Architecture (big picture)

What this service is

A FastAPI gateway that fronts Lingma and exposes:

  • OpenAI-compatible API (/v1/models, /v1/chat/completions)
  • Anthropic Messages-compatible API (/v1/messages, /v1/messages/count_tokens)

Both protocols share the same backend pool, backpressure guard, stats, and session reuse logic.

Request lifecycle (important for most changes)

  1. Authenticate request (app/auth.py)
  2. Normalize inbound protocol payload to internal message shape (openai_schema.py / anthropic_schema.py)
  3. Session-cache lookup (app/session_cache.py) for prefix-based reuse
  4. Pick backend instance (app/lingma_pool.py) with affinity + least-in-flight
  5. Acquire concurrency ticket (app/concurrency.py)
  6. Call Lingma via websocket/LSP client (app/lingma_client.py)
  7. Map upstream result/stream back to wire protocol in app/main.py
  8. Record stats and release ticket (including stream-finally paths)
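
Step 4 of the lifecycle above (affinity plus least-in-flight) can be sketched roughly as follows. This is a minimal illustration, not the code in app/lingma_pool.py: the `Instance` dataclass, its fields, and the `pick_instance` function are hypothetical names introduced here, and the real pool may weigh health and load differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    """Hypothetical view of one Lingma backend instance (not the real pool type)."""
    instance_id: str
    healthy: bool = True
    in_flight: int = 0


def pick_instance(instances: List[Instance], affinity_id: Optional[str] = None) -> Instance:
    """Prefer the instance a cached session is bound to; else the least-loaded healthy one."""
    healthy = [inst for inst in instances if inst.healthy]
    if not healthy:
        raise RuntimeError("no healthy Lingma instance available")
    if affinity_id is not None:
        for inst in healthy:
            if inst.instance_id == affinity_id:
                return inst
    # Affinity target is absent or unhealthy: fall back to least in-flight.
    return min(healthy, key=lambda inst: inst.in_flight)
```

In this sketch, a caller would pass the instance id returned by the session-cache lookup as `affinity_id`, and pass `None` when there was no cache hit.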

Core module boundaries

  • app/main.py: API entrypoint + orchestration + wire-format adapters
  • app/lingma_pool.py: multi-instance lifecycle, selection, health-aware fallback
  • app/lingma_client.py: subprocess + LSP-over-WebSocket transport to Lingma
  • app/session_cache.py: LRU+TTL cache of conversation-prefix -> upstream session id (+ instance binding)
  • app/concurrency.py: in-flight guard and queue timeout/backpressure behavior
  • app/stats.py: usage counters and Prometheus text
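
To make the session-cache boundary concrete, here is a small sketch of an LRU+TTL map from a conversation prefix to an upstream session id and instance binding. It assumes a SHA-256 hash of the JSON-encoded messages as the key. The `SessionCache` class, its method names, and the default sizes are hypothetical and are not taken from app/session_cache.py.

```python
import hashlib
import json
import time
from collections import OrderedDict
from typing import Callable, List, Optional, Tuple


def prefix_key(messages: List[dict]) -> str:
    """Stable hash of a conversation prefix (assumed key scheme)."""
    payload = json.dumps(messages, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SessionCache:
    """LRU + TTL map: conversation prefix -> (session id, instance id)."""

    def __init__(self, max_entries: int = 1024, ttl_seconds: float = 600.0,
                 clock: Callable[[], float] = time.monotonic):
        self._entries = OrderedDict()  # key -> (session_id, instance_id, expires_at)
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock

    def put(self, messages: List[dict], session_id: str, instance_id: str) -> None:
        key = prefix_key(messages)
        self._entries[key] = (session_id, instance_id, self._clock() + self._ttl)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry

    def get(self, messages: List[dict]) -> Optional[Tuple[str, str]]:
        key = prefix_key(messages)
        entry = self._entries.get(key)
        if entry is None:
            return None
        session_id, instance_id, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return session_id, instance_id

    def invalidate_instance(self, instance_id: str) -> None:
        """Drop every entry bound to the given instance (e.g. once it is unhealthy)."""
        stale = [k for k, v in self._entries.items() if v[1] == instance_id]
        for key in stale:
            del self._entries[key]
```

Under these assumptions, a handler would call `put` with the full history (including the assistant reply) after a turn completes, and on the next request call `get(messages[:-1])` to find the session for everything before the new user turn.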

Protocol-specific notes

  • Anthropic and OpenAI endpoints are separate adapters over shared internals.
  • Response-side tool bridge is implemented: upstream Lingma tool events are surfaced as:
    • OpenAI: tool_calls (stream + non-stream)
    • Anthropic: tool_use / tool_result blocks (stream + non-stream)
  • Request-side tools / tool_choice are accepted by schemas but not forwarded to Lingma.
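
As a rough illustration of the response-side bridge, the sketch below maps one upstream tool event to the two wire formats. The event shape (`kind`, `id`, `name`, `arguments`, `output`) is an assumption made for this example; the actual Lingma stream/finish metadata may look different. The output shapes follow the public OpenAI `tool_calls` and Anthropic `tool_use` / `tool_result` formats. The function names are hypothetical.

```python
import json


def to_openai_tool_call(event: dict) -> dict:
    """Map an assumed Lingma tool-call event to an OpenAI tool_calls entry."""
    return {
        "id": event["id"],
        "type": "function",
        "function": {
            "name": event["name"],
            # OpenAI carries arguments as a JSON-encoded string, not an object.
            "arguments": json.dumps(event.get("arguments", {})),
        },
    }


def to_anthropic_block(event: dict) -> dict:
    """Map an assumed Lingma tool event to an Anthropic content block."""
    if event.get("kind") == "result":
        return {
            "type": "tool_result",
            "tool_use_id": event["id"],
            "content": str(event.get("output", "")),
        }
    return {
        "type": "tool_use",
        "id": event["id"],
        "name": event["name"],
        # Anthropic carries the tool input as a JSON object.
        "input": event.get("arguments", {}),
    }
```

The main asymmetry this highlights is that OpenAI expects `arguments` as a string while Anthropic expects `input` as an object, so one upstream event has to be serialized differently for each adapter.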

Operational invariants to preserve

  • One request must stay on one Lingma instance for session continuity.
  • Session cache entries record the instance they are bound to; invalidate an entry when that instance is unhealthy or no longer matches.
  • Streaming paths must always release in-flight tickets in a finally block.
  • Multi-instance mode must use isolated workdirs per instance.
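
The ticket-release invariant for streaming paths can be sketched like this. It is a minimal asyncio illustration, assuming a semaphore-based guard with a queue timeout. `InFlightGuard` and `guarded_stream` are hypothetical names, and app/concurrency.py may be structured differently.

```python
import asyncio
from typing import AsyncIterator


class InFlightGuard:
    """Hypothetical in-flight guard: bounded slots plus a queue timeout."""

    def __init__(self, max_in_flight: int, queue_timeout: float):
        self._slots = asyncio.Semaphore(max_in_flight)
        self._queue_timeout = queue_timeout
        self.in_flight = 0

    async def acquire(self) -> None:
        # Raises asyncio.TimeoutError when no slot frees up in time (backpressure).
        await asyncio.wait_for(self._slots.acquire(), timeout=self._queue_timeout)
        self.in_flight += 1

    def release(self) -> None:
        self.in_flight -= 1
        self._slots.release()


async def guarded_stream(guard: InFlightGuard,
                         upstream: AsyncIterator[str]) -> AsyncIterator[str]:
    """Yield upstream chunks while holding a ticket; always release it."""
    await guard.acquire()
    try:
        async for chunk in upstream:
            yield chunk
    finally:
        # Runs on normal completion, upstream errors, and early close by the client.
        guard.release()
```

Because the release sits in `finally`, the ticket is returned whether the stream finishes, the upstream raises, or the consumer closes the generator early (for example on client disconnect).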

Deployment/runtime model

  • Container startup runs python /app/app/bootstrap_lingma.py before uvicorn.
  • Compose mounts:
    • ./data -> /app/data (persistent Lingma binary/cache/workdirs)
    • ./secrets -> /secrets:ro (session bundles, secrets)