feat: bridge Lingma tool events to OpenAI/Anthropic responses

Add structured tool event propagation from Lingma stream/finish metadata and map it to OpenAI tool_calls and Anthropic tool_use/tool_result in both streaming and non-streaming responses. Add focused bridge tests and update docs/design notes to match current behavior.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
95
CLAUDE.md
Normal file
@@ -0,0 +1,95 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Primary docs to read first
- `README.md` (runtime commands, env model, API examples)
- `DESIGN.md` (architecture decisions, module boundaries, request lifecycle)
- `.env.example` (authoritative env var reference)

No Cursor/Copilot rule files were found in this repo (`.cursorrules`, `.cursor/rules/`, `.github/copilot-instructions.md`).

## Common development commands

### Start locally
```bash
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8317
```

### Start with Docker Compose
```bash
cp .env.example .env
mkdir -p data secrets
docker compose up -d --build
docker compose logs -f
```

### Run tests
```bash
# current focused suite
python3 -m unittest tests/test_tool_call_bridge.py

# discover all unittest tests under tests/
python3 -m unittest discover -s tests -p "test_*.py"

# run a single test method
python3 -m unittest tests.test_tool_call_bridge.ToolCallBridgeTests.test_openai_non_stream_bridges_tool_calls
```

### Smoke-check running gateway
```bash
API_KEY=$(grep '^API_KEYS=' .env | cut -d= -f2 | cut -d, -f1)
curl -s http://127.0.0.1:8317/healthz
curl -s http://127.0.0.1:8317/v1/models -H "Authorization: Bearer $API_KEY"
```

### Linting/type-checking status
- There is currently no repo-configured lint/type command (no `ruff`/`flake8`/`mypy` config found).
- Do not invent tooling commands; if linting is needed, add tooling in a dedicated change first.

## Architecture (big picture)

### What this service is
A FastAPI gateway that fronts Lingma and exposes:
- OpenAI-compatible API (`/v1/models`, `/v1/chat/completions`)
- Anthropic Messages-compatible API (`/v1/messages`, `/v1/messages/count_tokens`)

Both protocols share the same backend pool, backpressure guard, stats, and session reuse logic.

### Request lifecycle (important for most changes)
1. Authenticate request (`app/auth.py`)
2. Normalize inbound protocol payload to internal message shape (`openai_schema.py` / `anthropic_schema.py`)
3. Session-cache lookup (`app/session_cache.py`) for prefix-based reuse
4. Pick backend instance (`app/lingma_pool.py`) with affinity + least-in-flight
5. Acquire concurrency ticket (`app/concurrency.py`)
6. Call Lingma via websocket/LSP client (`app/lingma_client.py`)
7. Map upstream result/stream back to wire protocol in `app/main.py`
8. Record stats and release ticket (including stream-finally paths)
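
Step 4 above ("affinity + least-in-flight") can be illustrated with a minimal sketch. The `Instance` dataclass and `pick_instance` function are hypothetical names chosen for this example, not the actual API of `app/lingma_pool.py`; the sketch only assumes that each instance exposes a health flag and an in-flight counter.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    """Hypothetical view of one Lingma backend instance."""
    name: str
    healthy: bool = True
    in_flight: int = 0


def pick_instance(instances: List[Instance], preferred: Optional[str] = None) -> Instance:
    """Return the session-bound instance if it is healthy, else the least-loaded healthy one."""
    healthy = [inst for inst in instances if inst.healthy]
    if not healthy:
        raise RuntimeError("no healthy Lingma instance available")
    # Affinity: honor the instance recorded by the session cache when possible.
    for inst in healthy:
        if inst.name == preferred:
            return inst
    # Fallback: least in-flight requests among healthy instances.
    return min(healthy, key=lambda inst: inst.in_flight)
```

If the preferred instance is unhealthy, this sketch silently falls back to the least-loaded instance; a real pool would presumably also invalidate the stale session-cache entry at that point.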

### Core module boundaries
- `app/main.py`: API entrypoint + orchestration + wire-format adapters
- `app/lingma_pool.py`: multi-instance lifecycle, selection, health-aware fallback
- `app/lingma_client.py`: subprocess + LSP-over-WebSocket transport to Lingma
- `app/session_cache.py`: LRU+TTL cache of conversation-prefix -> upstream session id (+ instance binding)
- `app/concurrency.py`: in-flight guard and queue timeout/backpressure behavior
- `app/stats.py`: usage counters and Prometheus text
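
To make the "LRU+TTL cache of conversation-prefix -> upstream session id (+ instance binding)" description concrete, here is a minimal sketch of such a cache. The class name `PrefixSessionCache`, its parameters, and the injectable `clock` are assumptions made for illustration; the real `app/session_cache.py` may be structured differently.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional, Tuple


class PrefixSessionCache:
    """Hypothetical LRU+TTL map: prefix key -> (session id, instance name)."""

    def __init__(self, max_entries: int = 128, ttl_seconds: float = 600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested deterministically
        self._entries: "OrderedDict[str, Tuple[str, str, float]]" = OrderedDict()

    def put(self, prefix_key: str, session_id: str, instance: str) -> None:
        self._entries[prefix_key] = (session_id, instance, self._clock() + self._ttl)
        self._entries.move_to_end(prefix_key)  # mark as most recently used
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict least recently used

    def get(self, prefix_key: str) -> Optional[Tuple[str, str]]:
        entry = self._entries.get(prefix_key)
        if entry is None:
            return None
        session_id, instance, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[prefix_key]  # expired: drop and report a miss
            return None
        self._entries.move_to_end(prefix_key)
        return session_id, instance
```

Storing the instance name next to the session id is what lets a caller detect that a cached session belongs to a different or unhealthy instance and discard it.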

### Protocol-specific notes
- Anthropic and OpenAI endpoints are separate adapters over shared internals.
- Response-side tool bridge is implemented: upstream Lingma tool events are surfaced as:
  - OpenAI: `tool_calls` (stream + non-stream)
  - Anthropic: `tool_use` / `tool_result` blocks (stream + non-stream)
- Request-side `tools` / `tool_choice` are accepted by schemas but not forwarded to Lingma.
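
The response-side bridge described above can be sketched for the non-streaming case as follows. The internal event shape (`id`, `name`, `arguments`) and the function names are assumptions for illustration only; the actual Lingma metadata format and the adapter code in `app/main.py` are not shown in this file. The output shapes follow the public OpenAI `tool_calls` and Anthropic `tool_use` wire formats.

```python
import json
from typing import Any, Dict


def to_openai_tool_call(event: Dict[str, Any]) -> Dict[str, Any]:
    """Map a hypothetical internal tool event to an OpenAI `tool_calls` entry."""
    return {
        "id": event["id"],
        "type": "function",
        "function": {
            "name": event["name"],
            # OpenAI carries arguments as a JSON-encoded string, not an object.
            "arguments": json.dumps(event.get("arguments", {})),
        },
    }


def to_anthropic_tool_use(event: Dict[str, Any]) -> Dict[str, Any]:
    """Map the same hypothetical event to an Anthropic `tool_use` content block."""
    return {
        "type": "tool_use",
        "id": event["id"],
        "name": event["name"],
        # Anthropic carries the arguments as a JSON object under `input`.
        "input": event.get("arguments", {}),
    }
```

The main difference between the two targets is the argument encoding: a JSON string for OpenAI and a structured object for Anthropic.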

### Operational invariants to preserve
- One request must stay on one Lingma instance for session continuity.
- Session cache entries include instance identity; invalidate on unhealthy instance mismatch.
- Streaming paths must always release in-flight tickets in `finally`.
- Multi-instance mode must use isolated workdirs per instance.
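
The `finally` invariant for streaming paths can be sketched like this. `InFlightGuard` and `stream_with_ticket` are hypothetical names, and the semaphore-based guard is an assumption about how an in-flight limit could be implemented, not a description of `app/concurrency.py`.

```python
import asyncio
from typing import AsyncIterator


class InFlightGuard:
    """Hypothetical in-flight limiter with a queue timeout."""

    def __init__(self, limit: int) -> None:
        self._sem = asyncio.Semaphore(limit)
        self.in_flight = 0

    async def acquire(self, timeout: float) -> None:
        # Raises a timeout error if no slot frees up in time (backpressure).
        await asyncio.wait_for(self._sem.acquire(), timeout)
        self.in_flight += 1

    def release(self) -> None:
        self.in_flight -= 1
        self._sem.release()


async def stream_with_ticket(guard: InFlightGuard, upstream: AsyncIterator[str],
                             queue_timeout: float = 5.0) -> AsyncIterator[str]:
    """Relay upstream chunks while holding exactly one ticket."""
    await guard.acquire(queue_timeout)
    try:
        async for chunk in upstream:
            yield chunk
    finally:
        # Runs on normal completion, upstream errors, and when the generator
        # is closed early (for example after a client disconnect).
        guard.release()
```

Acquiring before the `try` means a failed acquire never triggers a release, so the counter cannot go negative.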

### Deployment/runtime model
- Container startup runs `python /app/app/bootstrap_lingma.py` before uvicorn.
- Compose mounts:
  - `./data -> /app/data` (persistent Lingma binary/cache/workdirs)
  - `./secrets -> /secrets:ro` (session bundles, secrets)