Ensure /v1/responses streams always terminate with response.completed and normalize Lingma tool_code fallbacks into structured tool calls, including single-argument forms. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
273 lines
11 KiB
Markdown
273 lines
11 KiB
Markdown
# CLAUDE.md
|
|
|
|
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
|
|
|
|
## Primary docs to read first
|
|
- `README.md` (runtime commands, env model, API examples)
|
|
- `DESIGN.md` (architecture decisions, module boundaries, request lifecycle)
|
|
- `.env.example` (authoritative env var reference)
|
|
|
|
No Cursor/Copilot rule files were found in this repo (`.cursorrules`, `.cursor/rules/`, `.github/copilot-instructions.md`).
|
|
|
|
## Common development commands
|
|
|
|
### Start locally
|
|
```bash
|
|
pip install -r requirements.txt
|
|
uvicorn app.main:app --reload --port 8317
|
|
```
|
|
|
|
### Start with Docker Compose
|
|
```bash
|
|
cp .env.example .env
|
|
mkdir -p data secrets
|
|
docker compose up -d --build
|
|
docker compose logs -f
|
|
```
|
|
|
|
### Run tests
|
|
```bash
|
|
# current focused suite
|
|
python3 -m unittest tests/test_tool_call_bridge.py
|
|
|
|
# discover all unittest tests under tests/
|
|
python3 -m unittest discover -s tests -p "test_*.py"
|
|
|
|
# run a single test method
|
|
python3 -m unittest tests.test_tool_call_bridge.ToolCallBridgeTests.test_openai_non_stream_bridges_tool_calls
|
|
```
|
|
|
|
### Smoke-check running gateway
|
|
```bash
|
|
API_KEY=$(grep '^API_KEYS=' .env | cut -d= -f2 | cut -d, -f1)
|
|
curl -s http://127.0.0.1:8317/healthz
|
|
curl -s http://127.0.0.1:8317/v1/models -H "Authorization: Bearer $API_KEY"
|
|
```
|
|
|
|
### Linting/type-checking status
|
|
- There is currently no repo-configured lint/type command (no `ruff`/`flake8`/`mypy` config found).
|
|
- Do not invent tooling commands; if linting is needed, add tooling in a dedicated change first.
|
|
|
|
## Architecture (big picture)
|
|
|
|
### What this service is
|
|
A FastAPI gateway that fronts Lingma and exposes:
|
|
- OpenAI-compatible API (`/v1/models`, `/v1/chat/completions`)
|
|
- Anthropic Messages-compatible API (`/v1/messages`, `/v1/messages/count_tokens`)
|
|
|
|
Both protocols share the same backend pool, backpressure guard, stats, and session reuse logic.
|
|
|
|
### Request lifecycle (important for most changes)
|
|
1. Authenticate request (`app/auth.py`)
|
|
2. Normalize inbound protocol payload to internal message shape (`openai_schema.py` / `anthropic_schema.py`)
|
|
3. Session-cache lookup (`app/session_cache.py`) for prefix-based reuse
|
|
4. Pick backend instance (`app/lingma_pool.py`) with affinity + least-in-flight
|
|
5. Acquire concurrency ticket (`app/concurrency.py`)
|
|
6. Call Lingma via websocket/LSP client (`app/lingma_client.py`)
|
|
7. Map upstream result/stream back to wire protocol in `app/main.py`
|
|
8. Record stats and release ticket (including stream-finally paths)
|
|
|
|
### Core module boundaries
|
|
- `app/main.py`: API entrypoint + orchestration + wire-format adapters
|
|
- `app/lingma_pool.py`: multi-instance lifecycle, selection, health-aware fallback
|
|
- `app/lingma_client.py`: subprocess + LSP-over-WebSocket transport to Lingma
|
|
- `app/session_cache.py`: LRU+TTL cache of conversation-prefix -> upstream session id (+ instance binding)
|
|
- `app/concurrency.py`: in-flight guard and queue timeout/backpressure behavior
|
|
- `app/stats.py`: usage counters and Prometheus text
|
|
|
|
### Protocol-specific notes
|
|
- Anthropic and OpenAI endpoints are separate adapters over shared internals.
|
|
- Response-side tool bridge is implemented: upstream Lingma tool events are surfaced as:
|
|
- OpenAI: `tool_calls` (stream + non-stream)
|
|
- Anthropic: `tool_use` / `tool_result` blocks (stream + non-stream)
|
|
- Request-side `tools` / `tool_choice` are accepted by schemas but not forwarded to Lingma.
|
|
|
|
### Operational invariants to preserve
|
|
- One request must stay on one Lingma instance for session continuity.
|
|
- Session cache entries include instance identity; invalidate on unhealthy instance mismatch.
|
|
- Streaming paths must always release in-flight tickets in `finally`.
|
|
- Multi-instance mode must use isolated workdirs per instance.
|
|
|
|
### Deployment/runtime model
|
|
- Container startup runs `python /app/app/bootstrap_lingma.py` before uvicorn.
|
|
- Compose mounts:
|
|
- `./data -> /app/data` (persistent Lingma binary/cache/workdirs)
|
|
- `./secrets -> /secrets:ro` (session bundles, secrets)
|
|
|
|
|
|
# CLAUDE.md
|
|
|
|
Behavioral guidelines to reduce common LLM coding mistakes. Merge with project-specific instructions as needed.
|
|
|
|
**Tradeoff:** These guidelines bias toward caution over speed. For trivial tasks, use judgment.
|
|
|
|
## 1. Think Before Coding
|
|
|
|
**Don't assume. Don't hide confusion. Surface tradeoffs.**
|
|
|
|
Before implementing:
|
|
- State your assumptions explicitly. If uncertain, ask.
|
|
- If multiple interpretations exist, present them - don't pick silently.
|
|
- If a simpler approach exists, say so. Push back when warranted.
|
|
- If something is unclear, stop. Name what's confusing. Ask.
|
|
|
|
## 2. Simplicity First
|
|
|
|
**Minimum code that solves the problem. Nothing speculative.**
|
|
|
|
- No features beyond what was asked.
|
|
- No abstractions for single-use code.
|
|
- No "flexibility" or "configurability" that wasn't requested.
|
|
- No error handling for impossible scenarios.
|
|
- If you write 200 lines and it could be 50, rewrite it.
|
|
|
|
Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify.
|
|
|
|
## 3. Surgical Changes
|
|
|
|
**Touch only what you must. Clean up only your own mess.**
|
|
|
|
When editing existing code:
|
|
- Don't "improve" adjacent code, comments, or formatting.
|
|
- Don't refactor things that aren't broken.
|
|
- Match existing style, even if you'd do it differently.
|
|
- If you notice unrelated dead code, mention it - don't delete it.
|
|
|
|
When your changes create orphans:
|
|
- Remove imports/variables/functions that YOUR changes made unused.
|
|
- Don't remove pre-existing dead code unless asked.
|
|
|
|
The test: Every changed line should trace directly to the user's request.
|
|
|
|
## 4. Goal-Driven Execution
|
|
|
|
**Define success criteria. Loop until verified.**
|
|
|
|
Transform tasks into verifiable goals:
|
|
- "Add validation" → "Write tests for invalid inputs, then make them pass"
|
|
- "Fix the bug" → "Write a test that reproduces it, then make it pass"
|
|
- "Refactor X" → "Ensure tests pass before and after"
|
|
|
|
For multi-step tasks, state a brief plan:
|
|
```
|
|
1. [Step] → verify: [check]
|
|
2. [Step] → verify: [check]
|
|
3. [Step] → verify: [check]
|
|
```
|
|
|
|
Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.
|
|
|
|
---
|
|
|
|
**These guidelines are working if:** fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes.
|
|
|
|
# CLAUDE.md
|
|
|
|
Behavioral guidelines to reduce common LLM coding mistakes. Merge with project-specific instructions as needed.
|
|
|
|
**Tradeoff:** These guidelines bias toward caution over speed. For trivial tasks, use judgment.
|
|
|
|
## 1. Think Before Coding
|
|
|
|
**Don't assume. Don't hide confusion. Surface tradeoffs.**
|
|
|
|
Before implementing:
|
|
- State your assumptions explicitly. If uncertain, ask.
|
|
- If multiple interpretations exist, present them - don't pick silently.
|
|
- If a simpler approach exists, say so. Push back when warranted.
|
|
- If something is unclear, stop. Name what's confusing. Ask.
|
|
|
|
## 2. Simplicity First
|
|
|
|
**Minimum code that solves the problem. Nothing speculative.**
|
|
|
|
- No features beyond what was asked.
|
|
- No abstractions for single-use code.
|
|
- No "flexibility" or "configurability" that wasn't requested.
|
|
- No error handling for impossible scenarios.
|
|
- If you write 200 lines and it could be 50, rewrite it.
|
|
|
|
Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify.
|
|
|
|
## 3. Surgical Changes
|
|
|
|
**Touch only what you must. Clean up only your own mess.**
|
|
|
|
When editing existing code:
|
|
- Don't "improve" adjacent code, comments, or formatting.
|
|
- Don't refactor things that aren't broken.
|
|
- Match existing style, even if you'd do it differently.
|
|
- If you notice unrelated dead code, mention it - don't delete it.
|
|
|
|
When your changes create orphans:
|
|
- Remove imports/variables/functions that YOUR changes made unused.
|
|
- Don't remove pre-existing dead code unless asked.
|
|
|
|
The test: Every changed line should trace directly to the user's request.
|
|
|
|
## 4. Goal-Driven Execution
|
|
|
|
**Define success criteria. Loop until verified.**
|
|
|
|
Transform tasks into verifiable goals:
|
|
- "Add validation" → "Write tests for invalid inputs, then make them pass"
|
|
- "Fix the bug" → "Write a test that reproduces it, then make it pass"
|
|
- "Refactor X" → "Ensure tests pass before and after"
|
|
|
|
For multi-step tasks, state a brief plan:
|
|
```
|
|
1. [Step] → verify: [check]
|
|
2. [Step] → verify: [check]
|
|
3. [Step] → verify: [check]
|
|
```
|
|
|
|
Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.
|
|
|
|
---
|
|
|
|
**These guidelines are working if:** fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes.
|
|
|
|
<!-- gitnexus:start -->
|
|
# GitNexus — Code Intelligence
|
|
|
|
This project is indexed by GitNexus as **lingma-openai-gateway** (1093 symbols, 2685 relationships, 97 execution flows). Use the GitNexus MCP tools to understand code, assess impact, and navigate safely.
|
|
|
|
> If any GitNexus tool warns the index is stale, run `npx gitnexus analyze` in terminal first.
|
|
|
|
## Always Do
|
|
|
|
- **MUST run impact analysis before editing any symbol.** Before modifying a function, class, or method, run `gitnexus_impact({target: "symbolName", direction: "upstream"})` and report the blast radius (direct callers, affected processes, risk level) to the user.
|
|
- **MUST run `gitnexus_detect_changes()` before committing** to verify your changes only affect expected symbols and execution flows.
|
|
- **MUST warn the user** if impact analysis returns HIGH or CRITICAL risk before proceeding with edits.
|
|
- When exploring unfamiliar code, use `gitnexus_query({query: "concept"})` to find execution flows instead of grepping. It returns process-grouped results ranked by relevance.
|
|
- When you need full context on a specific symbol — callers, callees, which execution flows it participates in — use `gitnexus_context({name: "symbolName"})`.
|
|
|
|
## Never Do
|
|
|
|
- NEVER edit a function, class, or method without first running `gitnexus_impact` on it.
|
|
- NEVER ignore HIGH or CRITICAL risk warnings from impact analysis.
|
|
- NEVER rename symbols with find-and-replace — use `gitnexus_rename` which understands the call graph.
|
|
- NEVER commit changes without running `gitnexus_detect_changes()` to check affected scope.
|
|
|
|
## Resources
|
|
|
|
| Resource | Use for |
|
|
|----------|---------|
|
|
| `gitnexus://repo/lingma-openai-gateway/context` | Codebase overview, check index freshness |
|
|
| `gitnexus://repo/lingma-openai-gateway/clusters` | All functional areas |
|
|
| `gitnexus://repo/lingma-openai-gateway/processes` | All execution flows |
|
|
| `gitnexus://repo/lingma-openai-gateway/process/{name}` | Step-by-step execution trace |
|
|
|
|
## CLI
|
|
|
|
| Task | Read this skill file |
|
|
|------|---------------------|
|
|
| Understand architecture / "How does X work?" | `.claude/skills/gitnexus/gitnexus-exploring/SKILL.md` |
|
|
| Blast radius / "What breaks if I change X?" | `.claude/skills/gitnexus/gitnexus-impact-analysis/SKILL.md` |
|
|
| Trace bugs / "Why is X failing?" | `.claude/skills/gitnexus/gitnexus-debugging/SKILL.md` |
|
|
| Rename / extract / split / refactor | `.claude/skills/gitnexus/gitnexus-refactoring/SKILL.md` |
|
|
| Tools, resources, schema reference | `.claude/skills/gitnexus/gitnexus-guide/SKILL.md` |
|
|
| Index, status, clean, wiki CLI commands | `.claude/skills/gitnexus/gitnexus-cli/SKILL.md` |
|
|
|
|
<!-- gitnexus:end -->
|