prod hardening: admin/metrics authz split, subprocess lifecycle, parallel pool start, HEALTHCHECK

- authz: new ADMIN_TOKEN gates /internal/*; METRICS_PUBLIC=false by default, so
  /metrics returns 503 when neither METRICS_TOKEN nor API_KEYS is set
  (previously leaked pool topology). Startup logs loudly if API_KEYS is empty
  or admin falls back to chat keys.
- lingma_client: keep a Popen handle instead of orphaning Lingma with
  start_new_session, drain stderr to logger at DEBUG, SIGTERM -> 5s grace ->
  SIGKILL on shutdown. Fixes the zombie-process leak on container reload.
- pool: start the N instances concurrently via asyncio.gather; an N=2 pool
  boots up to ~startup_timeout seconds sooner.
- Dockerfile: HEALTHCHECK hits /healthz and greps for pool_ready>0 so Docker
  / compose orchestrators see "stuck on login" as unhealthy.
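
The SIGTERM -> grace -> SIGKILL sequence from the lingma_client item can be sketched as below. This is a minimal illustration that assumes a plain subprocess.Popen handle; the helper name stop_process and its grace_seconds parameter are hypothetical and are not taken from the commit.

```python
import subprocess


def stop_process(proc: subprocess.Popen, grace_seconds: float = 5.0) -> int:
    """Ask the child to exit, then force-kill it if the grace period expires.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if proc.poll() is not None:
        # The child has already exited and been reaped; nothing to do.
        return proc.returncode

    proc.terminate()  # sends SIGTERM on POSIX
    try:
        # Give the child up to grace_seconds to shut down on its own.
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX
        # Wait again so the exit status is collected and no zombie remains.
        return proc.wait()
```

Holding on to the Popen handle is what makes the final wait() possible: without it the parent has no way to reap the child, which is the leak the commit message describes.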

Made-with: Cursor
GitHub Actions
2026-04-18 10:22:13 +08:00
parent 3130533888
commit 2febc37c2c
8 changed files with 248 additions and 28 deletions


@@ -2,10 +2,14 @@
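 HOST=0.0.0.0
 # Gateway listening port
 PORT=8317
-# API key; multiple keys may be configured, comma-separated
+# API key; multiple keys may be configured, comma-separated. Empty = no authentication (a warning is logged at startup; local dev only)
 API_KEYS=sk-your-api-key
-# Dedicated auth token for /metrics; if left empty, falls back to API_KEYS for access; if API_KEYS is not configured either, /metrics is public
+# Dedicated auth token for /metrics; if left empty, falls back to API_KEYS for access; if API_KEYS is also empty, /metrics returns 503 by default
 METRICS_TOKEN=
+# Explicitly make /metrics public (use only when the scraper runs on a private network)
+METRICS_PUBLIC=false
+# Dedicated admin token for /internal/*; if left empty, falls back to API_KEYS. Setting a separate token in production is strongly recommended
+ADMIN_TOKEN=
 # Log level: DEBUG / INFO / WARNING / ERROR
 LOG_LEVEL=INFO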
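
The /metrics rules described by these settings can be read as a small decision function. The sketch below is only an illustration of that reading: the function name metrics_status is hypothetical, the 401 for a wrong or missing credential is an assumption (the commit only states the 503 case), and so is the choice that API_KEYS stop being accepted once METRICS_TOKEN is set.

```python
import hmac
from typing import Optional, Sequence


def metrics_status(
    presented: Optional[str],
    *,
    metrics_token: str = "",
    api_keys: Sequence[str] = (),
    metrics_public: bool = False,
) -> int:
    """Return the HTTP status a /metrics request would get (hypothetical helper)."""
    if metrics_public:
        return 200  # METRICS_PUBLIC=true: no credential needed

    # A dedicated token wins; otherwise fall back to the chat API keys.
    # Assumption: API keys are not accepted once METRICS_TOKEN is set.
    accepted = [metrics_token] if metrics_token else [k for k in api_keys if k]

    if not accepted:
        return 503  # nothing configured and not explicitly public

    if presented is not None and any(
        # Constant-time comparison on bytes avoids leaking match position.
        hmac.compare_digest(presented.encode(), key.encode())
        for key in accepted
    ):
        return 200

    return 401  # assumption: the commit does not name this status code
```

With the defaults shown in the diff (METRICS_TOKEN empty, METRICS_PUBLIC=false), the outcome depends entirely on whether API_KEYS is set: any configured key is accepted, and an empty API_KEYS yields 503.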