prod hardening: admin/metrics authz split, subprocess lifecycle, parallel pool start, HEALTHCHECK
- authz: new ADMIN_TOKEN gates /internal/*; METRICS_PUBLIC=false by default, so /metrics returns 503 when neither METRICS_TOKEN nor API_KEYS is set (previously leaked pool topology). Startup logs loudly if API_KEYS is empty or admin falls back to chat keys.
- lingma_client: keep a Popen handle instead of orphaning Lingma with start_new_session, drain stderr to logger at DEBUG, SIGTERM -> 5s grace -> SIGKILL on shutdown. Fixes the zombie-process leak on container reload.
- pool: asyncio.gather to start N instances concurrently; N=2 pool shaves ~startup_timeout seconds off boot.
- Dockerfile: HEALTHCHECK hits /healthz and greps for pool_ready>0 so Docker / compose orchestrators see "stuck on login" as unhealthy.

Made-with: Cursor
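The lingma_client shutdown sequence described above (SIGTERM, a grace period, then SIGKILL) can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the helper names stop_process and demo are hypothetical, the child here is a plain sleeping Python process rather than Lingma, and stderr draining is left out.

```python
import subprocess
import sys


def stop_process(proc: subprocess.Popen, grace_sec: float = 5.0) -> int:
    """Stop a child process: SIGTERM, wait up to grace_sec, then SIGKILL.

    Hypothetical helper; the 5s default mirrors the grace period named
    in the commit message.
    """
    if proc.poll() is not None:
        # Already exited; poll() has reaped it, so no zombie is left.
        return proc.returncode
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_sec)
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX
        # Always wait after kill so the exit status is collected.
        return proc.wait()


def demo() -> int:
    # Stand-in child that would otherwise run for a minute.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    return stop_process(child, grace_sec=5.0)
```

Keeping the Popen handle is what makes the final wait() possible; a process detached with start_new_session and no retained handle cannot be reaped this way.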
@@ -22,6 +22,8 @@ class Settings:
     port: int
     api_keys: list[str]
     metrics_token: str
+    admin_token: str
+    metrics_public: bool
     log_level: str
     gateway_max_in_flight: int
     gateway_queue_timeout_sec: float
@@ -151,6 +153,8 @@ def load_settings() -> Settings:
         port=int(os.getenv("PORT", "8317")),
         api_keys=api_keys,
         metrics_token=os.getenv("METRICS_TOKEN", "").strip(),
+        admin_token=os.getenv("ADMIN_TOKEN", "").strip(),
+        metrics_public=_bool_env("METRICS_PUBLIC", False),
         log_level=os.getenv("LOG_LEVEL", "INFO").strip() or "INFO",
         gateway_max_in_flight=int(os.getenv("GATEWAY_MAX_IN_FLIGHT", "4")),
         gateway_queue_timeout_sec=float(os.getenv("GATEWAY_QUEUE_TIMEOUT_SEC", "30")),
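The definition of _bool_env and the /metrics gate are not part of the hunks shown here, so the following is only a sketch of how the new settings might be consumed. The names bool_env and metrics_status are hypothetical, the set of truthy strings is an assumption, and so are the 401 response and the rule that METRICS_TOKEN takes precedence over API_KEYS. Only the 503 case (no token and no keys configured, METRICS_PUBLIC false) comes from the commit message.

```python
import hmac
import os
from typing import Optional, Sequence

# Assumed truthy spellings; the real _bool_env may accept a different set.
_TRUTHY = {"1", "true", "yes", "on"}


def bool_env(name: str, default: bool) -> bool:
    """Parse a boolean environment variable, falling back to default."""
    raw = os.getenv(name)
    if raw is None or not raw.strip():
        return default
    return raw.strip().lower() in _TRUTHY


def metrics_status(
    metrics_public: bool,
    metrics_token: str,
    api_keys: Sequence[str],
    presented: Optional[str],
) -> int:
    """Return the HTTP status a /metrics request would get in this sketch."""
    if metrics_public:
        return 200
    # Assumption: a dedicated metrics token wins; otherwise chat keys apply.
    accepted = [metrics_token] if metrics_token else list(api_keys)
    if not accepted:
        # Nothing configured to authenticate against: refuse rather than leak.
        return 503
    if presented and any(
        hmac.compare_digest(presented.encode(), key.encode())
        for key in accepted
    ):
        return 200
    return 401  # assumed status for a missing or wrong credential
```

With the defaults in this hunk (METRICS_PUBLIC unset, METRICS_TOKEN empty) and an empty API_KEYS, metrics_status(False, "", [], None) yields 503, matching the behaviour described in the commit message.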