prod hardening: admin/metrics authz split, subprocess lifecycle, parallel pool start, HEALTHCHECK
- authz: new ADMIN_TOKEN gates /internal/*; METRICS_PUBLIC=false by default, so /metrics returns 503 when neither METRICS_TOKEN nor API_KEYS is set (previously it leaked pool topology). Startup logs loudly if API_KEYS is empty or admin falls back to chat keys.
- lingma_client: keep a Popen handle instead of orphaning Lingma with start_new_session, drain stderr to logger at DEBUG, SIGTERM -> 5s grace -> SIGKILL on shutdown. Fixes the zombie-process leak on container reload.
- pool: asyncio.gather to start N instances concurrently; for an N=2 pool this shaves ~startup_timeout seconds off boot.
- Dockerfile: HEALTHCHECK hits /healthz and fails unless ok is true (at least one instance ready), so Docker / compose orchestrators see "stuck on login" as unhealthy.

Made-with: Cursor
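The lingma_client and pool items can be illustrated with a minimal sketch. It is not the code from this commit: `stop_child`, `start_instance`, `start_pool`, and the `stats` dict are hypothetical names, and `asyncio.sleep` stands in for the real per-instance startup. It only shows the two patterns the message names: SIGTERM, a 5s grace period, then SIGKILL with the child always reaped, and concurrent startup through `asyncio.gather`.

```python
import asyncio
import subprocess
import sys

GRACE_SECONDS = 5.0  # grace period named in the commit message


def stop_child(proc: subprocess.Popen, grace: float = GRACE_SECONDS) -> int:
    """SIGTERM, wait up to `grace` seconds, then SIGKILL; always reap the child."""
    if proc.poll() is None:
        proc.terminate()          # SIGTERM on POSIX
        try:
            proc.wait(timeout=grace)
        except subprocess.TimeoutExpired:
            proc.kill()           # SIGKILL on POSIX
            proc.wait()           # reap so no zombie is left behind
    return proc.returncode


async def start_instance(index: int, delay: float, stats: dict) -> str:
    """Hypothetical stand-in for one instance's startup work."""
    stats["active"] += 1
    stats["peak"] = max(stats["peak"], stats["active"])
    await asyncio.sleep(delay)    # placeholder for login / readiness wait
    stats["active"] -= 1
    return f"instance-{index}"


async def start_pool(n: int, delay: float, stats: dict) -> list:
    # gather runs every startup concurrently; results keep argument order.
    return await asyncio.gather(
        *(start_instance(i, delay, stats) for i in range(n))
    )


if __name__ == "__main__":
    stats = {"active": 0, "peak": 0}
    print(asyncio.run(start_pool(2, 0.1, stats)), "peak concurrency:", stats["peak"])
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print("child exit code:", stop_child(child))
```

Holding the `Popen` handle is what makes the final `wait()` possible; a child detached with `start_new_session` and then forgotten cannot be reaped this way.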
Dockerfile
@@ -17,4 +17,15 @@ COPY app /app/app

EXPOSE 8317

# Container-level health signal. Docker Compose / orchestrators rely on this
# to stop sending traffic when the pool is wedged, restart unhealthy replicas,
# and drive rolling deploys. /healthz returns ok=true only when at least one
# Lingma instance is in state=ready, so it catches the "stuck on login" case
# that a raw TCP probe would miss.
HEALTHCHECK --interval=30s --timeout=5s --start-period=60s --retries=3 \
    CMD python -c "import os,json,urllib.request,sys; \
    port=os.environ.get('PORT','8317'); \
    r=urllib.request.urlopen(f'http://127.0.0.1:{port}/healthz', timeout=3); \
    sys.exit(0 if json.load(r).get('ok') else 1)" || exit 1

CMD ["sh", "-c", "python /app/app/bootstrap_lingma.py && uvicorn app.main:app --host ${HOST:-0.0.0.0} --port ${PORT:-8317}"]
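For readability, the inline HEALTHCHECK one-liner can be expanded into a small standalone function. This is a sketch under assumptions, not a file in this commit: the name `pool_is_healthy` and its `base_url` parameter are hypothetical, and the only thing taken from the diff is that `/healthz` returns JSON with an `ok` field. Unlike the one-liner, it treats connection failures and malformed responses as unhealthy explicitly rather than relying on an uncaught exception.

```python
import http.client
import json
import os
import sys
import urllib.request
from typing import Optional


def pool_is_healthy(base_url: Optional[str] = None, timeout: float = 3.0) -> bool:
    """Return True only if GET /healthz answers with JSON whose 'ok' is truthy."""
    if base_url is None:
        # Same default as the Dockerfile: PORT env var, falling back to 8317.
        port = os.environ.get("PORT", "8317")
        base_url = f"http://127.0.0.1:{port}"
    try:
        with urllib.request.urlopen(f"{base_url}/healthz", timeout=timeout) as resp:
            body = json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        # URLError/HTTPError and timeouts are OSError; bad JSON is ValueError.
        return False
    return isinstance(body, dict) and bool(body.get("ok"))


if __name__ == "__main__":
    # Exit code 0 means healthy, 1 means unhealthy, matching Docker's convention.
    sys.exit(0 if pool_is_healthy() else 1)
```

The 3-second request timeout is kept below the HEALTHCHECK `--timeout=5s`, so the probe reports a failure itself before Docker kills it.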