Poulpe 13323f2edf fix: 05_inference — kill stale demo.py + background poll exit viser + offload_to_cpu from yaml

- kill_stale_demo_py() before each segment to prevent GPU contention from orphan processes
- Remote script runs demo.py in background via nohup, polls for PLY file every 30s, kills viser server once PLY written — prevents indefinite SSH block on viser listener
- offload_to_cpu now read from thresholds.yaml[inference] (default false for 24GB VRAM)
- timeout reads inference_timeout_s from yaml (already 10800s)
- min_frames guard included (from fix/05-inference-min-frames-timeout)

Root cause: demo.py starts viser server after writing PLY; SSH timed out → orphan; two orphans competed for GPU with offload_to_cpu → pure CPU inference = 6h+ for 493 frames
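The background-and-poll pattern described above can be sketched as follows. This is a minimal local illustration, not the repository's actual remote script: the function name `run_until_output` and its parameters are hypothetical, and it assumes the output file only appears once it is fully written (a real script might also wait for the file size to stabilise).

```python
import subprocess
import sys
import time
from pathlib import Path


def run_until_output(cmd, output_path, poll_s=30.0, timeout_s=10800.0):
    """Start cmd in the background and stop it once output_path exists.

    Returns True if the file appeared before the timeout, False otherwise.
    """
    output_path = Path(output_path)
    proc = subprocess.Popen(cmd)
    deadline = time.monotonic() + timeout_s
    try:
        while time.monotonic() < deadline:
            if output_path.exists():
                return True
            if proc.poll() is not None:
                # The process exited on its own; report whether it wrote the file.
                return output_path.exists()
            time.sleep(poll_s)
        return False
    finally:
        # Always clean up, so no orphan process keeps holding the GPU.
        if proc.poll() is None:
            proc.terminate()
            try:
                proc.wait(timeout=10)
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()
```

The `finally` block is the important design choice here: whether the file shows up, the timeout expires, or an exception is raised, the child process is terminated, which is the property that prevents the orphan scenario from the root cause.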

2026-05-13 16:41:18 +00:00

app

feat: frame QC scoring + viser per-AUV button

2026-05-11 11:05:37 +00:00

docs

feat: migration to cosma-vm (.83), with dispatcher+dashboard, OpenVPN, and updated infra docs

2026-04-24 00:16:21 +00:00

pipeline

fix: 05_inference — kill stale demo.py + background poll exit viser + offload_to_cpu from yaml

2026-05-13 16:41:18 +00:00

scripts

feat: frame QC scoring + viser per-AUV button

2026-05-11 11:05:37 +00:00

tests

feat: stitch.py --poses trajectory_world.h5 (T_init from world poses, replaces RANSAC)

2026-04-24 10:27:55 +02:00

.gitignore

feat(pipeline): milestones 1-3 (ingest, USBL parse, filter)

2026-05-11 10:25:27 +00:00

CLAUDE.md

docs: create CLAUDE.md with Infrastructure section

2026-04-24 15:56:23 +00:00

CONTEXT.md

dashboard: proper table + viser_url liveness probe + clean CSS

2026-04-22 22:43:14 +00:00

docker-compose.yml

feat: pipeline monitor + orchestrator stats dashboard

2026-05-11 10:55:44 +00:00

Dockerfile

fix: COPY docs/_build/html into the Docker image

2026-04-23 23:59:43 +00:00

pyproject.toml

scaffold — FastAPI + SQLite + HTMX dashboard, ingest + dispatcher

2026-04-21 09:52:41 +00:00

README.md

scaffold — FastAPI + SQLite + HTMX dashboard, ingest + dispatcher

2026-04-21 09:52:41 +00:00

README.md

cosma-qc

COSMA post-acquisition QC pipeline: GoPro-based photogrammetric reconstruction (lingbot-map), a distributed job queue, and a web dashboard for same-day field monitoring.

Objective

After an AUV acquisition (2 GoPros × 2-3 AUVs × hours of recording), quickly determine whether coverage is complete before packing up the mission, without waiting the 30 days that full photogrammetric processing takes.

Pipeline

SSD plugged ─┐
             ├─▶ Ingestion ─▶ Frame extraction (per GoPro × segment)
             │                        │
             │                        ▼
             │                  Job queue (SQLite)
             │                        │
             │         ┌──────────────┼──────────────┐
             ▼         ▼              ▼              ▼
         Dashboard  Worker .87    Worker .84    (scalable)
         (FastAPI)   (3060)       (3090)
             │         │              │
             │         └─▶ PLY ◀──────┘
             │              │
             │              ▼
             └──────── ICP stitch (Open3D) ─▶ viser viewer

Stack

Backend: FastAPI + SQLite
Frontend: HTMX (reactive UI with no JS build step)
Queue: SQLite table + SSH-triggered workers
Monitoring: nvidia-smi polling on .87 / .84, df for disk usage
Reconstruction: lingbot-map (GCT-Stream windowed)
Stitch: Open3D ICP
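As an illustration of the SQLite-backed queue, here is a minimal sketch of how a worker could claim the next pending job atomically. The `jobs` table, its columns, and the function names are assumptions made for this example, not the project's actual schema.

```python
import sqlite3

# Hypothetical schema: one row per (GoPro, segment) reconstruction job.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    id      INTEGER PRIMARY KEY,
    segment TEXT NOT NULL,
    status  TEXT NOT NULL DEFAULT 'pending',
    worker  TEXT
)
"""


def open_queue(path=":memory:"):
    # isolation_level=None disables implicit transactions,
    # so BEGIN/COMMIT below are issued explicitly.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(SCHEMA)
    return conn


def enqueue(conn, segment):
    cur = conn.execute("INSERT INTO jobs (segment) VALUES (?)", (segment,))
    return cur.lastrowid


def claim_next(conn, worker):
    """Mark the oldest pending job as running and return (id, segment), or None."""
    # BEGIN IMMEDIATE takes the write lock up front, so two dispatchers
    # cannot both select the same pending row.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, segment FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE jobs SET status = 'running', worker = ? WHERE id = ?",
                (worker, row[0]),
            )
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

A dispatcher loop would call `claim_next(conn, "w84")` and, if it returns a row, trigger the corresponding worker over SSH.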

Deployment

Service on .82 (stable, Caddy for a clean URL)
Workers: SSH to .87 (3060 12 GB) and .84 (3090 24 GB)
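Polling GPU state on a worker over SSH could look like the sketch below. The helper names `parse_gpu_csv` and `poll_gpu` are hypothetical, and the parser assumes every queried field is numeric; some GPUs report `[N/A]` for certain fields, which this sketch does not handle.

```python
import subprocess

# nvidia-smi query returning one CSV line per GPU, without header or units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.used,memory.total,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Parse the CSV output of QUERY into a list of dicts, one per GPU."""
    gpus = []
    for line in text.strip().splitlines():
        # Split from the right so a comma inside the GPU name would not break parsing.
        name, used, total, util = [f.strip() for f in line.rsplit(",", 3)]
        gpus.append(
            {
                "name": name,
                "mem_used_mib": int(used),
                "mem_total_mib": int(total),
                "util_pct": int(util),
            }
        )
    return gpus


def poll_gpu(host):
    """Run the query on a remote host over SSH (host is an assumed SSH alias)."""
    result = subprocess.run(
        ["ssh", host, *QUERY],
        capture_output=True,
        text=True,
        timeout=15,
        check=True,
    )
    return parse_gpu_csv(result.stdout)
```

Keeping the parser separate from the SSH call means the dashboard logic can be exercised against captured output without a GPU or network access.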

Status

Scaffold in progress.
