Compare commits
2 Commits
fix/05-inf...auto-iter-

| Author | SHA1 | Date |
|---|---|---|
| | c55700677e | |
| | ba92d68492 | |
@@ -1,34 +1,29 @@
-# QA thresholds - tuned from iteration cron
 usbl:
-  min_points_per_segment: 5  # fewer → degraded
-  max_gap_seconds: 30  # gap > this → split segment
-  mad_sigma: 3.0  # MAD outlier threshold
-  moving_avg_window: 5  # smoothing window
+  min_points_per_segment: 5
+  max_gap_seconds: 30
+  mad_sigma: 3.0
+  moving_avg_window: 5

 ingest:
-  min_video_seconds: 120  # shorter segments skipped
-  max_timestamp_delta_seconds: 60  # EXIF vs USBL match tolerance
+  min_video_seconds: 120
+  max_timestamp_delta_seconds: 60

 frame_extract:
   fps: 1
   width: 518
   height: 294
-  underwater_r_minus_g: 5  # R < G-5 AND R < B-5 → out of water
-  trim_min_frames: 8  # skip if fewer underwater frames
-  bottom_visible_pct_min: 25  # lowered 30→25: GX019817 (29%) recoverable, auto iter 2026-05-12
+  underwater_r_minus_g: 5
+  trim_min_frames: 8
+  bottom_visible_pct_min: 25

 inference:
   ply_conf_threshold: 1.5
   max_frame_num: 1024
   mode: streaming
   keyframe_interval: 1
-  min_frames_for_inference: 32  # fewer frames → RoPE/attention mismatch errors
-  inference_timeout_s: 10800  # 3h (was 7200=2h, GX029818 timed out with 493 frames)
+  min_frames_for_inference: 32
+  inference_timeout_s: 10800
+  offload_to_cpu: false
 align:
-  max_translation_m: 500  # sanity check on alignment
-  min_inlier_ratio: 0.3  # umeyama inlier ratio
+  max_translation_m: 500
+  min_inlier_ratio: 0.3

 stitch:
   voxel_size: 0.05
   icp_max_distance: 0.5
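The `usbl` comments in this hunk describe three rules: split a track when the gap exceeds `max_gap_seconds`, flag a segment as degraded when it has fewer than `min_points_per_segment` points, and reject outliers using `mad_sigma`. The sketch below shows one plausible reading of those rules. The function names, the flat list inputs, and the 1.4826 normal-consistency factor are assumptions for illustration; the pipeline's actual USBL code is not shown in this diff and may differ.

```python
from statistics import median


def split_on_gaps(timestamps, max_gap_seconds=30):
    """Split sorted timestamps (seconds) wherever the gap exceeds the limit."""
    segments, current = [], []
    for t in timestamps:
        if current and t - current[-1] > max_gap_seconds:
            segments.append(current)
            current = []
        current.append(t)
    if current:
        segments.append(current)
    return segments


def is_degraded(segment, min_points_per_segment=5):
    """A segment with too few points is flagged as degraded."""
    return len(segment) < min_points_per_segment


def mad_inliers(values, mad_sigma=3.0):
    """Keep values within mad_sigma scaled MADs of the median.

    The 1.4826 factor (assumed here) makes the MAD comparable to a
    standard deviation for normally distributed data.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No robust spread estimate: keep everything rather than drop it all.
        return list(values)
    limit = mad_sigma * 1.4826 * mad
    return [v for v in values if abs(v - med) <= limit]


if __name__ == "__main__":
    print(split_on_gaps([0, 10, 20, 60, 65]))
    print(mad_inliers([10.0, 10.2, 9.9, 10.1, 25.0]))
```

With the defaults above, a 40-second hole splits the track in two, and both resulting segments would be flagged as degraded because each has fewer than 5 points.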
@@ -56,3 +56,33 @@
 - **Sanity check**: verified via ps + /proc/3874 that demo.py is running on .84 with the correct flags (--mode streaming --keyframe_interval 1 --ply_conf_threshold 1.5 --offload_to_cpu)
 - **Tech watch**: 8 signals (ReefMapGS 9/10, WaterSplat-SLAM 8/10, Sonar-MASt3R 8/10, Degradation-Aware 3DGS 8/10); see `veille/2026-05-12-2246-iter-5.md`
 - **Next suggestion**: add a stage04 status filter in 05_inference (skip segments marked degraded in the DB); evaluate ReefMapGS vs LingBot-Map on a large AUV210 segment; merge PR #8 and #9 after Flag validation
+
+## Iteration 7 - 2026-05-13 10:43 UTC
+- **Signal detected**: 3 distinct causes blocking stage05 on 3 queued segments:
+  1. GX019817 (1357 frames) → RoPE tensor mismatch (size 32 vs 22), probably a conflict with a stale viser_ply.py on .84
+  2. GX029818 (494 frames) → TimeoutExpired 7200s; it was launched while .84 was loaded (viser×4 + 8128MB GPU in use)
+  3. GX029838 (20 frames) → needs a min_frames guard before inference
+- **Patches**:
+  - AUTO-COMMIT c7c4431 : — + (3h)
+  - PR #12 : — pre-flight guard frames_too_few + configurable timeout
+  - DB fix: GX029838 job54 → skipped (frames_too_few=20<32)
+  - DB fix: GX019817 job47 → queued (retry on .87)
+- **Type**: auto-commit (yaml) + Gitea PR #12 (stage code)
+- **Sanity check**: GX029818 inference launched in background, PID 138321→.84 PID 3299076; GPU 13710MB active (11 min after launch)
+- **Tech watch**: 6 signals: Aquatic Neuromorphic OF 9/10, 3DGS AUV Notre-Dame 9/10, MAGS-SLAM 8/10, LingBot-Map 9/10; see
+- **Next suggestion**: validate GX029818/GX029839 results (PLY points > 0); investigate the GX019817 RoPE error on .87; evaluate whether a stale viser_ply.py is the RoPE root cause (kill before run)
+
+## Iteration 7 - 2026-05-13 10:43 UTC
+- **Signal detected**: 3 causes blocking stage05 on queued segments:
+  1. GX019817 (1357 frames) → RoPE tensor mismatch on worker .84 (size 32 vs 22); stale viser_ply.py in RAM
+  2. GX029818 (494 frames) → TimeoutExpired 7200s; .84 was overloaded during the iter-6 run
+  3. GX029838 (20 frames) → no min_frames guard before inference
+- **Patches**:
+  - AUTO-COMMIT c7c4431: thresholds.yaml, min_frames_for_inference=32 + inference_timeout_s=10800
+  - Gitea PR #12: 05_inference.py, pre-flight guard frames_too_few + timeout configurable from yaml
+  - DB fix: GX029838 (job54) → skipped (frames_too_few=20<32)
+  - DB fix: GX019817 (job47) → queued (retry on worker .87)
+- **Type**: auto-commit (yaml) + Gitea PR #12 (stage code)
+- **Sanity check**: GX029818 inference launched in background (PID 138321 on .83, demo.py PID 3299076 on .84); GPU 13710MB active = run confirmed
+- **Tech watch**: 6 signals: Aquatic Neuromorphic OF 9/10, 3DGS AUV Notre-Dame 9/10, MAGS-SLAM 8/10, LingBot-Map updated 5d ago 9/10; see veille/2026-05-13-1043-iter-7.md
+- **Next suggestion**: validate PLY points for GX029818/GX029839; investigate the GX019817 RoPE error on .87; merge PR #12; check whether a stale viser_ply.py is the RoPE root cause
@@ -195,10 +195,9 @@ def run_inference(frames_dir: Path, worker_key: str, mission_name: str,

     print(f" [05] Launching inference on {host}...")
     t0 = time.time()
-    inf_timeout = int(_INF_CFG.get("inference_timeout_s", 10800))
     r = subprocess.run(
         ["ssh", "-o", "StrictHostKeyChecking=no", ssh_target, demo_cmd],
-        capture_output=True, text=True, timeout=inf_timeout,
+        capture_output=True, text=True, timeout=7200,  # 2h max
     )
     elapsed = time.time() - t0
     metrics["inference_s"] = round(elapsed, 1)
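One side of this hunk reads the timeout from configuration (`inference_timeout_s`, default 10800) while the other hard-codes 7200 seconds. The sketch below illustrates the configurable variant together with explicit handling of `subprocess.TimeoutExpired`, the exception the journal reports for GX029818. The helper name `run_with_timeout`, the plain `dict` config, and the returned status dictionary are assumptions for illustration, not the pipeline's actual code.

```python
import subprocess
import sys


def run_with_timeout(cmd, cfg, default_timeout_s=10800):
    """Run cmd with a timeout taken from cfg, falling back to a default."""
    timeout_s = int(cfg.get("inference_timeout_s", default_timeout_s))
    try:
        r = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so nothing leaks here.
        return {"status": "timeout", "timeout_s": timeout_s}
    status = "ok" if r.returncode == 0 else "failed"
    return {"status": status, "returncode": r.returncode, "timeout_s": timeout_s}


if __name__ == "__main__":
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow, {"inference_timeout_s": 1}))
```

Note that when the command is wrapped in `ssh`, as in the hunk above, the timeout stops the local `ssh` client; whether the remote process also stops depends on the remote setup and is not covered by this sketch.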
@@ -266,19 +265,6 @@ def process_frames_dir(frames_dir: Path, worker_key: str, mission_name: str) ->
         if not frames:
             continue
         print(f"\n[05] === {auv_id}/{seg_dir.name}: {len(frames)} frames ===")
-        # Guard: min frames required for model (RoPE/attention)
-        min_frames = int(_INF_CFG.get("min_frames_for_inference", 32))
-        if len(frames) < min_frames:
-            print(f" [05] SKIP {auv_id}/{seg_dir.name}: {len(frames)} frames < {min_frames} min")
-            init_db()
-            with get_conn() as conn_mf:
-                mr = conn_mf.execute("SELECT id FROM missions WHERE name=?", (mission_name,)).fetchone()
-                if mr:
-                    upsert_job(conn_mf, mr["id"], auv_id, seg_dir.name, "05_inference",
-                               status="skipped",
-                               error_msg=f"frames_too_few={len(frames)}<{min_frames}")
-            continue
-
         m = run_inference(seg_dir, worker_key, mission_name, auv_id, seg_dir.name)
         all_metrics.append(m)

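The guard in this hunk skips any segment whose frame count is below `min_frames_for_inference` and records the reason as `frames_too_few=<n><<min>`. A minimal, database-free sketch of that decision follows, using the three segments and frame counts quoted in the iteration 7 journal. The function name `partition_by_frame_count` and the dictionary input are hypothetical; the real code writes the skip to the jobs table through `upsert_job`.

```python
def partition_by_frame_count(segments, cfg, default_min=32):
    """Split segments into runnable names and (name, reason) skip records.

    segments maps a segment name to its frame count (hypothetical input shape).
    """
    min_frames = int(cfg.get("min_frames_for_inference", default_min))
    runnable, skipped = [], []
    for name, n_frames in segments.items():
        if n_frames < min_frames:
            # Same reason string format as the error_msg in the hunk above.
            skipped.append((name, f"frames_too_few={n_frames}<{min_frames}"))
        else:
            runnable.append(name)
    return runnable, skipped


if __name__ == "__main__":
    queued = {"GX019817": 1357, "GX029818": 494, "GX029838": 20}
    runnable, skipped = partition_by_frame_count(queued, {"min_frames_for_inference": 32})
    print(runnable)
    print(skipped)
```

Checking the count before launching avoids spending a remote run on a segment that the journal associates with RoPE/attention mismatch errors.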
21
pipeline/veille/2026-05-13-1043-iter-7.md
Normal file
@@ -0,0 +1,21 @@
+# Tech watch iter-7 - 2026-05-13 10:43 UTC
+
+## Papers / Signals (6 total)
+
+| # | Title | Ref | Score | Relevance to COSMA |
+|---|-------|-----|-------|--------------------|
+| 1 | Aquatic Neuromorphic Optical Flow | arXiv 2605.07653 (5d) | 9/10 | Robust in turbid optics, real-time, lightweight → stage06_align |
+| 2 | MAGS-SLAM: Multi-Agent 3DGS SLAM | arXiv 2605.10760 (2d) | 8/10 | Multi-robot 3DGS SLAM, photometric consistency → future multi-AUV |
+| 3 | AI Platform AUV 3DGS (Notre-Dame) | engineering.nd.edu (5d) | 9/10 | Underwater 3DGS with fuzzy ellipsoids, preloaded AUV navigation |
+| 4 | MV-DUSt3R+ | GitHub facebookresearch (7d) | 8/10 | Fast DUSt3R v2 (2s), comparison baseline for stage05 |
+| 5 | MonST3R | GitHub Junyi42 (ICLR 2025) | 7/10 | Geometry robust to motion/occlusion → segment transitions |
+| 6 | LingBot-Map | GitHub robbyant (5d) | 9/10 | Streaming update; check diff vs the version installed on .84/.87 |
+
+## Active repos (7d)
+- **lingbot-map** (robbyant): last updated 5d ago; compare with the version installed on .84/.87
+- **dust3r / monst3r**: README and weights updates; nothing urgent
+
+## Next recommendations
+1. Evaluate Aquatic Neuromorphic Optical Flow for stage06_align (turbid water)
+2. Benchmark 3DGS (MAGS-SLAM or Notre-Dame) on 1 AUV210 segment
+3. Update lingbot-map on .84/.87 if the diff is significant