feat: migrate to cosma-vm (.83): dispatcher+dashboard, OpenVPN, infra docs updated

2026-04-24 00:16:21 +00:00
parent c765e8cc40
commit 4eb9f22813
8 changed files with 235 additions and 274 deletions


@@ -47,11 +47,11 @@
<li class="toctree-l1"><a class="reference internal" href="pipeline.html">Pipeline cosma-qc</a></li>
<li class="toctree-l1 current"><a class="current reference internal" href="#">Infrastructure</a><ul>
<li class="toctree-l2"><a class="reference internal" href="#reseau-lan-192-168-0-0-24">LAN network: 192.168.0.0/24</a></li>
<li class="toctree-l2"><a class="reference internal" href="#noeud-cosma-vm-83-serveur-principal-cosma-qc">cosma-vm node (.83): main cosma-qc server</a></li>
<li class="toctree-l2"><a class="reference internal" href="#noeud-core-82">core node (.82)</a></li>
<li class="toctree-l2"><a class="reference internal" href="#noeuds-gpu-workers-84-et-87">GPU worker nodes (.84 and .87)</a></li>
<li class="toctree-l2"><a class="reference internal" href="#noeud-z620-168">z620 node (.168)</a></li>
<li class="toctree-l2"><a class="reference internal" href="#service-systemd-dispatcher">Dispatcher systemd service</a></li>
<li class="toctree-l2"><a class="reference internal" href="#conteneur-docker-dashboard">Dashboard Docker container</a></li>
<li class="toctree-l2"><a class="reference internal" href="#noeud-z620-168-stockage-mp4">z620 node (.168): MP4 storage</a></li>
<li class="toctree-l2"><a class="reference internal" href="#openvpn-acces-collegues">OpenVPN: access for colleagues</a></li>
<li class="toctree-l2"><a class="reference internal" href="#ports-reseau-recapitulatifs">Network ports summary</a></li>
</ul>
</li>
@@ -87,44 +87,62 @@
<h1>Infrastructure<a class="headerlink" href="#infrastructure" title="Link to this heading"></a></h1>
<section id="reseau-lan-192-168-0-0-24">
<h2>LAN network: 192.168.0.0/24<a class="headerlink" href="#reseau-lan-192-168-0-0-24" title="Link to this heading"></a></h2>
<div class="highlight-text notranslate"><div class="highlight"><pre><span></span>┌─────────────────────────────────────────────────────────┐
│ LAN 192.168.0.0/24                                      │
│                                                         │
│ .82  CORE      Dispatcher (systemd) + FastAPI :3849     │
│                Gitea + Grafana + InfluxDB + Caddy       │
│                                                         │
│ .84  ml-stack  GPU worker RTX 3090 24GB                 │
│ .87  gpu       GPU worker RTX 3060 12GB                 │
│ .168 z620      Proxmox host HP Z620                     │
│                SSD → /mnt/portablessd (raw MP4s)        │
└─────────────────────────────────────────────────────────┘
<div class="highlight-text notranslate"><div class="highlight"><pre><span></span>┌───────────────────────────────────────────────────────────────┐
│ LAN 192.168.0.0/24                                            │
│ .82  CORE      Gitea + Grafana + InfluxDB + Caddy             │
│                Reverse proxy → cosma-vm for /cosma-qc         │
│ .83  cosma-vm  cosma-qc dispatcher (systemd)                  │
│                FastAPI dashboard :3849                        │
│                OpenVPN server :1194 UDP                       │
│ .84  ml-stack  GPU worker RTX 3090 24GB                       │
│ .87  gpu       GPU worker RTX 3060 12GB                       │
│                                                               │
│ .168 z620      Proxmox host HP Z620 (raw MP4 SSD)             │
│ .101 poxML     Proxmox hypervisor, hosts cosma-vm (VM 202)    │
└───────────────────────────────────────────────────────────────┘
</pre></div>
</div>
</section>
<section id="noeud-cosma-vm-83-serveur-principal-cosma-qc">
<h2>cosma-vm node (.83): main cosma-qc server<a class="headerlink" href="#noeud-cosma-vm-83-serveur-principal-cosma-qc" title="Link to this heading"></a></h2>
<p>VM #202 on the Proxmox host poxML (.101). 4 vCPU, 4 GB RAM, 40 GB SSD.
User: <code class="docutils literal notranslate"><span class="pre">cosma</span></code> (member of the docker group, with sudo).</p>
<p>Services:</p>
<ul class="simple">
<li><p><strong>Dispatcher</strong>: <code class="docutils literal notranslate"><span class="pre">cosma-qc-dispatcher.service</span></code> (systemd).
Launches the ffmpeg extractions on the GPU workers.</p></li>
<li><p><strong>FastAPI dashboard</strong>: Docker container <code class="docutils literal notranslate"><span class="pre">cosma-qc-app</span></code>, port <strong>3849</strong>.
Reachable through Caddy on core: <code class="docutils literal notranslate"><span class="pre">http://192.168.0.82/cosma-qc</span></code></p></li>
<li><p><strong>OpenVPN</strong>: <code class="docutils literal notranslate"><span class="pre">openvpn-server&#64;cosma.service</span></code>, port 1194 UDP.
Lets colleagues reach the LAN from outside.</p></li>
</ul>
<p>Useful commands:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span><span class="c1"># SSH to cosma-vm</span>
ssh<span class="w"> </span>cosma@192.168.0.83
<span class="c1"># Dispatcher status</span>
sudo<span class="w"> </span>systemctl<span class="w"> </span>status<span class="w"> </span>cosma-qc-dispatcher
<span class="c1"># Dispatcher logs</span>
tail<span class="w"> </span>-f<span class="w"> </span>/home/cosma/cosma-qc-data/dispatcher.log
<span class="c1"># Local dashboard: http://192.168.0.83:3849</span>
</pre></div>
</div>
</section>
<section id="noeud-core-82">
<h2>core node (.82)<a class="headerlink" href="#noeud-core-82" title="Link to this heading"></a></h2>
<p><strong>Role:</strong> central orchestrator of the pipeline.</p>
<p><strong>Role:</strong> router and shared services.</p>
<p>Active services:</p>
<ul class="simple">
<li><p><strong>Dispatcher</strong>: systemd service cosma-qc-dispatcher.
Main loop that dispatches jobs to the GPU workers.</p></li>
<li><p><strong>FastAPI dashboard</strong>: Docker container exposed on port <strong>3849</strong>.
Web interface for monitoring jobs.</p></li>
<li><p><strong>Gitea</strong>: source repository floppyrj45/cosma-qc.</p></li>
<li><p><strong>Caddy</strong>: HTTPS reverse proxy, routes <code class="docutils literal notranslate"><span class="pre">/cosma-qc</span></code> to .83:3849.</p></li>
<li><p><strong>Gitea</strong>: source repository <code class="docutils literal notranslate"><span class="pre">floppyrj45/cosma-qc</span></code>.</p></li>
<li><p><strong>Grafana / InfluxDB</strong>: infrastructure monitoring.</p></li>
</ul>
<p>Useful commands:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span><span class="c1"># Dispatcher status</span>
sudo<span class="w"> </span>systemctl<span class="w"> </span>status<span class="w"> </span>cosma-qc-dispatcher
<span class="c1"># Dispatcher logs in real time</span>
sudo<span class="w"> </span>journalctl<span class="w"> </span>-u<span class="w"> </span>cosma-qc-dispatcher<span class="w"> </span>-f
<span class="c1"># Dashboard: http://192.168.0.82:3849</span>
</pre></div>
</div>
</section>
<section id="noeuds-gpu-workers-84-et-87">
<h2>GPU worker nodes (.84 and .87)<a class="headerlink" href="#noeuds-gpu-workers-84-et-87" title="Link to this heading"></a></h2>
@@ -137,7 +155,7 @@ http://192.168.0.82:3849
</colgroup>
<thead>
<tr class="row-odd"><th class="head"><p>IP</p></th>
<th class="head"><p>Name</p></th>
<th class="head"><p>SSH name</p></th>
<th class="head"><p>GPU</p></th>
<th class="head"><p>VRAM</p></th>
</tr>
@@ -155,69 +173,36 @@ http://192.168.0.82:3849
</tr>
</tbody>
</table>
<p><strong>Role:</strong> runs ffmpeg (frame extraction) and lingbot-map (3D reconstruction).</p>
<p>Working directory on each worker:</p>
<p><strong>Role:</strong> runs ffmpeg and lingbot-map.</p>
<div class="highlight-text notranslate"><div class="highlight"><pre><span></span>/cosma-qc-frames/
├── job_1/
│   ├── frame_000001.jpg … frame_NNNNNN.jpg
│   ├── .video_0.done
│   ├── reconstruction.ply
│   └── reconstruction.glb (generated on demand)
├── job_2/
│   └── …
└── stitch_1.ply
└── job_N/
    ├── frame_NNNNNN.jpg
    ├── .video_V.done
    ├── reconstruction.ply
    └── reconstruction.glb
</pre></div>
</div>
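<p>The layout above lends itself to a quick progress check on a worker. The following is a minimal sketch, not part of cosma-qc: the helper names <code>count_existing</code> and <code>job_summary</code> are hypothetical, and it assumes that each <code>.video_V.done</code> file marks one fully extracted video.</p>

```shell
#!/usr/bin/env bash
# Count how many of the given paths exist. An unmatched glob is passed
# through literally by the shell, so it simply fails the existence test.
count_existing() {
    local n=0 path
    for path in "$@"; do
        if [ -e "$path" ]; then n=$((n + 1)); fi
    done
    echo "$n"
}

# Summarise one job directory: extracted frames, finished videos
# (assumption: one .video_V.done marker per finished video), and
# whether reconstruction.ply has been written.
job_summary() {
    local job_dir="$1" frames videos ply=no
    if [ ! -d "$job_dir" ]; then
        echo "no such job directory: $job_dir" >&2
        return 1
    fi
    frames=$(count_existing "$job_dir"/frame_*.jpg)
    videos=$(count_existing "$job_dir"/.video_*.done)
    if [ -f "$job_dir/reconstruction.ply" ]; then ply=yes; fi
    echo "frames=$frames videos_done=$videos ply=$ply"
}
```

<p>For example, <code>job_summary /cosma-qc-frames/job_1</code> prints a single line of the form <code>frames=… videos_done=… ply=…</code>.</p>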
</section>
<section id="noeud-z620-168">
<h2>z620 node (.168)<a class="headerlink" href="#noeud-z620-168" title="Link to this heading"></a></h2>
<p><strong>Role:</strong> storage for the raw GoPro MP4s.</p>
<ul class="simple">
<li><p>Proxmox host HP Z620.</p></li>
<li><p>SSD mounted at /mnt/portablessd.</p></li>
<li><p>The MP4s <strong>never leave</strong> z620: ffmpeg runs there over SSH.</p></li>
</ul>
<p>SSH access from core:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>ssh<span class="w"> </span>floppyrj45@192.168.0.168
<section id="noeud-z620-168-stockage-mp4">
<h2>z620 node (.168): MP4 storage<a class="headerlink" href="#noeud-z620-168-stockage-mp4" title="Link to this heading"></a></h2>
<p>HP Z620 (Proxmox). The SSD at /mnt/portablessd holds the raw GoPro MP4s.
The MP4s <strong>never leave</strong> z620: ffmpeg runs there through an SSH relay.</p>
<p>SSH relay (from cosma-vm or core):</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>ssh<span class="w"> </span>ml-stack<span class="w"> </span><span class="s2">&quot;ssh z620 </span><span class="se">\&quot;</span><span class="s2">command</span><span class="se">\&quot;</span><span class="s2">&quot;</span>
</pre></div>
</div>
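<p>Nested quoting is the fragile part of the relay, since the command is parsed once by the shell on ml-stack and again by the shell on z620. The sketch below shows one way to build the relayed command safely. The function names are hypothetical, and the host aliases <code>ml-stack</code> and <code>z620</code> are taken from the command above.</p>

```shell
#!/usr/bin/env bash
# Wrap a string in single quotes so that a POSIX shell reads it back as
# exactly one word; embedded single quotes become '\''.
shell_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Command line that the shell on ml-stack must execute to run "$1" on z620.
relay_command() {
    printf 'ssh z620 %s\n' "$(shell_quote "$1")"
}

# Run "$1" on z620 through ml-stack.
run_on_z620() {
    ssh ml-stack "$(relay_command "$1")"
}
```

<p>With these helpers, <code>run_on_z620 'ls /mnt/portablessd'</code> sends <code>ssh z620 'ls /mnt/portablessd'</code> to ml-stack, and commands that contain quotes of their own survive both hops.</p>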
</section>
<section id="service-systemd-dispatcher">
<h2>Dispatcher systemd service<a class="headerlink" href="#service-systemd-dispatcher" title="Link to this heading"></a></h2>
<p>Service file: /etc/systemd/system/cosma-qc-dispatcher.service</p>
<div class="highlight-ini notranslate"><div class="highlight"><pre><span></span><span class="k">[Unit]</span>
<span class="na">Description</span><span class="o">=</span><span class="s">COSMA QC Dispatcher</span>
<span class="na">After</span><span class="o">=</span><span class="s">network.target</span>
<span class="k">[Service]</span>
<span class="na">User</span><span class="o">=</span><span class="s">floppyrj45</span>
<span class="na">WorkingDirectory</span><span class="o">=</span><span class="s">/home/floppyrj45/docker/cosma-qc</span>
<span class="na">ExecStart</span><span class="o">=</span><span class="s">/usr/bin/python3 app/dispatcher.py</span>
<span class="na">Restart</span><span class="o">=</span><span class="s">on-failure</span>
<span class="na">RestartSec</span><span class="o">=</span><span class="s">10</span>
<span class="k">[Install]</span>
<span class="na">WantedBy</span><span class="o">=</span><span class="s">multi-user.target</span>
<section id="openvpn-acces-collegues">
<h2>OpenVPN: access for colleagues<a class="headerlink" href="#openvpn-acces-collegues" title="Link to this heading"></a></h2>
<p>OpenVPN server on cosma-vm (.83:1194 UDP).
External address: <code class="docutils literal notranslate"><span class="pre">laboratoire.freeboxos.fr:1194</span></code></p>
<p>Client profile: <code class="docutils literal notranslate"><span class="pre">cosma-qc-collegue1.ovpn</span></code> (synced via Nextcloud).</p>
<p>Generating a new client:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span><span class="nb">cd</span><span class="w"> </span>/home/cosma/openvpn-ca/easyrsa3
./easyrsa<span class="w"> </span>gen-req<span class="w"> </span>NOMCLIENT<span class="w"> </span>nopass
./easyrsa<span class="w"> </span>sign-req<span class="w"> </span>client<span class="w"> </span>NOMCLIENT
</pre></div>
</div>
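<p>Once the request is signed, the certificate and key still have to be packaged into a client profile. The sketch below assembles an inline profile and is an illustration only. The function names are hypothetical, the file locations assume the default EasyRSA 3 <code>pki/</code> layout, and the directives other than the remote address and UDP port (for example <code>dev tun</code>) are assumptions that must be aligned with the actual server configuration, including any TLS key the server requires.</p>

```shell
#!/usr/bin/env bash
# Print only the PEM block of a certificate or key file; certificates
# issued by easyrsa carry a human-readable dump before the PEM block.
pem_only() {
    sed -n '/-----BEGIN/,/-----END/p' "$1"
}

# Write an inline client profile to stdout.
# $1: PKI directory (assumed EasyRSA 3 layout), $2: client name.
make_ovpn() {
    local pki="$1" name="$2" f
    for f in "$pki/ca.crt" "$pki/issued/$name.crt" "$pki/private/$name.key"; do
        if [ ! -r "$f" ]; then
            echo "missing file: $f" >&2
            return 1
        fi
    done
    # Directives below are assumptions, apart from the remote host and port.
    cat <<EOF
client
dev tun
proto udp
remote laboratoire.freeboxos.fr 1194
nobind
persist-key
persist-tun
remote-cert-tls server
<ca>
$(pem_only "$pki/ca.crt")
</ca>
<cert>
$(pem_only "$pki/issued/$name.crt")
</cert>
<key>
$(pem_only "$pki/private/$name.key")
</key>
EOF
}
```

<p>Assuming the PKI lives in <code>pki/</code> under the easyrsa3 directory, usage would be <code>make_ovpn pki NOMCLIENT &gt; NOMCLIENT.ovpn</code>.</p>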
<p>Management commands:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>sudo<span class="w"> </span>systemctl<span class="w"> </span>start<span class="w"> </span>cosma-qc-dispatcher
sudo<span class="w"> </span>systemctl<span class="w"> </span>stop<span class="w"> </span>cosma-qc-dispatcher
sudo<span class="w"> </span>systemctl<span class="w"> </span>restart<span class="w"> </span>cosma-qc-dispatcher
sudo<span class="w"> </span>systemctl<span class="w"> </span><span class="nb">enable</span><span class="w"> </span>cosma-qc-dispatcher<span class="w"> </span><span class="c1"># start at boot</span>
</pre></div>
</div>
</section>
<section id="conteneur-docker-dashboard">
<h2>Dashboard Docker container<a class="headerlink" href="#conteneur-docker-dashboard" title="Link to this heading"></a></h2>
<p>The FastAPI dashboard runs in a Docker container.</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span><span class="nb">cd</span><span class="w"> </span>/home/floppyrj45/docker/cosma-qc
docker<span class="w"> </span>compose<span class="w"> </span>up<span class="w"> </span>-d<span class="w"> </span><span class="c1"># start</span>
docker<span class="w"> </span>compose<span class="w"> </span>down<span class="w"> </span><span class="c1"># stop</span>
docker<span class="w"> </span>compose<span class="w"> </span>logs<span class="w"> </span>-f<span class="w"> </span><span class="c1"># logs</span>
</pre></div>
</div>
<p>Access: <a class="reference external" href="http://192.168.0.82:3849">http://192.168.0.82:3849</a></p>
</section>
<section id="ports-reseau-recapitulatifs">
<h2>Network ports summary<a class="headerlink" href="#ports-reseau-recapitulatifs" title="Link to this heading"></a></h2>
@@ -234,17 +219,21 @@ docker<span class="w"> </span>compose<span class="w"> </span>logs<span class="w"
</tr>
</thead>
<tbody>
<tr class="row-even"><td><p>.82</p></td>
<tr class="row-even"><td><p>.83</p></td>
<td><p>3849</p></td>
<td><p>Dashboard FastAPI cosma-qc</p></td>
</tr>
<tr class="row-odd"><td><p>.84 / .87</p></td>
<td><p>8100+N</p></td>
<td><p>Viser viewer (reconstruction job N)</p></td>
<tr class="row-odd"><td><p>.83</p></td>
<td><p>1194/udp</p></td>
<td><p>OpenVPN server</p></td>
</tr>
<tr class="row-even"><td><p>.84 / .87</p></td>
<td><p>8100+N</p></td>
<td><p>Viser viewer (job N)</p></td>
</tr>
<tr class="row-odd"><td><p>.84 / .87</p></td>
<td><p>8300</p></td>
<td><p>HTTP server GLB export</p></td>
<td><p>HTTP server GLB/PLY export</p></td>
</tr>
</tbody>
</table>
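<p>The <code>8100+N</code> rule in the table can be written out as a small helper. This is a sketch with hypothetical function names; it assumes the Viser viewer is served over plain HTTP on the worker that holds job N.</p>

```shell
#!/usr/bin/env bash
# Port of the Viser viewer for job N, following the table: 8100 + N.
viser_port() {
    local job_id="$1"
    # Accept only non-negative integers without leading zeros, which the
    # shell's arithmetic would otherwise read as octal.
    case "$job_id" in
        ''|*[!0-9]*|0?*)
            echo "job id must be a non-negative integer" >&2
            return 1
            ;;
    esac
    echo $((8100 + job_id))
}

# Viewer URL on a given worker (plain HTTP is an assumption).
viser_url() {
    local worker_ip="$1" job_id="$2" port
    port="$(viser_port "$job_id")" || return 1
    echo "http://${worker_ip}:${port}"
}
```

<p>For job 12 on ml-stack, <code>viser_url 192.168.0.84 12</code> gives <code>http://192.168.0.84:8112</code>.</p>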