# Data Pipeline (Acquisition → Transfer → Storage)

This document mirrors the **/data** page on the website and is meant to be versioned and reviewed.

## Scenario A (default)

- Stream **raw 192 kHz audio continuously** while 4G throughput can sustain the raw bitrate (see Transfer).
- If throughput is too low, fall back to an **adaptive strategy** (compression / denoise / clips / features).

## Acquisition (edge)

- Sample rate (recommended): **192 kHz**
- Bit depth: **24-bit**
- Channels: **mono**

### Raw rate (order of magnitude)

Figures use decimal units (1 kB = 1000 bytes).

- 192,000 samples/s × 3 bytes/sample (24-bit) × 1 channel = **576 kB/s** (~0.58 MB/s)
- ~**2.1 GB/hour**
- ~**50 GB/day**

## Transfer (4G)

- Raw 192 kHz mono 24-bit: 576 kB/s × 8 ≈ **4.6 Mbps** (payload only, no protocol overhead)

## Storage

Per station:

- ~50 GB/day
- ~348 GB/week
- ~18 TB/year

## Optimizations (planned as feature branches)

- `feat/data-compression` (FLAC/Zstd, chunking)
- `feat/data-downsample` (192k→48k)
- `feat/data-vad-events` (VAD/events)
- `feat/data-features-only` (FFT/SPL/peaks only)
- `feat/data-retention-policy` (retention + aggregation)
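
The raw-rate, transfer, and storage figures in this document all follow from the three acquisition parameters, so they can be re-derived whenever one of them changes. The sketch below is only an illustration: the function name `raw_rates` and its dictionary keys are hypothetical (not part of the project), and it assumes uncompressed PCM, decimal units (1 GB = 10^9 bytes), and a 365-day year.

```python
# Minimal sketch: derive raw data rates from the acquisition parameters.
# Assumptions (not from the project code): uncompressed PCM, decimal units,
# 365-day year. `raw_rates` is a hypothetical helper name.

SAMPLE_RATE_HZ = 192_000  # recommended sample rate
BIT_DEPTH = 24            # bits per sample
CHANNELS = 1              # mono


def raw_rates(sample_rate_hz=SAMPLE_RATE_HZ, bit_depth=BIT_DEPTH, channels=CHANNELS):
    """Return payload-only data rates for uncompressed PCM audio."""
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    seconds_per_day = 86_400
    return {
        "kB_per_s": bytes_per_second / 1e3,
        "Mbps": bytes_per_second * 8 / 1e6,
        "GB_per_hour": bytes_per_second * 3_600 / 1e9,
        "GB_per_day": bytes_per_second * seconds_per_day / 1e9,
        "GB_per_week": bytes_per_second * seconds_per_day * 7 / 1e9,
        "TB_per_year": bytes_per_second * seconds_per_day * 365 / 1e12,
    }


if __name__ == "__main__":
    for name, value in raw_rates().items():
        print(f"{name}: {value:.2f}")
```

Calling `raw_rates(sample_rate_hz=48_000)` gives a quick estimate for the 192k→48k downsampling branch: the payload drops to a quarter of the raw rate (144 kB/s), before any compression is applied.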