# Data Pipeline (Acquisition → Transfer → Storage)

This document mirrors the **/data** page on the website and is meant to be versioned and reviewed.
## Scenario A (default)

- Stream **raw 192 kHz audio continuously** while 4G throughput can sustain it (about 4.6 Mbps of payload).
- If throughput is too low, switch to an **adaptive strategy** (compression / denoise / clips / features).
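The switch between the two modes can be sketched as a simple threshold check. This is a minimal illustration, not the project's implementation: `Mode`, `choose_mode`, and the `headroom` factor are hypothetical names and values. Only the 192 kHz / 24-bit / mono stream parameters come from this document.

```python
from enum import Enum

# Payload bitrate of the raw stream: 192 kHz x 24 bits x 1 channel.
RAW_PAYLOAD_MBPS = 192_000 * 24 * 1 / 1e6  # 4.608 Mbps


class Mode(Enum):
    RAW = "raw"            # continuous raw 192 kHz audio
    ADAPTIVE = "adaptive"  # compression / denoise / clips / features


def choose_mode(measured_mbps: float, headroom: float = 1.25) -> Mode:
    """Pick the upload mode from a measured uplink throughput in Mbps.

    `headroom` is an assumed safety factor (protocol overhead, short
    throughput dips); the value 1.25 is a placeholder, not a project setting.
    """
    if measured_mbps >= RAW_PAYLOAD_MBPS * headroom:
        return Mode.RAW
    return Mode.ADAPTIVE


if __name__ == "__main__":
    for mbps in (10.0, 5.0, 2.0):
        print(mbps, choose_mode(mbps).value)
```

With the placeholder headroom of 1.25, the raw mode needs at least 5.76 Mbps of measured throughput, so a 5 Mbps link would already fall back to the adaptive strategy.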
## Acquisition (edge)

- Sample rate (recommended): **192 kHz**
- Bit depth: **24-bit**
- Channels: **mono**
### Raw rate (order of magnitude)

- 192,000 samples/s × 24-bit (3 bytes) × 1 channel = **576 kB/s** (~0.58 MB/s)
- ~**2.1 GB/hour**
- ~**50 GB/day**
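The raw-rate arithmetic can be reproduced with a few lines of Python. This is a sketch assuming decimal units (1 kB = 1000 B, 1 GB = 10^9 B); `AcquisitionConfig` is a hypothetical helper, not an existing class in the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcquisitionConfig:
    """Acquisition parameters listed above (hypothetical container)."""
    sample_rate_hz: int = 192_000
    bit_depth: int = 24
    channels: int = 1

    @property
    def bytes_per_second(self) -> int:
        # samples/s x bytes per sample x channels
        return self.sample_rate_hz * (self.bit_depth // 8) * self.channels


cfg = AcquisitionConfig()
per_second = cfg.bytes_per_second  # bytes per second
per_hour = per_second * 3600       # bytes per hour
per_day = per_hour * 24            # bytes per day

print(f"{per_second / 1e3:.0f} kB/s")    # 576 kB/s
print(f"{per_hour / 1e9:.2f} GB/hour")   # 2.07 GB/hour
print(f"{per_day / 1e9:.2f} GB/day")     # 49.77 GB/day
```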
## Transfer (4G)

- Raw 192 kHz mono 24-bit ≈ **4.6 Mbps** (payload only, excluding protocol overhead)
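To budget for protocol overhead on top of the payload figure, a small helper can scale the payload bitrate. This is only a sketch: `required_mbps` is a hypothetical function, and the 10% overhead used in the example is an assumed placeholder, since the actual overhead depends on the transport and is not specified here.

```python
def required_mbps(
    sample_rate_hz: int = 192_000,
    bit_depth: int = 24,
    channels: int = 1,
    overhead_fraction: float = 0.0,
) -> float:
    """Return the uplink bitrate in Mbps (1 Mbps = 10^6 bit/s).

    `overhead_fraction` is the assumed extra share for headers,
    retransmissions, etc. (0.0 means payload only).
    """
    payload_bps = sample_rate_hz * bit_depth * channels
    return payload_bps * (1.0 + overhead_fraction) / 1e6


print(f"{required_mbps():.2f} Mbps")                        # 4.61 Mbps (payload only)
print(f"{required_mbps(overhead_fraction=0.10):.2f} Mbps")  # 5.07 Mbps (assumed 10% overhead)
```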
## Storage

Per station (raw stream):

- ~50 GB/day
- ~348 GB/week
- ~18 TB/year
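For capacity planning, the per-station figures can be extended to several stations or turned into a "days until full" estimate. The sketch below assumes decimal units and continuous raw recording; `storage_bytes`, `days_until_full`, and the 4 TB disk in the example are hypothetical.

```python
# Raw stream: 192 kHz x 3 bytes (24-bit) x mono, over 86,400 seconds.
BYTES_PER_DAY = 192_000 * 3 * 1 * 86_400  # 49_766_400_000 bytes


def storage_bytes(days: int, stations: int = 1) -> int:
    """Bytes needed to keep `days` of raw audio from `stations` stations."""
    return BYTES_PER_DAY * days * stations


def days_until_full(capacity_bytes: float, stations: int = 1) -> float:
    """Days of continuous raw recording that fit into `capacity_bytes`."""
    return capacity_bytes / (BYTES_PER_DAY * stations)


print(f"{storage_bytes(7) / 1e9:.1f} GB/week")     # 348.4 GB/week
print(f"{storage_bytes(365) / 1e12:.2f} TB/year")  # 18.16 TB/year
print(f"{days_until_full(4e12):.1f} days on an assumed 4 TB disk")
```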
## Optimizations (planned as feature branches)

- `feat/data-compression` (FLAC/Zstd, chunking)
- `feat/data-downsample` (192 kHz → 48 kHz)
- `feat/data-vad-events` (VAD/events)
- `feat/data-features-only` (FFT/SPL/peaks only)
- `feat/data-retention-policy` (retention + aggregation)
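As a rough guide to what the downsampling branch would save, the daily volume can be recomputed for a 48 kHz stream. This is an illustrative estimate only: `daily_gb` is a hypothetical helper, and `compression_ratio` is an assumed parameter (compressed size divided by raw size) with no measured value behind it, since real FLAC or Zstd ratios depend on the audio content.

```python
def daily_gb(
    sample_rate_hz: int,
    bit_depth: int = 24,
    channels: int = 1,
    compression_ratio: float = 1.0,
) -> float:
    """Estimated volume per day in GB (10^9 bytes).

    `compression_ratio` is compressed size / raw size; 1.0 means
    uncompressed. Any other value is an assumption to be measured.
    """
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    return bytes_per_second * 86_400 * compression_ratio / 1e9


raw = daily_gb(192_000)         # raw 192 kHz stream
downsampled = daily_gb(48_000)  # after 192 kHz -> 48 kHz downsampling

print(f"raw:         {raw:.2f} GB/day")          # 49.77 GB/day
print(f"downsampled: {downsampled:.2f} GB/day")  # 12.44 GB/day
print(f"reduction:   {raw / downsampled:.0f}x")  # 4x
```

Downsampling by a factor of four cuts the volume by the same factor, but it also discards everything above 24 kHz, so it only suits analyses that do not need the ultrasonic band.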