init: seed content from candidature + mvp docs
content/docs-mvp/DATA.md
# Data Pipeline (Acquisition → Transfer → Storage)

This document mirrors the **/data** page on the website and is meant to be versioned and reviewed.

## Scenario A (default)
- Stream **raw 192 kHz audio continuously** when 4G throughput is sufficient for the raw stream.
- If throughput is low, use an **adaptive strategy** (compression / denoise / clips / features).
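
The fallback logic above could be sketched as a simple threshold check on measured uplink throughput. This is only an illustration: the `UploadMode` names, the `choose_upload_mode` function, the 1.25 headroom factor, and the 0.5 / 0.1 fallback fractions are all hypothetical placeholders, not values from the project.

```python
from enum import Enum

# Raw payload bitrate: 192 kHz x 24-bit x mono = 4.608 Mbps
RAW_PAYLOAD_MBPS = 192_000 * 24 / 1e6


class UploadMode(Enum):
    RAW = "raw"                # continuous raw audio
    COMPRESSED = "compressed"  # compressed / denoised stream
    CLIPS = "clips"            # selected clips only
    FEATURES = "features"      # extracted features only


def choose_upload_mode(throughput_mbps: float, headroom: float = 1.25) -> UploadMode:
    """Pick an upload mode from a measured uplink throughput in Mbps.

    All thresholds are assumptions made for this sketch.
    """
    if throughput_mbps < 0:
        raise ValueError("throughput_mbps must be non-negative")
    needed = RAW_PAYLOAD_MBPS * headroom  # raw payload plus a safety margin
    if throughput_mbps >= needed:
        return UploadMode.RAW
    if throughput_mbps >= 0.5 * needed:
        return UploadMode.COMPRESSED
    if throughput_mbps >= 0.1 * needed:
        return UploadMode.CLIPS
    return UploadMode.FEATURES
```

In practice the throughput figure would likely be smoothed over a window before switching modes, to avoid flapping between strategies on a noisy 4G link.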

## Acquisition (edge)
- Sample rate (recommended): **192 kHz**
- Bit depth: **24-bit**
- Channels: **mono**

### Raw rate (order of magnitude)
- 192,000 samples/s × 3 bytes (24-bit) × 1 channel = **576 kB/s** (~0.58 MB/s)
- ~**2.07 GB/hour**
- ~**49.8 GB/day**
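
The arithmetic behind these figures can be reproduced with a few lines of Python. This is a minimal sketch using decimal units (1 GB = 10^9 bytes); the function name `raw_bytes_per_second` is illustrative only.

```python
def raw_bytes_per_second(sample_rate_hz: int = 192_000,
                         bit_depth: int = 24,
                         channels: int = 1) -> int:
    """Uncompressed PCM data rate in bytes per second."""
    if bit_depth % 8 != 0:
        raise ValueError("bit_depth must be a whole number of bytes")
    return sample_rate_hz * (bit_depth // 8) * channels


rate = raw_bytes_per_second()
gb_per_hour = rate * 3_600 / 1e9
gb_per_day = rate * 86_400 / 1e9

print(f"{rate / 1e3:.0f} kB/s")       # 576 kB/s
print(f"{gb_per_hour:.2f} GB/hour")   # 2.07 GB/hour
print(f"{gb_per_day:.2f} GB/day")     # 49.77 GB/day
```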

## Transfer (4G)
- Raw 192 kHz mono 24-bit ≈ **4.6 Mbps** (payload only, excluding container and protocol overhead)
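
To relate the payload bitrate to a real link, the sketch below computes the bitrate and the time needed to upload a recorded chunk at a given throughput. The function names and the 10 Mbps example link speed are assumptions for illustration; actual overhead depends on the container and transport, which this document does not specify.

```python
def payload_mbps(sample_rate_hz: int = 192_000,
                 bit_depth: int = 24,
                 channels: int = 1) -> float:
    """Raw PCM payload bitrate in megabits per second (10^6 bits/s)."""
    return sample_rate_hz * bit_depth * channels / 1e6


def upload_seconds(audio_seconds: float, throughput_mbps: float) -> float:
    """Time to upload `audio_seconds` of raw audio over a link of the given speed."""
    if throughput_mbps <= 0:
        raise ValueError("throughput_mbps must be positive")
    return audio_seconds * payload_mbps() / throughput_mbps


# Example: one hour of raw audio over a hypothetical 10 Mbps uplink
print(f"{payload_mbps():.3f} Mbps payload")
print(f"{upload_seconds(3_600, 10.0) / 60:.1f} min to upload one hour")
```

If `upload_seconds(t, throughput)` exceeds `t`, the link cannot keep up with continuous raw streaming and a backlog accumulates.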

## Storage
Per station:
- ~49.8 GB/day
- ~348 GB/week
- ~18 TB/year
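
Because raw storage scales linearly with time and station count, a small helper is enough to project it. This is a sketch in decimal units (1 GB = 10^9 bytes) that prints unrounded values; `storage_gb` and the `stations` parameter are illustrative names, and the number of stations is not given in this document.

```python
# Raw rate: 192,000 samples/s x 3 bytes x 1 channel
RAW_BYTES_PER_SECOND = 192_000 * 3


def storage_gb(days: float, stations: int = 1) -> float:
    """Raw storage in GB for `stations` stations recording continuously for `days` days."""
    return RAW_BYTES_PER_SECOND * 86_400 * days * stations / 1e9


for label, days in (("day", 1), ("week", 7), ("year", 365)):
    print(f"per {label}: {storage_gb(days):,.1f} GB")
```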

## Optimizations (planned as feature branches)
- `feat/data-compression` (FLAC/Zstd, chunking)
- `feat/data-downsample` (192k→48k)
- `feat/data-vad-events` (VAD/events)
- `feat/data-features-only` (FFT/SPL/peaks only)
- `feat/data-retention-policy` (retention + aggregation)
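
As a rough guide to what these branches might save, the sketch below estimates daily volume for a given sample rate and an optional compression ratio. The downsampling figure follows directly from the rate arithmetic; the `compression_ratio` parameter is a hypothetical knob, since real lossless ratios depend heavily on the recorded content and are not stated here.

```python
def estimate_daily_gb(sample_rate_hz: int = 192_000,
                      bit_depth: int = 24,
                      channels: int = 1,
                      compression_ratio: float = 1.0) -> float:
    """Estimated GB/day (decimal) after optional downsampling and compression.

    compression_ratio is an assumed input (2.0 means output is half the size).
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    bytes_per_second = sample_rate_hz * bit_depth / 8 * channels
    return bytes_per_second * 86_400 / compression_ratio / 1e9


baseline = estimate_daily_gb()
downsampled = estimate_daily_gb(sample_rate_hz=48_000)  # 192k -> 48k
print(f"raw 192 kHz:  {baseline:.1f} GB/day")
print(f"48 kHz:       {downsampled:.1f} GB/day ({baseline / downsampled:.0f}x smaller)")
```

Note that downsampling to 48 kHz discards everything above 24 kHz, so it only suits analyses that do not need the ultrasonic band.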