Data Pipeline (Acquisition → Transfer → Storage)

This document mirrors the /data page on the website and is meant to be versioned and reviewed.

Scenario A (default)

  • Stream raw 192 kHz audio continuously while 4G throughput is sufficient (about 4.6 Mbps of payload; see Transfer).
  • If throughput is too low for the raw stream, fall back to an adaptive strategy (compression / denoise / clips / features).
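The fallback described above could be sketched as a small selector driven by a measured uplink rate. This is a minimal sketch only: the `Mode` names, the `headroom` factor and the 0.5 / 0.1 cut-offs are illustrative assumptions rather than values from the pipeline, and denoising is left out because it is a processing step, not a transfer mode.

```python
from enum import Enum

# Payload bitrate of the raw stream: 192 kHz x 24-bit x mono (from this document).
RAW_BPS = 192_000 * 24 * 1  # 4_608_000 bit/s


class Mode(Enum):
    RAW = "raw"
    COMPRESSED = "compressed"
    CLIPS = "clips"
    FEATURES = "features"


def choose_mode(uplink_bps: float, headroom: float = 1.5) -> Mode:
    """Pick a transfer mode from a measured uplink rate in bit/s.

    headroom and the 0.5 / 0.1 cut-offs are hypothetical placeholders;
    real thresholds would have to come from field measurements.
    """
    if uplink_bps >= RAW_BPS * headroom:
        return Mode.RAW
    if uplink_bps >= RAW_BPS * 0.5:
        return Mode.COMPRESSED
    if uplink_bps >= RAW_BPS * 0.1:
        return Mode.CLIPS
    return Mode.FEATURES


print(choose_mode(10e6))  # uplink comfortably above the raw bitrate
print(choose_mode(1e6))   # uplink well below the raw bitrate
```

Keeping the decision in one pure function makes the thresholds easy to review and unit-test independently of the modem code.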

Acquisition (edge)

  • Sample rate (recommended): 192 kHz
  • Bit depth: 24-bit
  • Channels: mono

Raw rate (order of magnitude)

  • 192,000 samples/s × 24 bit (3 bytes) × 1 channel = 576 kB/s (~0.58 MB/s, or ~0.55 MiB/s)
  • ~2.1 GB/hour
  • ~50 GB/day (49.8 GB, or ~46 GiB)
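The figures above follow directly from the acquisition parameters. A short sketch that reproduces them, using decimal units (1 GB = 10^9 bytes), which is an assumption about the units intended here:

```python
SAMPLE_RATE_HZ = 192_000
BYTES_PER_SAMPLE = 24 // 8  # 24-bit samples
CHANNELS = 1                # mono

bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
bytes_per_hour = bytes_per_second * 3600
bytes_per_day = bytes_per_hour * 24

print(f"{bytes_per_second / 1e3:.0f} kB/s")    # 576 kB/s
print(f"{bytes_per_hour / 1e9:.2f} GB/hour")   # 2.07 GB/hour
print(f"{bytes_per_day / 1e9:.1f} GB/day")     # 49.8 GB/day
```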

Transfer (4G)

  • Raw 192 kHz mono 24-bit ≈ 4.6 Mbps (192,000 × 24 bit = 4.608 Mbit/s; payload only, excluding protocol overhead)
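Since the figure above is payload only, the uplink actually needed is somewhat higher. A minimal sketch, where `overhead` is a hypothetical fraction for framing and transport headers (this document does not give one):

```python
def stream_mbps(sample_rate_hz: int = 192_000, bit_depth: int = 24,
                channels: int = 1, overhead: float = 0.0) -> float:
    """Required bitrate in Mbit/s; overhead is a fraction of the payload."""
    payload_bps = sample_rate_hz * bit_depth * channels
    return payload_bps * (1.0 + overhead) / 1e6


print(f"{stream_mbps():.3f} Mbps payload")
print(f"{stream_mbps(overhead=0.10):.3f} Mbps with an assumed 10% overhead")
```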

Storage

Per station:

  • ~50 GB/day
  • ~348 GB/week
  • ~18 TB/year
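For sizing more than one station, the per-station figures scale linearly. A small sketch, assuming uncompressed raw storage, decimal units and a 365-day year; `storage_gb` and its `stations` parameter are illustrative names, not part of the pipeline:

```python
BYTES_PER_DAY = 192_000 * 3 * 86_400  # 192 kHz x 3 bytes x seconds per day


def storage_gb(days: int, stations: int = 1) -> float:
    """Raw storage in GB (10^9 bytes) for a number of days and stations."""
    return BYTES_PER_DAY * days * stations / 1e9


for label, days in [("day", 1), ("week", 7), ("year", 365)]:
    print(f"per {label}: {storage_gb(days):,.1f} GB")
```

For example, `storage_gb(365, stations=5)` gives the yearly raw volume for five stations under the same assumptions.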

Optimizations (planned as feature branches)

  • feat/data-compression (FLAC/Zstd, chunking)
  • feat/data-downsample (192k→48k)
  • feat/data-vad-events (VAD/events)
  • feat/data-features-only (FFT/SPL/peaks only)
  • feat/data-retention-policy (retention + aggregation)
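To compare these options against the raw baseline, a rough estimator can help. The sketch below only covers what can be computed from the stated parameters (the 192k→48k downsample); `compression_ratio` is a hypothetical input, since this document does not state what FLAC or Zstd achieve on this audio.

```python
def payload_rate(sample_rate_hz: int, bit_depth: int = 24, channels: int = 1,
                 compression_ratio: float = 1.0) -> dict:
    """Estimate payload rates for one stream.

    compression_ratio = compressed size / raw size (assumed, not measured).
    """
    bytes_per_s = sample_rate_hz * bit_depth // 8 * channels * compression_ratio
    return {
        "kB_per_s": bytes_per_s / 1e3,
        "Mbps": bytes_per_s * 8 / 1e6,
        "GB_per_day": bytes_per_s * 86_400 / 1e9,
    }


raw = payload_rate(192_000)
down = payload_rate(48_000)  # feat/data-downsample target rate
print(raw)
print(down)
```

Downsampling by a factor of 4 cuts every figure by the same factor, at the cost of discarding content above 24 kHz.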