2026-04-16 01:19:53 +00:00

LingBot-Map: Geometric Context Transformer for Streaming 3D Reconstruction


Quick Start

Installation

1. Create conda environment

conda create -n lingbot-map python=3.10 -y
conda activate lingbot-map

2. Install PyTorch (CUDA 12.8)

pip install torch==2.9.1 torchvision==0.24.1 --index-url https://download.pytorch.org/whl/cu128

For other CUDA versions, see PyTorch Get Started.

3. Install lingbot-map

pip install -e .

4. Install FlashInfer (recommended)

FlashInfer provides paged KV cache attention for efficient streaming inference:

# CUDA 12.8 + PyTorch 2.9
pip install flashinfer-python -i https://flashinfer.ai/whl/cu128/torch2.9/

For other CUDA/PyTorch combinations, see FlashInfer installation. If FlashInfer is not installed, use the SDPA fallback (PyTorch native attention) by passing --use_sdpa.
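As a rough illustration of the fallback described above, the following sketch checks whether FlashInfer can be imported and, if not, returns the --use_sdpa flag to append to a demo.py command. The helper name attention_flags is hypothetical and not part of this repository; it also assumes the flashinfer-python package exposes a top-level module named flashinfer.

```python
import importlib.util


def attention_flags(module_name: str = "flashinfer") -> list[str]:
    """Return extra demo.py flags based on whether FlashInfer is importable.

    Hypothetical helper for illustration; `module_name` is assumed to be the
    import name provided by the flashinfer-python package.
    """
    # find_spec returns None when the top-level module cannot be located.
    if importlib.util.find_spec(module_name) is None:
        return ["--use_sdpa"]
    return []


if __name__ == "__main__":
    # Prints ['--use_sdpa'] on machines without FlashInfer, [] otherwise.
    print(attention_flags())
```

The returned list can be appended to the argument list of any of the demo commands in the Demo section.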

5. Visualization dependencies (optional)

pip install -e ".[vis]"
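After installation, it can help to confirm which versions actually landed in the environment. The sketch below uses only the Python standard library and reads package metadata without importing the packages themselves. The package names are taken from the pip commands above; the helper name installed_versions is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as they appear in the pip commands above.
PACKAGES = ["torch", "torchvision", "flashinfer-python"]


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in the active environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Run it inside the activated lingbot-map conda environment. Wheels installed from the PyTorch index may report a version with a local suffix such as +cu128.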

Demo

Streaming Inference from Images

python demo.py --model_path /path/to/checkpoint.pt \
    --image_folder /path/to/images/

Streaming Inference from Video

python demo.py --model_path /path/to/checkpoint.pt \
    --video_path video.mp4 --fps 10

Streaming with Keyframe Interval

Use --keyframe_interval to reduce KV cache memory by keeping only every N-th frame as a keyframe. Non-keyframes still produce predictions but are not stored in the cache. This is useful for long sequences that exceed 320 frames.

python demo.py --model_path /path/to/checkpoint.pt \
    --image_folder /path/to/images/ --keyframe_interval 6
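To get a feel for how much the keyframe interval shrinks the cache, the sketch below lists which frame indices would be retained for a given interval. It assumes the first frame of each group of N is the one kept, which this README does not specify; the function keyframe_indices is hypothetical and is not taken from the lingbot-map code.

```python
def keyframe_indices(num_frames: int, interval: int) -> list[int]:
    """Indices of frames kept in the cache when every `interval`-th frame is a keyframe.

    Assumption for illustration: frame 0 is a keyframe, then every
    `interval`-th frame after it.
    """
    if interval < 1:
        raise ValueError("interval must be >= 1")
    return list(range(0, num_frames, interval))


if __name__ == "__main__":
    cached = keyframe_indices(num_frames=600, interval=6)
    # 600 input frames, but only every 6th one is stored in the cache.
    print(f"{len(cached)} of 600 frames cached")
```

Under this assumption, an interval of 6 stores roughly one sixth of the frames, while every frame still receives a prediction.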

Windowed Inference (for long sequences, >3000 frames)

python demo.py --model_path /path/to/checkpoint.pt \
    --video_path video.mp4 --fps 10 \
    --mode windowed --window_size 64
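The sequence-length guidance in this section (keyframe interval beyond 320 frames, windowed mode beyond 3000 frames) can be summarized as a small helper that suggests flags for a given frame count. This is only a sketch: the function suggest_flags is hypothetical, the thresholds are read from the headings and text above rather than from the code, and the values 6 and 64 are simply the ones used in the example commands.

```python
def suggest_flags(num_frames: int) -> list[str]:
    """Suggest extra demo.py flags for a sequence of `num_frames` frames.

    Thresholds follow the README guidance and are not hard limits.
    """
    if num_frames > 3000:
        # Very long sequences: windowed inference, as in the example above.
        return ["--mode", "windowed", "--window_size", "64"]
    if num_frames > 320:
        # Long sequences: keep only every 6th frame in the KV cache.
        return ["--keyframe_interval", "6"]
    # Short sequences: default streaming mode, no extra flags.
    return []


if __name__ == "__main__":
    for n in (100, 1000, 5000):
        print(n, " ".join(suggest_flags(n)) or "(default streaming)")
```

The right settings also depend on available GPU memory, so treat the output as a starting point rather than a rule.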

With Sky Masking

python demo.py --model_path /path/to/checkpoint.pt \
    --image_folder /path/to/images/ --mask_sky

Without FlashInfer (SDPA fallback)

python demo.py --model_path /path/to/checkpoint.pt \
    --image_folder /path/to/images/ --use_sdpa

Model Download

Model Name     Hugging Face Repository     Description
lingbot-map    robbyant/lingbot-map        Base model checkpoint

License

This project is released under the Apache License 2.0. See LICENSE file for details.

Citation

@article{lingbot-map2026,
  title={},
  author={},
  journal={arXiv preprint arXiv:},
  year={2026}
}

Acknowledgments

This work builds upon several excellent open-source projects:


Description
Mirror of HF robbyant/lingbot-map — model checkpoints (LFS)