mirror of
https://huggingface.co/robbyant/lingbot-map
synced 2026-04-25 23:12:48 +00:00
Upload folder using huggingface_hub
.gitattributes (vendored, +1)
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+assets/teaser.png filter=lfs diff=lfs merge=lfs -text
README.md (+146)
@@ -1,3 +1,143 @@
---
license: apache-2.0
---

<h1 align="center">LingBot-Map: Geometric Context Transformer for Streaming 3D Reconstruction</h1>

<p align="center">
<a href="lingbot-map_paper.pdf"><img src="https://img.shields.io/static/v1?label=Paper&message=PDF&color=red&logo=arxiv"></a>
<a href="https://technology.robbyant.com/lingbot-map"><img src="https://img.shields.io/badge/Project-Website-blue"></a>
<a href=""><img src="https://img.shields.io/static/v1?label=%F0%9F%A4%97%20Model&message=HuggingFace&color=orange"></a>
<a href="LICENSE.txt"><img src="https://img.shields.io/badge/License-Apache--2.0-green"></a>
</p>

<p align="center">
<img src="assets/teaser.png" width="100%">
</p>

<p align="center">
<video src="https://gw.alipayobjects.com/v/huamei_vaouhm/afts/video/q0sdTr9Mm6IAAAAAmyAAAAgADglFAQJr" width="100%" autoplay loop muted playsinline></video>
</p>

---

# Quick Start

## Installation

**1. Create conda environment**

```bash
conda create -n lingbot-map python=3.10 -y
conda activate lingbot-map
```

**2. Install PyTorch (CUDA 12.8)**

```bash
pip install torch==2.9.1 torchvision==0.24.1 --index-url https://download.pytorch.org/whl/cu128
```

> For other CUDA versions, see [PyTorch Get Started](https://pytorch.org/get-started/locally/).

**3. Install lingbot-map**

```bash
pip install -e .
```

**4. Install FlashInfer (recommended)**

FlashInfer provides paged KV cache attention for efficient streaming inference:

```bash
# CUDA 12.8 + PyTorch 2.9
pip install flashinfer-python -i https://flashinfer.ai/whl/cu128/torch2.9/
```

> For other CUDA/PyTorch combinations, see [FlashInfer installation](https://docs.flashinfer.ai/installation.html).
> If FlashInfer is not installed, the model falls back to SDPA (PyTorch native attention) via `--use_sdpa`.
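
The fallback behavior described above can be sketched as follows. This is an illustrative snippet, not the repository's actual code; the function name `pick_attention_backend` is an assumption made for the example:

```python
# Illustrative sketch (not the repository's actual code): prefer FlashInfer's
# paged KV-cache attention when it is importable, otherwise fall back to
# PyTorch's native scaled_dot_product_attention (SDPA), mirroring --use_sdpa.
def pick_attention_backend(force_sdpa: bool = False) -> str:
    if force_sdpa:
        return "sdpa"
    try:
        import flashinfer  # noqa: F401
        return "flashinfer"
    except ImportError:
        return "sdpa"

print(pick_attention_backend(force_sdpa=True))  # sdpa
```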

**5. Visualization dependencies (optional)**

```bash
pip install -e ".[vis]"
```

# Demo

## Streaming Inference from Images

```bash
python demo.py --model_path /path/to/checkpoint.pt \
    --image_folder /path/to/images/
```

## Streaming Inference from Video

```bash
python demo.py --model_path /path/to/checkpoint.pt \
    --video_path video.mp4 --fps 10
```

## Streaming with Keyframe Interval

Use `--keyframe_interval` to reduce KV cache memory by keeping only every N-th frame as a keyframe. Non-keyframe frames still produce predictions but are not stored in the cache. This is useful for long sequences exceeding 320 frames.

```bash
python demo.py --model_path /path/to/checkpoint.pt \
    --image_folder /path/to/images/ --keyframe_interval 6
```
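
To get intuition for the saving, here is some back-of-the-envelope arithmetic. The helper `cached_frames` is illustrative only, not part of the codebase:

```python
# Illustrative arithmetic: with --keyframe_interval N, only frames
# 0, N, 2N, ... are kept in the KV cache, so cache size shrinks by ~N.
def cached_frames(num_frames: int, keyframe_interval: int) -> int:
    return len(range(0, num_frames, keyframe_interval))

print(cached_frames(1200, 6))  # 200 keyframes cached instead of 1200
```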

## Windowed Inference (for long sequences, >3000 frames)

```bash
python demo.py --model_path /path/to/checkpoint.pt \
    --video_path video.mp4 --fps 10 \
    --mode windowed --window_size 64
```
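
Conceptually, windowed mode bounds memory by processing the sequence in fixed-size chunks. The sketch below only illustrates the partitioning implied by `--window_size`; it is an assumption about the mode's behavior, not code from `demo.py`:

```python
# Illustrative only: split a long sequence into non-overlapping windows of
# --window_size frames; the last window may be shorter.
def frame_windows(num_frames: int, window_size: int) -> list[tuple[int, int]]:
    return [(start, min(start + window_size, num_frames))
            for start in range(0, num_frames, window_size)]

print(frame_windows(150, 64))  # [(0, 64), (64, 128), (128, 150)]
```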

## With Sky Masking

```bash
python demo.py --model_path /path/to/checkpoint.pt \
    --image_folder /path/to/images/ --mask_sky
```

## Without FlashInfer (SDPA fallback)

```bash
python demo.py --model_path /path/to/checkpoint.pt \
    --image_folder /path/to/images/ --use_sdpa
```

# Model Download

<!-- TODO: fill in model checkpoints -->

| Model Name | Huggingface Repository | Description |
| :--- | :--- | :--- |
| lingbot-map | | Base model checkpoint |

# License

This project is released under the Apache License 2.0. See the [LICENSE](LICENSE.txt) file for details.

# Citation

```bibtex
@article{lingbot-map2026,
  title={},
  author={},
  journal={arXiv preprint arXiv:},
  year={2026}
}
```

# Acknowledgments

This work builds upon several excellent open-source projects:

- [VGGT](https://github.com/facebookresearch/vggt)
- [DINOv2](https://github.com/facebookresearch/dinov2)
- [FlashInfer](https://github.com/flashinfer-ai/flashinfer)

---
BIN  assets/teaser.png (LFS, new file; binary not shown)
BIN  lingbot-map.pt (LFS, new file; binary not shown)