diff --git a/README.md b/README.md
index e35a731..eb8fd45 100644
--- a/README.md
+++ b/README.md
@@ -1,23 +1,38 @@
-

LingBot-Map: Geometric Context Transformer for Streaming 3D Reconstruction

- -

- - - - - - -

- -

+

-

+ +

LingBot-Map: Geometric Context Transformer for Streaming 3D Reconstruction

+
+Robbyant Team
+
+
+
+
+
+[![Paper](https://img.shields.io/static/v1?label=Paper&message=arXiv&color=red&logo=arxiv)](https://arxiv.org/abs/2604.14141)
+[![PDF](https://img.shields.io/static/v1?label=Paper&message=PDF&color=red&logo=adobeacrobatreader)](lingbot-map_paper.pdf)
+[![Project](https://img.shields.io/badge/Project-Website-blue)](https://technology.robbyant.com/lingbot-map)
+[![HuggingFace](https://img.shields.io/static/v1?label=%F0%9F%A4%97%20Model&message=HuggingFace&color=orange)](https://huggingface.co/robbyant/lingbot-map)
+[![ModelScope](https://img.shields.io/static/v1?label=%F0%9F%A4%96%20Model&message=ModelScope&color=purple)](https://www.modelscope.cn/models/Robbyant/lingbot-map)
+[![License](https://img.shields.io/badge/License-Apache--2.0-green)](LICENSE.txt)
+
+
 https://github.com/user-attachments/assets/fe39e095-af2c-4ec9-b68d-a8ba97e505ab
+-----
+
+🗺️ Meet LingBot-Map! We've built a feed-forward 3D foundation model for streaming 3D reconstruction! 🏗️🌍
+
+LingBot-Map has focused on:
+
+- **Geometric Context Transformer**: Architecturally unifies coordinate grounding, dense geometric cues, and long-range drift correction within a single streaming framework through anchor context, pose-reference window, and trajectory memory.
+- **High-Efficiency Streaming Inference**: A feed-forward architecture with paged KV cache attention, enabling stable inference at ~20 FPS on 518×378 resolution over long sequences exceeding 10,000 frames.
+- **State-of-the-Art Reconstruction**: Superior performance on diverse benchmarks compared to both existing streaming and iterative optimization-based approaches.
+
 ---
-# Quick Start
+# ⚙️ Quick Start
 
 ## Installation
 
@@ -60,13 +75,13 @@ pip install flashinfer-python -i https://flashinfer.ai/whl/cu128/torch2.9/
 pip install -e ".[vis]"
 ```
 
-# Model Download
+# 📦 Model Download
 
 | Model Name | Huggingface Repository | ModelScope Repository | Description |
 | :--- | :--- | :--- | :--- |
 | lingbot-map | [robbyant/lingbot-map](https://huggingface.co/robbyant/lingbot-map) | [Robbyant/lingbot-map](https://www.modelscope.cn/models/Robbyant/lingbot-map) | Base model checkpoint (4.63 GB) |
 
-# Demo
+# 🎬 Demo
 
 ### Streaming Inference from Images
 
@@ -114,11 +129,11 @@ python demo.py --model_path /path/to/checkpoint.pt \
     --image_folder /path/to/images/ --use_sdpa
 ```
 
-# License
+# 📜 License
 
 This project is released under the Apache License 2.0. See [LICENSE](LICENSE.txt) file for details.
 
-# Citation
+# 📖 Citation
 
 ```bibtex
 @article{chen2026geometric,
@@ -129,7 +144,7 @@ This project is released under the Apache License 2.0. See [LICENSE](LICENSE.txt
 }
 ```
 
-# Acknowledgments
+# ✨ Acknowledgments
 
 This work builds upon several excellent open-source projects: