Long-LRM is a feed-forward model for 3D Gaussian reconstruction that can reconstruct large scenes from a sequence of input images. It processes 32 source images at 960x540 resolution in 1.3 seconds on a single A100 80GB GPU. The architecture combines recent Mamba2 blocks with conventional transformer blocks, and gains efficiency through token merging and Gaussian pruning steps. Unlike previous feed-forward models, which handle only a small part of a scene, Long-LRM reconstructs the entire scene in one pass. On large-scale scene datasets such as DL3DV-140 and Tanks and Temples, Long-LRM achieves quality comparable to optimization-based approaches while being two orders of magnitude more efficient.
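To make the token-merging idea concrete, here is a minimal sketch of merging consecutive image tokens by averaging, which shortens the sequence the Mamba2/transformer blocks must process. This is an illustrative assumption, not Long-LRM's actual merging policy; the function name, merge factor, and shapes are hypothetical.

```python
import numpy as np

def merge_tokens(tokens: np.ndarray, merge_factor: int = 2) -> np.ndarray:
    """Merge each group of `merge_factor` consecutive tokens by averaging.

    tokens: array of shape (seq_len, dim); seq_len must be divisible
    by merge_factor. Returns an array of shape (seq_len // merge_factor, dim).
    """
    seq_len, dim = tokens.shape
    assert seq_len % merge_factor == 0, "sequence length must divide evenly"
    return tokens.reshape(seq_len // merge_factor, merge_factor, dim).mean(axis=1)

# Hypothetical example: 1024 tokens of dimension 256, merged 4-to-1.
tokens = np.random.rand(1024, 256)
merged = merge_tokens(tokens, merge_factor=4)
print(merged.shape)  # (256, 256)
```

Averaging is only one possible policy; a real model might instead merge via learned pooling or attention-weighted combination.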
Target audience:
The target audience includes 3D modelers, game developers, virtual-reality content creators, and any professionals who need fast, efficient 3D scene reconstruction. Long-LRM's efficiency and high-quality reconstruction let these users create realistic 3D scenes in a short time, accelerating product development and improving productivity.
Usage scenarios:
Use Long-LRM to quickly reconstruct 3D urban models from a series of street-scene images.
In game development, use Long-LRM to reconstruct game scenes from captured photographs, improving scene realism.
Virtual-reality content creators use Long-LRM to reconstruct high-fidelity virtual environments from images taken from multiple angles.
Product Features:
Processes up to 32 high-resolution input images for fast 3D scene reconstruction
Adopts a hybrid architecture of Mamba2 blocks and transformer blocks to handle long token sequences efficiently
Balances reconstruction quality and efficiency through token merging and Gaussian pruning steps
Reconstructs the entire scene in a single feed-forward pass, without per-scene iterative optimization
Achieves quality comparable to optimization-based methods on large-scale scene datasets
Improves efficiency by two orders of magnitude, significantly reducing compute consumption
Supports wide view coverage and high-quality photorealistic reconstruction
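The Gaussian pruning step mentioned above can be sketched as discarding Gaussians whose opacity falls below a threshold, so the final scene representation stays compact. This is a minimal illustrative version; the threshold value and pruning criterion here are assumptions, not Long-LRM's exact procedure.

```python
import numpy as np

def prune_gaussians(means: np.ndarray, opacities: np.ndarray,
                    opacity_threshold: float = 0.01):
    """Keep only Gaussians whose opacity exceeds the threshold.

    means:     (N, 3) Gaussian centers
    opacities: (N,)   per-Gaussian opacity in [0, 1]
    Returns the filtered (means, opacities) pair.
    """
    keep = opacities > opacity_threshold
    return means[keep], opacities[keep]

# Hypothetical example: 4 Gaussians, 2 nearly transparent.
means = np.arange(12, dtype=float).reshape(4, 3)
opacities = np.array([0.90, 0.001, 0.30, 0.005])
pruned_means, pruned_opacities = prune_gaussians(means, opacities)
print(pruned_means.shape)  # (2, 3)
```

In practice a model might also prune by contribution to rendered views rather than raw opacity alone.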
Usage tutorial:
1. Prepare a series of input images of the scene to be reconstructed, at a resolution of at least 960x540.
2. Make sure you have compatible GPU hardware, such as an NVIDIA A100 80GB.
3. Load the input images and the Long-LRM model into your computing environment.
4. Configure the model parameters, including the token-merging policy and the Gaussian pruning threshold.
5. Run the Long-LRM model and wait for it to process the input images and generate the 3D reconstruction.
6. View and evaluate the reconstructed 3D scene, applying post-processing and optimization as needed.
7. Apply the reconstructed 3D scene in fields such as 3D printing, virtual reality, or game development.
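The configuration step above (step 4) can be sketched as a small config object with basic validation, assuming the limits stated in the overview. All names and fields here are hypothetical, not Long-LRM's actual API.

```python
from dataclasses import dataclass

@dataclass
class ReconstructionConfig:
    """Hypothetical configuration mirroring the tutorial's parameters."""
    num_input_views: int = 32          # overview reports up to 32 views
    image_width: int = 960
    image_height: int = 540
    token_merge_factor: int = 2        # step 4: token-merging policy
    opacity_prune_threshold: float = 0.01  # step 4: Gaussian pruning threshold

def validate_config(cfg: ReconstructionConfig) -> None:
    """Reject settings outside the ranges described in the overview."""
    if cfg.num_input_views > 32:
        raise ValueError("the model is reported for up to 32 input views")
    if not 0.0 <= cfg.opacity_prune_threshold <= 1.0:
        raise ValueError("opacity threshold must lie in [0, 1]")

cfg = ReconstructionConfig()
validate_config(cfg)  # default settings pass validation
```

Keeping validation separate from the dataclass makes the limits easy to adjust if a future model release supports more views or higher resolutions.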