What is PSHuman?
PSHuman is a framework that combines multi-view diffusion models with explicit reconstruction to generate realistic 3D human models from a single image. It is designed to handle complex self-occlusion and to avoid geometric distortion in facial details. By jointly modeling the global body shape and local facial features at different scales, PSHuman synthesizes detailed novel views while preserving the subject's identity. It also improves cross-view body shape consistency across poses by using body priors from parametric models such as SMPL-X.
Target Audience:
The target audience includes 3D modelers, game developers, animators, and professionals in virtual reality. PSHuman helps these users quickly and efficiently create high-quality 3D content for applications such as game character design, animation, and VR experiences.
Example Scenarios:
1. Game developers can use PSHuman to rapidly generate realistic game character models.
2. Animators can utilize PSHuman to create precise 3D character models for animated films.
3. Virtual reality companies can employ PSHuman to develop authentic virtual characters for VR experiences.
Key Features:
3D Human Reconstruction from Single Image: Generates realistic 3D human models from a single image.
Multi-View Diffusion Across Scales: Jointly models global body shapes and local facial features for rich detail in new perspectives.
Explicit Reconstruction Techniques: Uses multi-view normals and color images with differentiable rasterization to restore realistic textured human meshes.
Consistent Body Pose: Enhances cross-view body shape consistency in various poses using body priors from models like SMPL-X.
Highly Realistic Textures: Experiments on the CAPE and THuman2.1 datasets are reported to show superior geometry, texture fidelity, and generalization.
Versatile Application: Works on both real people and stylized characters such as anime figures.
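The explicit reconstruction feature above amounts to adjusting a mesh until it agrees with target observations. The following is a toy sketch of that idea only, not PSHuman's implementation: it assumes the "render" is simply the vertex positions themselves, so the gradient can be written by hand, whereas PSHuman compares differentiably rasterized images against generated normal and color maps. The names fit_offsets and mse are hypothetical.

```python
# Toy illustration: fit per-vertex offsets so a template matches a target.
# Assumption: the loss is a plain squared error on vertex positions, which
# stands in for the image-space loss a differentiable rasterizer would give.

def fit_offsets(template, target, lr=0.1, steps=200):
    """Gradient descent on per-vertex offsets minimizing squared error."""
    offsets = [[0.0, 0.0, 0.0] for _ in template]
    for _ in range(steps):
        for i, (v, t) in enumerate(zip(template, target)):
            for k in range(3):
                # d/d(offset) of (v + offset - t)^2 is 2 * (v + offset - t)
                grad = 2.0 * (v[k] + offsets[i][k] - t[k])
                offsets[i][k] -= lr * grad
    # Return the deformed vertices (template plus learned offsets).
    return [[v[k] + o[k] for k in range(3)] for v, o in zip(template, offsets)]


def mse(a, b):
    """Mean squared error over all coordinates of two vertex lists."""
    n = sum(len(p) for p in a)
    return sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q)) / n


# Hypothetical data: three template vertices and slightly displaced targets.
template = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
target = [[0.0, 0.1, 0.0], [1.2, 0.0, 0.1], [0.0, 0.9, -0.1]]
fitted = fit_offsets(template, target)
```

In a real system the parametric body model would supply the template, and the loss would be computed on rendered normal and color images from several views rather than on vertex positions.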
Tutorial:
1. Prepare a full-body image.
2. Feed the image into PSHuman, which first predicts an SMPL-X body model for the subject.
3. Generate six views of the full body, plus local views of the face, with the multi-view image diffusion models.
4. Use the generated normal and color maps to deform the SMPL-X-initialized mesh with explicit human carving.
5. Optimize the model using differentiable rasterization to match the input image.
6. Generate the final realistic 3D human model.
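The steps above can be read as a chain of stages, each consuming the previous stage's output. The sketch below shows only that ordering and data flow. Every stage name, artifact name, and the file name person.png are hypothetical placeholders rather than PSHuman's actual API, and each stage body is a stub where a real model or optimizer would run.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PipelineState:
    image_path: str  # step 1: the full-body input image
    artifacts: Dict[str, str] = field(default_factory=dict)
    completed: List[str] = field(default_factory=list)


# (stage name, artifact it needs, artifact it produces); all names hypothetical.
STAGES: List[Tuple[str, str, str]] = [
    ("predict_smplx", "image", "smplx_params"),               # step 2
    ("multiview_diffusion", "smplx_params", "views"),         # step 3
    ("carve_mesh", "views", "coarse_mesh"),                   # step 4
    ("rasterize_and_refine", "coarse_mesh", "refined_mesh"),  # step 5
    ("export_model", "refined_mesh", "final_model"),          # step 6
]


def run_pipeline(image_path: str) -> PipelineState:
    """Run the stub stages in order, checking each stage's input exists."""
    state = PipelineState(image_path=image_path)
    state.artifacts["image"] = image_path
    for name, needs, produces in STAGES:
        if needs not in state.artifacts:
            raise RuntimeError(f"{name} is missing its input: {needs}")
        # Placeholder: a real stage would run a model or optimizer here.
        state.artifacts[produces] = f"<{produces} from {name}>"
        state.completed.append(name)
    return state


state = run_pipeline("person.png")
```

Keeping the dependencies in one table makes the order explicit: the body prior must exist before view generation, and the views must exist before any mesh carving or refinement.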