Egocentric Hand Reconstruction#
Automated pipeline for 4D hand and camera pose reconstruction from egocentric videos. Integrates ViPE and Dyn-HaMR in containerized environments.
Video Capture#
To capture egocentric video with an OAK camera, see the OAK camera plugin documentation.
Setup#
System Requirements#
OS: Ubuntu 24.04
GPU: NVIDIA RTX 6000 Ada or L40
Memory: 100 GB (for a reference 30-second video; longer videos need more)
Storage: 100 GB
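As a rough aid for comparing a machine against the list above, the sketch below prints system RAM, free disk space, and the detected GPU. The function names `mem_total_gb` and `report_system` are illustrative and not part of the pipeline, and the sketch assumes a Linux host with /proc/meminfo.

```shell
# Sketch only: helper names are hypothetical, not part of this repository.

# Read "MemTotal: N kB" text on stdin and print total RAM in whole GiB.
# (/proc/meminfo labels the unit kB, but the values are KiB.)
mem_total_gb() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }'
}

# Print RAM, free disk space in the current directory, and GPU details.
report_system() {
    echo "RAM (GiB): $(mem_total_gb < /proc/meminfo)"
    echo "Free disk (GiB): $(df -Pk . | awk 'NR==2 { printf "%d\n", $4 / 1024 / 1024 }')"
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
    else
        echo "nvidia-smi not found: NVIDIA driver missing or not on PATH"
    fi
}
```

Calling `report_system` from the repository root reports the free space on the disk that will hold outputs/ when it lives there.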
Prepare data files#
Place required files in the outputs/ directory.
...
├── doc/
├── docker/
├── scripts/
├── ...
└── outputs/
    ├── MANO_RIGHT.pkl
    └── BMC/
        └── *.npy
MANO model (required):
Download from: https://mano.is.tue.mpg.de/
Place: outputs/MANO_RIGHT.pkl
BMC data (required):
Follow the README in MengHao666/Hand-BMC-pytorch to generate the data (until the step python calculate_bmc.py).
Place all .npy files in: outputs/BMC/
Note
The Hand-BMC-pytorch repository is no longer actively maintained, so parts
of its setup may not work out-of-the-box on newer systems. At the time of
writing, the environment.yml pins PyTorch to a specific build
(py3.7_cuda10.0.130_cudnn7.6.2_0) that may no longer be available on
Conda channels or compatible with current hardware. If Conda fails to
resolve the environment, one workaround is to relax the pins in
environment.yml:
# Before
- pytorch==1.2.0=py3.7_cuda10.0.130_cudnn7.6.2_0
- torchvision==0.4.0=py37_cu100
# After
- pytorch=1.2.0
- torchvision=0.4.0
This fix reflects the state of the upstream repo at the time of writing and may need to be adjusted as the ecosystem evolves.
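Before building the images, it can help to confirm that the data files are where the layout above expects them. The following is a minimal sketch; `check_data_files` is a hypothetical helper, not a script shipped with the pipeline, and it only checks that the files exist, not that their contents are valid.

```shell
# Sketch only: check_data_files is a hypothetical helper.
# Usage: check_data_files [DIR]   (DIR defaults to "outputs")
# The body runs in a subshell so its variables do not leak into the caller.
check_data_files() (
    dir="${1:-outputs}"
    status=0
    if [ ! -f "$dir/MANO_RIGHT.pkl" ]; then
        echo "missing: $dir/MANO_RIGHT.pkl" >&2
        status=1
    fi
    # An unmatched glob stays literal (or expands to nothing),
    # so the -e test on the first word fails in either case.
    set -- "$dir"/BMC/*.npy
    if [ ! -e "$1" ]; then
        echo "missing: $dir/BMC/*.npy" >&2
        status=1
    fi
    # exit leaves only the subshell and becomes the function's status.
    exit "$status"
)
```

Running `check_data_files` from the repository root prints one line per missing item and returns a non-zero status if anything is absent.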
Build Docker images#
./docker/vipe.sh build
./docker/dynhamr.sh build
Note
Building these Docker images pulls third-party source code, libraries, and pre-trained model weights from external repositories. These components are subject to their own respective licenses, which may include restrictions on use, modification, or redistribution. It is the user’s responsibility to review and comply with all applicable third-party licenses before building, using, or distributing these images. Refer to each Dockerfile for the specific sources pulled during the build.
Hand Reconstruction#
Run complete reconstruction (ViPE + Dyn-HaMR) with a single command:
# Using a local video file
./scripts/run_reconstruction.sh path/to/your_video.mp4
# Using a remote video file
./scripts/run_reconstruction.sh s3://path/to/your_video.mp4
The script accepts either a local file path or an s3:// URL
pointing to a video on an S3-compatible cloud storage. When a URL is provided,
the video is automatically downloaded to the outputs/ directory before
processing begins.
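The local-versus-remote distinction described above can be sketched with a shell `case` pattern. This illustrates the idea only and is not necessarily how run_reconstruction.sh is implemented; `input_kind` and `validate_input` are hypothetical names.

```shell
# Sketch only: these helpers are hypothetical, not taken from the pipeline.

# Print "remote" for s3:// URLs and "local" for anything else.
input_kind() {
    case "$1" in
        s3://*) echo "remote" ;;
        *)      echo "local" ;;
    esac
}

# Succeed for any s3:// URL (its existence is only known once it is
# downloaded) and for local paths that point to an existing regular file.
validate_input() {
    case "$(input_kind "$1")" in
        remote) return 0 ;;
        local)  [ -f "$1" ] ;;
    esac
}
```

A wrapper could call `validate_input "$1"` first so that a mistyped local path fails immediately instead of after the containers start.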
To use a remote video, set the following environment variables for credentials:
| Variable | Required | Description |
|---|---|---|
| | Yes | Your S3 access key ID |
| | Yes | Your S3 access key |
| | No | Region (default: |
| | No | Custom endpoint for S3-compatible storage |
By default, the pipeline reads data files from and writes results to the
outputs/ directory. Set OUTPUTS_DIR to use a different location:
OUTPUTS_DIR=/path/to/outputs ./scripts/run_reconstruction.sh path/to/your_video.mp4
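The fallback to outputs/ when OUTPUTS_DIR is unset matches the standard `${VAR:-default}` parameter expansion. The sketch below shows that behaviour in isolation; whether the pipeline scripts resolve the variable exactly this way is an assumption, and `resolve_outputs_dir` is a hypothetical name.

```shell
# Sketch only: resolve_outputs_dir is a hypothetical helper.
# Print OUTPUTS_DIR if it is set and non-empty, otherwise "outputs".
resolve_outputs_dir() {
    printf '%s\n' "${OUTPUTS_DIR:-outputs}"
}
```

Prefixing a single command with `OUTPUTS_DIR=...`, as in the example above, sets the variable for that command only, while `export OUTPUTS_DIR=...` keeps it for the rest of the shell session.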
The pipeline will:
Copy or download the video to outputs/.
Run ViPE to estimate camera poses.
Run Dyn-HaMR for hand reconstruction.
Save all results to outputs/logs/.
View results#
# List results
ls outputs/logs/video-custom/<DATE>/<VIDEO_NAME>*/
# View visualization
vlc outputs/logs/video-custom/<DATE>/<VIDEO_NAME>*/*_grid.mp4
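When several runs have accumulated, filling in `<DATE>` and `<VIDEO_NAME>` by hand gets tedious. The following sketch finds the most recently modified grid video instead. `latest_grid_video` is a hypothetical helper, and it assumes the `*_grid.mp4` naming shown above and paths without newlines.

```shell
# Sketch only: latest_grid_video is a hypothetical helper.
# Usage: latest_grid_video [LOGS_DIR]   (LOGS_DIR defaults to "outputs/logs")
# Prints the most recently modified *_grid.mp4 below LOGS_DIR, or nothing.
latest_grid_video() (
    logs_dir="${1:-outputs/logs}"
    # ls -t lists newest first; head keeps only the first path.
    find "$logs_dir" -type f -name '*_grid.mp4' -exec ls -t {} + 2>/dev/null | head -n 1
)
```

For example, `vlc "$(latest_grid_video)"` opens the newest visualization, provided at least one run has completed.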