RoboPapers

Ep#20 VideoMimic

Jul 13, 2025
Arthur Allshire and Hongsuk Choi, both PhD students at UC Berkeley, dive into their groundbreaking project, VideoMimic. They discuss how humanoid robots can learn locomotion and interaction through human imitation. Key insights include advancements in 3D reconstruction from videos, the challenges of kinematic retargeting, and the integration of depth mapping technologies. They also touch on the complexity of training robots in diverse environments and the exciting potential of using minimal video data for effective robotic training.
ADVICE

Automatic Scene Scaling

  • Scale the scene mesh down to the robot size to ensure physically feasible motions.
  • Use automatic scene scaling to reduce embodiment gaps and aid imitation learning.
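The scaling idea above can be sketched as a uniform rescale of the scene mesh by the robot-to-human height ratio. This is a minimal illustration, not the project's actual pipeline; the function name and height parameters are hypothetical.

```python
import numpy as np

def scale_scene_to_robot(vertices, human_height, robot_height):
    """Uniformly scale scene mesh vertices so motions reconstructed from a
    human-sized scene become feasible for a smaller robot.

    vertices: (N, 3) array of mesh vertex positions, in meters.
    human_height, robot_height: standing heights used to derive the ratio.
    """
    scale = robot_height / human_height  # e.g. 0.8 m robot / 1.7 m human ~ 0.47
    return vertices * scale

# Toy usage: a 2 m-tall doorway shrinks proportionally for a 0.8 m robot.
doorway = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]])
scaled = scale_scene_to_robot(doorway, human_height=1.7, robot_height=0.8)
```

A single global scale factor keeps the scene self-consistent (stair heights, gap widths, and reach distances all shrink together), which is what narrows the embodiment gap for imitation.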
INSIGHT

Limitations of Internet Videos

  • Internet videos often fail for reconstruction due to uncontrolled conditions and visual artifacts.
  • Phone-recorded videos provide better reliability, simplifying data collection for robotics learning.
INSIGHT

Pose Alignment with SLAM

  • Align 2D pose detections with 3D SLAM point clouds by lifting joint keypoints to 2.5D.
  • This links the SMPL human-model joints to the scene geometry.
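The 2.5D lift described above can be sketched as back-projecting each 2D joint detection into the camera frame using a per-joint depth and pinhole intrinsics, after which the 3D points can be matched against the SLAM point cloud or SMPL joints. This is a hedged sketch of the general technique, not the episode's exact method; the function name and parameters are assumptions.

```python
import numpy as np

def lift_keypoints_to_camera(keypoints_2d, depths, fx, fy, cx, cy):
    """Back-project 2D joint detections to 3D camera coordinates using
    a per-joint depth (the "2.5D" lift).

    keypoints_2d: (J, 2) pixel coordinates (u, v).
    depths: (J,) depth of each joint along the camera z-axis, in meters.
    fx, fy, cx, cy: pinhole camera intrinsics.
    Returns: (J, 3) camera-frame points.
    """
    u, v = keypoints_2d[:, 0], keypoints_2d[:, 1]
    x = (u - cx) / fx * depths  # standard pinhole inversion
    y = (v - cy) / fy * depths
    return np.stack([x, y, depths], axis=1)

# Toy usage: a joint at the principal point with depth 2 m lands on the
# camera z-axis at (0, 0, 2).
joints = lift_keypoints_to_camera(
    np.array([[320.0, 240.0]]), np.array([2.0]),
    fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

Once joints live in the same metric frame as the SLAM reconstruction, associating them with scene geometry reduces to nearest-neighbor or ray-depth lookups.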