RoboPapers

Ep#20 VideoMimic

Jul 13, 2025
Arthur Allshire and Hongsuk Choi, both PhD students at UC Berkeley, dive into their groundbreaking project, VideoMimic. They discuss how humanoid robots can learn locomotion and interaction through human imitation. Key insights include advancements in 3D reconstruction from videos, the challenges of kinematic retargeting, and the integration of depth mapping technologies. They also touch on the complexity of training robots in diverse environments and the exciting potential of using minimal video data for effective robotic training.
ADVICE

Automatic Scene Scaling

  • Scale the scene mesh down to the robot size to ensure physically feasible motions.
  • Use automatic scene scaling to reduce embodiment gaps and aid imitation learning.
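The scaling idea above can be sketched as a uniform rescale of the scene mesh by the robot-to-human height ratio. This is a minimal illustration, not the project's actual pipeline; the function name and height parameters are hypothetical.

```python
import numpy as np

def scale_scene_to_robot(vertices, human_height, robot_height):
    """Uniformly scale scene mesh vertices so motions reconstructed from a
    human-sized scene become feasible for a smaller robot.

    vertices: (N, 3) array of mesh vertex positions, in meters.
    human_height, robot_height: standing heights used to derive the ratio.
    """
    scale = robot_height / human_height  # e.g. 0.8 m robot / 1.7 m human ~ 0.47
    return vertices * scale

# Toy usage: a 2 m-tall doorway shrinks proportionally for a 0.8 m robot.
doorway = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]])
scaled = scale_scene_to_robot(doorway, human_height=1.7, robot_height=0.8)
```

A single global scale factor keeps the scene self-consistent (stair heights, gap widths, and reach distances all shrink together), which is what narrows the embodiment gap for imitation.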
INSIGHT

Limitations of Internet Videos

  • Internet videos often fail for reconstruction due to uncontrolled conditions and visual artifacts.
  • Phone-recorded videos provide better reliability, simplifying data collection for robotics learning.
INSIGHT

Pose Alignment with SLAM

  • Align 2D pose detections with 3D SLAM point clouds by lifting joint keypoints to 2.5D.
  • This links the SMPL human-model joints to the scene geometry.
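The 2.5D lift described above can be sketched as back-projecting each 2D joint detection into the camera frame using a per-joint depth and pinhole intrinsics, after which the 3D points can be matched against the SLAM point cloud or SMPL joints. This is a hedged sketch of the general technique, not the episode's exact method; the function name and parameters are assumptions.

```python
import numpy as np

def lift_keypoints_to_camera(keypoints_2d, depths, fx, fy, cx, cy):
    """Back-project 2D joint detections to 3D camera coordinates using
    a per-joint depth (the "2.5D" lift).

    keypoints_2d: (J, 2) pixel coordinates (u, v).
    depths: (J,) depth of each joint along the camera z-axis, in meters.
    fx, fy, cx, cy: pinhole camera intrinsics.
    Returns: (J, 3) camera-frame points.
    """
    u, v = keypoints_2d[:, 0], keypoints_2d[:, 1]
    x = (u - cx) / fx * depths  # standard pinhole inversion
    y = (v - cy) / fy * depths
    return np.stack([x, y, depths], axis=1)

# Toy usage: a joint at the principal point with depth 2 m lands on the
# camera z-axis at (0, 0, 2).
joints = lift_keypoints_to_camera(
    np.array([[320.0, 240.0]]), np.array([2.0]),
    fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

Once joints live in the same metric frame as the SLAM reconstruction, associating them with scene geometry reduces to nearest-neighbor or ray-depth lookups.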