
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Trends in Computer Vision with Amir Zamir - #338
Jan 13, 2020
Amir Zamir, an Assistant Professor of Computer Science at the Swiss Federal Institute of Technology in Lausanne (EPFL), discusses recent advances in computer vision. He covers how the field has evolved, particularly in 3D vision and in self-supervised learning, which reduces reliance on labeled data. The conversation touches on the challenge of navigating unseen spaces in robotics and the role of multitask learning in improving network robustness. Zamir also explores practical applications of these techniques, including autonomous driving and other real-world problems.
AI Snips
Vision Plus X
- A key trend in computer vision is its blending with other areas such as graphics, robotics, and adversarial robustness.
- This reflects the understanding that vision serves a practical purpose, often tied to a downstream goal.
Vision vs. Pixels
- Vision is about using priors about the world (like 3D structure) to process raw pixels and create abstractions.
- This contrasts with directly learning from raw pixels, which can be inefficient and lack generalizability.
Real-World Data
- Simulators using scanned real-world buildings (like Gibson) provide better data for vision in robotics research than purely synthetic data.
- This offers more realistic visual data and semantics, bridging the gap between static datasets and active agents.

