
Building brains for bulldozers
The Stack Overflow Podcast
Pretrained vision models and fine-tuning
Kevin discusses using vision foundation models and video encoders, then fine-tuning them with first-person construction data.
Transcript


