
This Day in AI Podcast EP49: Our Big Announcement + GPT-4 Update, Code Llama, LLaVA-1.6, YOLO World, EAGLE-7B & Bard Images
Feb 2, 2024
The hosts introduce the new ThisDayinAI.com community website, then cover the latest GPT-4 update and Code Llama's open-source release. They explore the capabilities of the LLaVA-1.6 release, discuss YOLO World, and consider the impact of EAGLE-7B and the RWKV language models. They close with Bard's new image creation feature and censorship.
AI Snips
LLaVA 1.6 Narrows Vision Gap
- LLaVA-1.6 significantly improves vision reasoning and OCR, closing the gap with GPT-4 Vision on many tasks.
- Open-source vision models are rapidly reaching practical parity for everyday image interpretation.
YOLO-World Enables Natural-Language Vision
- YOLO-World maps natural-language labels to vision outputs, enabling open-vocabulary object detection from everyday prompts.
- This reduces the need to phrase prompts so they match a model's training labels, and it improves UI navigation and real-time object identification.
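As a rough illustration of open-vocabulary matching, the toy sketch below scores each detected region against every prompt label by cosine similarity and keeps the best label only if it clears a threshold. It is not YOLO-World's actual code: the function names, the 3-dimensional embeddings, and the 0.5 threshold are all made up for the example, and a real system would get its embeddings from trained image and text encoders.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def label_regions(region_embeddings, prompt_embeddings, threshold=0.5):
    """Assign each region its best-matching prompt label, or None if no label clears the threshold."""
    results = []
    for region in region_embeddings:
        scores = {label: cosine(region, vec) for label, vec in prompt_embeddings.items()}
        best = max(scores, key=scores.get)
        results.append(best if scores[best] >= threshold else None)
    return results

# Hypothetical embeddings, invented purely for illustration.
prompts = {"coffee mug": [1.0, 0.0, 0.0], "laptop": [0.0, 1.0, 0.0]}
regions = [[0.9, 0.1, 0.0], [0.1, 0.95, 0.05], [0.0, 0.0, 1.0]]

labels = label_regions(regions, prompts)
print(labels)
```

Because the labels are just entries in a dictionary of text embeddings, changing what the detector looks for means changing the prompts, not retraining on a new label set.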
Risky YOLO Tests Reveal Ethical Hazards
- Chris ran extreme tests on YOLO-World, including prompts with sensitive labels, and the model confidently singled out individuals.
- He warns that such capabilities could enable harmful deployments, such as false accusations, if used irresponsibly.
