This Day in AI Podcast

EP72: Croc Test with Gemini 1.5 Experimental, Flux Destroys Midjourney & GPT4o Model Updates

Aug 7, 2024
The hosts test Google's Gemini 1.5 Pro Experimental model on a crocodile video, discussing its video analysis capabilities and where it falls short. They compare the image generation models Flux and Midjourney and find Flux superior. They also cover OpenAI's GPT-4o model updates, including structured outputs and cost reductions. The conversation wraps up with a look at the current AI development landscape, emphasizing the need for reliable tools in an increasingly competitive market.
AI Snips
ANECDOTE

Gemini Video Test Results

  • Mike tested Gemini 1.5 Pro Experimental with his crocodile video, asking about events and people.
  • It identified key takeaways but hallucinated the names of people in the video.
ANECDOTE

Claude's Performance on Croc Video

  • Claude 3.5 Sonnet, given an audio transcript, accurately identified the location and key details.
  • It avoided hallucinations and provided more accurate information than Gemini, even without video.
INSIGHT

Gemini's Hallucination Problem

  • Despite its large context window, Gemini's underlying model is prone to hallucinations.
  • Claude, working only from an audio transcript, interpreted the video more accurately, which suggests Gemini prioritizes visuals over audio.