Behind the scenes of Google's state-of-the-art "nano-banana" image model

Aug 27, 2025

Guests

Mostafa Dehghani
Nicole Brichtova

Nicole Brichtova and Mostafa Dehghani from Google's Gemini team discuss their image model, Gemini 2.5 Flash Image, nicknamed "nano-banana". They explain how interleaved generation enables intricate, multi-step edits and how the model maintains character consistency across them. The episode also covers the playful "nano-banana" name, with live demos of image transformations. The two close by reflecting on text rendering and on how user feedback is shaping future work on image generation.

AI Snips

ADVICE

Use Clear Signals For Iteration

Track a clear failure signal to guide model improvements and prevent regressions.
Use measurable proxies like text rendering when human preference labels are costly or slow.

INSIGHT

Positive Transfer Between Understanding And Generation

Image understanding and image generation improve each other (positive transfer) when trained together in a single multimodal model.
Visual data provides shortcuts to world knowledge that text alone may miss.

ANECDOTE

Five 1980s Glamour Variants Demo

Nicole asked Gemini to produce five variations in a 1980s mall glamour-shot style, and it returned five consistent options, each with its own character.
The model labeled each variant and maintained a recognizable subject across styles.
