Mixture of Experts

AI on IBM z17, Meta's Llama 4 and Google Cloud Next 2025

Apr 11, 2025
Join IBM trailblazers Hillery Hunter, IBM Fellow and CTO of IBM Infrastructure; Shobhit Varshney, Head of Data and AI for the Americas; and Kate Soule, Director of Technical Product Management for Granite, as they dive into the launch of IBM z17 and its cutting-edge AI capabilities. They explore the unveiling of Meta's Llama 4, the announcements at Google Cloud Next, and Pew Research Center findings on how perceptions of AI are evolving, covering everything from zero-downtime financial transactions to AI's role in entertainment and industry dynamics.
ANECDOTE

Real-time LLM Integration

  • Shobhit Varshney shared an anecdote about a credit card company struggling with fraud detection due to LLM latency.
  • The new Z series mainframe addresses this by enabling real-time LLM integration into transaction flows.
INSIGHT

Llama 4's Open Source Impact

  • Meta's release of Llama 4, including a large mixture-of-experts model, may put pressure on closed AI labs.
  • Kate Soule suggests this could lead to wider community support for this architecture.
INSIGHT

Mixture of Experts Efficiency

  • Mixture of Experts architecture offers inference efficiency, particularly at low batch sizes.
  • Kate Soule emphasizes the need for broader community support to enhance its tooling and application.
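The efficiency claim above can be made concrete with a toy sketch: in a mixture-of-experts layer, a router sends each token to only its top-k experts, so experts no token selects are skipped entirely. At low batch sizes few distinct experts are activated, which is where the inference savings come from. The weights, sizes, and helper names below are hypothetical illustrations, not any model's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 4, 2  # toy sizes, chosen for illustration

# Hypothetical weights: a linear router and one linear layer per expert.
router_w = rng.normal(size=(d_model, n_experts))
expert_ws = rng.normal(size=(n_experts, d_model, d_model))

def moe_forward(x):
    """Route each token to its top-k experts; only those experts run."""
    logits = x @ router_w                          # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]  # top-k expert indices per token
    sel = np.take_along_axis(logits, top, axis=-1)
    gates = np.exp(sel - sel.max(-1, keepdims=True))
    gates /= gates.sum(-1, keepdims=True)          # softmax over selected experts only
    out = np.zeros_like(x)
    active = set()
    for e in range(n_experts):
        mask = (top == e).any(axis=-1)             # tokens routed to expert e
        if not mask.any():
            continue                               # expert never selected: compute skipped
        active.add(e)
        g = np.where(top[mask] == e, gates[mask], 0.0).sum(-1, keepdims=True)
        out[mask] += g * (x[mask] @ expert_ws[e])
    return out, active

x = rng.normal(size=(3, d_model))  # a tiny "batch" of 3 tokens
y, active = moe_forward(x)
```

With 3 tokens and top_k=2, at most 6 expert selections occur, so at small batch sizes many of the model's parameters are never touched per forward pass; as the batch grows, every expert tends to receive some tokens and the savings shrink toward the top_k/n_experts ratio per token.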