Large Language Models in Production Round-table Conversation

MLOps.community

The Importance of Re-Architecting Production

State of the art right now is a couple of milliseconds, I think. Actually, there was a paper that talked about state of the art, in what I call a lab environment, being a 29-millisecond inference pass. If you're just using it for your toy app on your computer, then I wouldn't consider that production. So, rewinding a little bit, we'll go back to what it is going to take. We're still on the topic of latency.

Play episode from 33:54