
293: True Personalization Through Reinforcement Learning
Super Data Science: ML & AI Podcast with Jon Krohn
00:00
How to Minimize the Puppy Period
This is this period of time where the reinforcement learning agent learn to a point that the behaviors start becoming acceptable. And so inevitably, in this example with your speaker, you would have to be patient, right? You'd have to go through situations when you're just sitting and the volume goes up and you have to click a button or say something to enact that punishment. But it will need to explore before it can actually come up with the right solution. So how long would it take for that exploration to occur? Exactly. Some applications you don't care about like how long it takes. It's usually it might be an order of minutes or hours and you can basically let the agent learn and
Play episode from 43:57
Transcript


