Learning Bayesian Statistics

Alexandre Andorra
Apr 22, 2020 • 49min

#14 Hidden Markov Models & Statistical Ecology, with Vianey Leos-Barajas

I bet you love penguins, right? The same goes for koalas, or puppies! But what about sharks? Well, my next guest loves sharks. She loves them so much that she works a lot with marine biologists, even though she’s a statistician! Vianey Leos-Barajas is indeed a statistician primarily working in the areas of statistical ecology, time series modeling, Bayesian inference and spatial modeling of environmental data. Vianey did her PhD in statistics at Iowa State University and is now a postdoctoral researcher at North Carolina State University.

In this episode, she’ll tell us what she’s working on that involves sharks, sheep and other animals! Trying to model animal movements, Vianey often encounters the dreaded multimodal posteriors. She’ll explain why these can be very tricky to estimate, and why ecological data are particularly suited for hidden Markov models and spatio-temporal models. Don’t worry, Vianey will explain what these models are in the episode!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Vianey on Twitter: https://twitter.com/vianey_lb
Hidden Markov Models in the Stan User's Guide: https://mc-stan.org/docs/2_18/stan-users-guide/hmms-section.html
Tagging Basketball Events with HMM in Stan: https://mc-stan.org/users/documentation/case-studies/bball-hmm.html
HMMs with Python and PyMC3: https://ericmjl.github.io/bayesian-analysis-recipes/notebooks/markov-models/
The Discrete Adjoint Method -- Efficient Derivatives for Functions of Discrete Sequences (Betancourt, Margossian, Leos-Barajas): https://arxiv.org/abs/2002.00326
Vianey will be giving a 90-minute introduction to HMMs at the International Statistical Ecology Conference in June 2020: http://www.isec2020.org/
Stan for Ecology -- a website for the ecology community in Stan: https://stanecology.github.io/
LatinR 2020 -- 7th to 9th October 2020: https://latin-r.com/
Migramar -- Science for the Conservation of Marine Migratory Species in the Eastern Pacific: http://migramar.org/hi/en/
Pelagios Kakunja -- Know, educate and conserve for a sustainable sea: https://www.pelagioskakunja.org/

Book recommendations:
Hidden Markov Models for Time Series: https://www.routledge.com/Hidden-Markov-Models-for-Time-Series-An-Introduction-Using-R-Second-Edition/Zucchini-MacDonald-Langrock/p/book/9781482253832
Handbook of Mixture Analysis:
Apr 8, 2020 • 44min

#13 Building a Probabilistic Programming Framework in Julia, with Chad Scherrer

How is Julia doing? I’m talking about the programming language, of course! What does the probabilistic programming landscape in Julia look like? What are Julia’s distinctive features, and when would it be interesting to use it?

To talk about that, I invited Chad Scherrer. Chad is a Senior Research Scientist at RelationalAI, a company that uses Artificial Intelligence technologies to solve business problems.

Coming from a mathematics background, Chad did his PhD at Indiana University Bloomington and has been working in statistics and data science for a decade now. Through this experience, he’s been using and developing probabilistic programming languages, so he’s familiar with Python, R, PyMC, Stan and all the blockbusters of the field. Since 2018, he has been particularly interested in Julia and has developed Soss, an open-source lightweight probabilistic programming package for Julia.

In this episode, he’ll tell us why he decided to create this package, and which choices made Soss what it is today. We’ll also talk about other projects in Julia, like Turing or Gen for instance.

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Chad's Website: https://cscherrer.github.io/
Chad on Twitter: https://twitter.com/ChadScherrer
Soss Package: https://github.com/cscherrer/Soss.jl
Soss Presentation at 2019 Strata NYC: https://slides.com/cscherrer/2019-09-26-strata#/
Passage -- A Parallel Sampler Generator for Hierarchical Bayesian Modeling: https://bit.ly/2UTmaYB
Dynamic HMC in Julia: https://github.com/tpapp/DynamicHMC.jl
Advanced HMC in Julia: https://github.com/TuringLang/AdvancedHMC.jl
Monte Carlo Measurements in Julia: https://github.com/baggepinnen/MonteCarloMeasurements.jl
Turing.jl -- Bayesian inference with probabilistic programming: https://turing.ml/dev/
Gen.jl -- Probabilistic modeling and inference in Julia: https://www.gen.dev/
Etalumis -- Bringing Probabilistic Programming to Scientific Simulators at Scale: https://arxiv.org/abs/1907.03382
Omega.jl -- A programming language for causal and probabilistic reasoning: http://www.zenna.org/Omega.jl/latest/
JuliaLang -- The Ingredients for a Composable Programming Language: https://white.ucc.asn.au/2020/02/09/whycompositionaljulia.html
Simpy -- Discrete event simulation for Python:
Mar 25, 2020 • 47min

#12 Biostatistics and Differential Equations, with Demetri Pananos

Do you know Google Summer of Code? It’s a time of year when students can contribute to open-source software by developing and adding much-needed functionalities to the open-source package of their choice. And Demetri Pananos did just that.

He did it in 2019 with PyMC3, for which he developed the API for ordinary differential equations. In this episode, he’ll tell us why and how he did that, what he learned from the experience, and what the strengths and weaknesses of the API are in his opinion.

Demetri is a PhD candidate in Biostatistics at Western University, in Ontario, Canada. His research interests center on machine learning and Bayesian statistics for personalized medicine. He earned his Master’s in Applied Mathematics from The University of Waterloo and is a firm believer in open science, interdisciplinary collaboration, and reproducible research. Other than that, he loves plotting data and drinking IPA beer. Well, who doesn’t?

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Demetri on Twitter: https://twitter.com/PhDemetri
Demetri on GitHub: https://github.com/Dpananos
Demetri's website: https://dpananos.github.io/
PyMC3, Probabilistic Programming in Python: https://docs.pymc.io/
Chris Bishop, Pattern Recognition and Machine Learning: https://www.amazon.fr/Pattern-Recognition-Machine-Learning-Christopher/dp/0387310738
Bayesian Data Analysis (Gelman, Carlin, Stern, Dunson, Vehtari, Rubin): http://www.stat.columbia.edu/~gelman/book/
Parallel Plots: https://arviz-devs.github.io/arviz/generated/arviz.plot_parallel.html
Mar 11, 2020 • 58min

#11 Taking care of your Hierarchical Models, with Thomas Wiecki

I bet you’ve already heard about hierarchical models, or multilevel models, or varying-effects models. Yeah, this type of model has a lot of names! Many people even turn to Bayesian tools to build _exactly_ these models. But what are they? How do you build and use a hierarchical model? What are the tricks and classical traps? And even more important: how do you _interpret_ a hierarchical model?

In this episode, Thomas Wiecki will come to the rescue and explain what multilevel models are, how to build them, what their powers are… but also why you should be very careful when building them…

Does the name Thomas Wiecki ring a bell? Probably because he’s the host and creator of the PyData Deep Dive Podcast, where he interviews open-source contributors from the Python and Data Science worlds! Thomas is also the VP of Data Science at Quantopian, a crowd-sourced quantitative investment firm that encourages people everywhere to write investment algorithms.

Finally, Thomas is a longtime Bayesian and core developer of PyMC3, a fantastic Python package for probabilistic programming. On his blog, he publishes tutorial articles and explores new ideas such as Bayesian Deep Learning. Caring a lot about open-source software sustainability, he puts all he’s up to on his Patreon page, which you’ll find in the show notes.

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Thomas’ series on Hierarchical Regression: https://twiecki.io/blog/2013/08/12/bayesian-glms-1/
Non-centered Parametrization with PyMC3: https://twiecki.io/blog/2017/02/08/bayesian-hierchical-non-centered/
Using Bayesian Decision Making: https://twiecki.io/blog/2019/01/14/supply_chain/
PyMC3 - Probabilistic Programming in Python: https://docs.pymc.io/
Symbolic PyMC: https://pymc-devs.github.io/symbolic-pymc/
PyData Deep Dive Podcast: https://pydata-podcast.com
Thomas on Twitter: https://twitter.com/twiecki?lang=en
Thomas on Patreon: https://www.patreon.com/twiecki
Thomas on GitHub: https://github.com/twiecki
Alex’s Hierarchical Model of Elections in Paris: https://mybinder.org/v2/gh/AlexAndorra/pollsposition_models/master?urlpath=%2Fvoila%2Frender%2Fdistrict-level%2Fmunic_model_analysis.ipynb
Feb 26, 2020 • 44min

#10 Exploratory Analysis of Bayesian Models, with ArviZ and Ari Hartikainen

How do you handle your MCMC samples once your Bayesian model has fit properly? Which diagnostics do you check to see if there was a computational problem? And isn’t it nice when you have beautiful and reliable plots to complement your analysis and better understand your model?

I know what you think: plotting can be long and complicated in these cases. Well, not with ArviZ, a platform-agnostic package to do exploratory analysis of your Bayesian models. And in this episode, Ari Hartikainen will tell you why.

Ari is a data scientist in geophysics and a researcher at the Department of Civil Engineering of Aalto University in Finland. He mainly works on geophysics, Bayesian statistics and visualization. Ari’s also a prolific open-source contributor, as he’s a core developer of the popular Stan and ArviZ libraries. He’ll tell us how PyStan interacts with ArviZ, what he thinks ArviZ’s most useful features are, and which common difficulties he encounters with his models and data.

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Ari on GitHub: https://github.com/ahartikainen
Ari on Twitter: https://twitter.com/a_hartikainen
ArviZ -- Exploratory analysis of Bayesian models: https://arviz-devs.github.io/arviz/
Introductory paper of ArviZ in The Journal of Open Source Software: https://www.researchgate.net/publication/330402908_ArviZ_a_unified_library_for_exploratory_analysis_of_Bayesian_models_in_Python
Stan -- Statistical Modeling Platform: https://mc-stan.org/
GPflow -- Gaussian processes in TensorFlow: https://www.gpflow.org/
GPy -- Gaussian processes framework in Python: https://sheffieldml.github.io/GPy/
Feb 12, 2020 • 54min

#9 Exploring the Cosmos with Bayes and Maggie Lieu

Have you always wondered what dark matter is? Can we even see it, let alone measure it? And what would discovering it imply for our understanding of the Universe?

In this episode, we’ll take a look at the cosmos with Maggie Lieu. She’ll tell us what research in astrophysics is made of, what model she worked on at the European Space Agency, and how Bayesian the world of space science is.

Maggie Lieu did her PhD in the Astronomy & Space Department of the University of Birmingham. She’s now a Research Fellow of Machine Learning & Cosmology at the University of Nottingham and is working on projects in preparation for Euclid, a space-based telescope whose goal is to map the dark Universe and help us learn about the nature of dark matter and dark energy.

In a nutshell, she tries to help us better understand the entire cosmos. Even more amazing, she uses the Stan library and applies Bayesian statistical methods to decipher her astronomical data! But Maggie is not just a Bayesian astrophysicist: she also loves photography and rock-climbing!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Maggie's Website: https://maggielieu.com/
Maggie's Google Scholar Page: https://scholar.google.co.uk/citations?user=ilfwfuUAAAAJ&hl=en
Maggie on Twitter: https://twitter.com/Space_Mog
Maggie on GitHub: https://github.com/MaggieLieu
Maggie on YouTube: https://www.youtube.com/channel/UClO6TuRE6XLzbMBmQ_KY38A
Stan -- Statistical Modeling Platform: https://mc-stan.org/
Stan's YouTube Channel: https://www.youtube.com/channel/UCwgN5srGpBH4M-Zc2cAluOA
Jan 29, 2020 • 49min

#8 Bayesian Inference for Software Engineers, with Max Sklar

What is it like using Bayesian tools when you’re a software engineer or computer scientist? How do you apply these tools in the online ad industry? More generally, what is Bayesian thinking, philosophically? And is it really useful in everyday life? Because, well, you can’t fire up MCMC each time you need to make a quick decision under uncertainty… So how do you do that in practice, when you have at most a pen and paper?

In this episode, you’ll hear Max Sklar’s take on these questions. Max is a software engineer with a focus on machine learning and Bayesian inference. Now working at Foursquare’s innovation lab, he recently led the development of a causality model for Foursquare’s Ad Attribution product and taught a course on Bayesian Thinking at the Lviv Data Science Summer School.

Max is also an open-source enthusiast and a fellow podcaster: he’s the host of the Local Maximum podcast, where you can hear every week about the latest trends in AI, machine learning and technology from an engineering perspective.

Oh, and if you liked the movie « Her », with Joaquin Phoenix, well, you’re in for a treat at the end of this episode…

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Local Maximum podcast website: https://www.localmaxradio.com
Max on Twitter: https://twitter.com/maxsklar
Bayesian linear models: https://github.com/maxsklar/BayesPy/tree/master/LinearModels
Bayesian Dirichlet-Multinomial estimation: https://github.com/maxsklar/BayesPy/tree/master/DirichletEstimation
Bayesian Thinking for Applied Machine Learning slides: https://docs.google.com/presentation/d/1eiceuvXlsoFKoHdqjF3qXBkyht7vR0YXQPG82ady-TU/edit?usp=sharing
Jan 16, 2020 • 46min

#7 Designing a Probabilistic Programming Language & Debugging a Model, with Junpeng Lao

You can’t study psychology up until your PhD and end up doing very mathematical and computational data science at Google, right? It’s too hard a U-turn. Some would even say it’s NUTS, just because they like bad puns… Well, think again, because Junpeng Lao did just that!

Before doing data science at Google, Junpeng was a cognitive psychology researcher at the University of Fribourg, Switzerland. Working in Python, Matlab and occasionally in R, Junpeng is a prolific open-source contributor, particularly to the popular TensorFlow and PyMC3 libraries. He also maintains the PyMC Discourse in his free time, where he amazingly answers all kinds of very specific questions!

In this episode, he’ll tell you what the core characteristics of TensorFlow Probability are, and when you would use TFP instead of another probabilistic programming framework, like Stan or PyMC3. He’ll also explain why PyMC4 will be based on TensorFlow Probability itself, and what future contributions he has in mind for these two amazing libraries. Finally, Junpeng will share with you his workflow for debugging a model, or just for better understanding your models.

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Junpeng's blog: https://junpenglao.xyz/
Junpeng on Twitter: https://twitter.com/junpenglao
Junpeng on GitHub: https://github.com/junpenglao
Advanced Bayesian Modeling Tutorial: https://discourse.pymc.io/t/advance-bayesian-modelling-with-pymc3/1439
Stan Devs' Prior Choice Recommendations: https://github.com/stan-dev/stan/wiki/Prior-Choice-Recommendations
PyMC Discourse: https://discourse.pymc.io/
PyMC3 - Probabilistic Programming in Python: https://docs.pymc.io/
TensorFlow Probability: https://www.tensorflow.org/probability/
Jan 3, 2020 • 1h 4min

#6 A principled Bayesian workflow, with Michael Betancourt

The podcast discusses a principled Bayesian workflow with Michael Betancourt, highlighting the challenges of building models and the importance of questioning default settings. Michael shares insights on Bayesian vs. frequentist methods in inference, mastering the Bayesian workflow, diverse projects in the Stan team, and personal endeavors. The episode also covers custom model building, upcoming courses on advanced topics, and resources for Bayesian methods.
Dec 17, 2019 • 47min

#5 How to use Bayes in the biomedical industry, with Eric Ma

I have two questions for you: Are you a self-learner? Then how do you stay up to date? What should you focus on if you’re a beginner, or if you’re more advanced?

And here is my second question: Are you working in biomedicine? And if you are, are you using Bayesian tools? Then how do you get your co-workers more used to posterior distributions than p-values? In other words, how do you change behaviors in a large organization?

In this episode, Eric Ma will answer all these questions and even tell us his favorite modeling techniques, which problems he encountered with these models, and how he solved them. He’ll also share with us the software-engineering workflow he uses at Novartis to share his work with colleagues.

Eric is a data scientist at the Novartis Institutes for Biomedical Research, where he focuses on Bayesian statistical methods to make medicines for patients. Eric is also a prolific open-source developer: he led the development of pyjanitor, an API for cleaning data in Python, and nxviz, a visualization package for NetworkX. He also contributes to PyMC3, matplotlib and bokeh.

This is « Learning Bayesian Statistics », episode 5, recorded October 21, 2019.

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Eric's website: https://ericmjl.github.io/
Eric on Twitter: https://twitter.com/ericmjl
Bayesian analysis recipes: https://github.com/ericmjl/bayesian-analysis-recipes
Bayesian deep learning demystified: https://github.com/ericmjl/bayesian-deep-learning-demystified
Causality repo: https://github.com/ericmjl/causality
Pyjanitor - Convenient data cleaning routines for repetitive tasks: https://pyjanitor.readthedocs.io/
PyMC3 - Probabilistic Programming in Python: https://docs.pymc.io/
Panel - A high-level app and dashboarding solution for Python: https://panel.pyviz.org/
Nxviz - Visualization Package for NetworkX: https://nxviz.readthedocs.io/en/latest/
