

Data Skeptic
Kyle Polich
The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.
Episodes
Mentioned books

Apr 4, 2020 • 40min
Uncertainty Representations
Jessica Hullman joins us to share her expertise on data visualization and communication of data in the media. We discuss Jessica's work on visualizing uncertainty, interviewing visualization designers on why they don't visualize uncertainty, and modeling interactions with visualizations as Bayesian updates. Homepage: http://users.eecs.northwestern.edu/~jhullman/ Lab: MU Collective

Mar 28, 2020 • 34min
AlphaGo, COVID-19 Contact Tracing and New Data Set
Announcing Journal Club
I am pleased to announce that Data Skeptic is launching a new spin-off show called "Journal Club," with similar themes but a very different format from the Data Skeptic everyone is used to. In Journal Club, a regular panel and occasional guest panelists will discuss interesting news items and one featured journal article every week in a roundtable format. Each week, I'll be joined by Lan Guo and George Kemp to discuss interesting data science news articles and a featured journal or pre-print article. We hope this podcast will give listeners an introduction to the works we cover and to how people discuss these works. Our topics will often coincide with the original Data Skeptic podcast's current Interpretability theme, but we have few rules right now on what we pick. We enjoy discussing these items with each other, and we hope you will too. In the coming weeks, we will start opening up the guest chair more often to bring new voices to the discussion. After that, we'll be looking for ways to engage with our audience. Keep reading, and thanks for listening!
Kyle

Mar 20, 2020 • 33min
Visualizing Uncertainty
Lonnie Besançon, a postdoc studying HCI and visualization of statistical uncertainty, explores how design shapes interpretation. He discusses the arbitrary 0.05 threshold, the cliff effect around p-values, and how novel visuals like gradient rectangles and violin-style CIs can soften binary thinking. He also covers audience literacy, aesthetics, and cautious use of new visuals for teaching and transparency.

Mar 13, 2020 • 43min
Interpretability Tooling
Pramit Choudhary is a lead data scientist at H2O AI with expertise in model interpretability and AutoML. He discusses global vs. local interpretation and the influence of LIME. He explains Skater, a unified open-source interpretability library. He covers real-world uses in finance and healthcare, perturbation and robustness testing across text, image, and audio, and integrating interpretability into the model lifecycle.

Mar 6, 2020 • 20min
Shapley Values
Linda Tran joins Kyle for a hands-on chat about using Shapley values to guide home renovation choices. They frame renovation projects as players in a coalition game, and short demos show how ordering and averaging marginal contributions determine credit allocation. They also cover the computational cost of exact Shapley values and how to apply Shapley ideas to model interpretability and real decisions.
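The episode's coalition framing can be sketched in a few lines of Python: each player's Shapley value is its marginal contribution averaged over every ordering of the coalition. The renovation projects and the dollar figures below are hypothetical, made up purely to illustrate the averaging, not taken from the episode.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal
    contribution over all orderings of the full coalition."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            after = value(frozenset(coalition))
            totals[p] += after - before
    return {p: t / len(orderings) for p, t in totals.items()}

# Hypothetical "game": estimated resale lift (in $k) for each
# subset of renovation projects. Numbers are invented for illustration.
lift = {
    frozenset(): 0,
    frozenset({"kitchen"}): 20,
    frozenset({"bath"}): 10,
    frozenset({"paint"}): 5,
    frozenset({"kitchen", "bath"}): 35,
    frozenset({"kitchen", "paint"}): 28,
    frozenset({"bath", "paint"}): 16,
    frozenset({"kitchen", "bath", "paint"}): 45,
}

vals = shapley_values(["kitchen", "bath", "paint"], lambda s: lift[s])
print(vals)  # the three values sum to v(all) = 45
```

The exhaustive loop over orderings is also why the computational cost comes up in the episode: with n players there are n! orderings, so practical tools sample orderings or exploit model structure instead.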

Feb 28, 2020 • 37min
Anchors as Explanations
We welcome back Marco Tulio Ribeiro to discuss research he has done since our original discussion of LIME. In particular, we ask the question "Are Red Roses Red?" and discuss how Anchors provide high-precision, model-agnostic explanations. Please take our listener survey.

Feb 22, 2020 • 37min
Mathematical Models of Ecological Systems

Feb 14, 2020 • 37min
Adversarial Explanations
Walt Woods joins us to discuss his paper Adversarial Explanations for Understanding Image Classification Decisions and Improved Neural Network Robustness with co-authors Jack Chen and Christof Teuscher.

Feb 7, 2020 • 39min
ObjectNet
Andrei Barbu joins us to discuss ObjectNet - a new kind of vision dataset. In contrast to ImageNet, ObjectNet seeks to provide images that are more representative of the types of images an autonomous machine is likely to encounter in the real world. Collecting a dataset in this way required careful use of Mechanical Turk to get Turkers to provide a corpus of images that removes some of the bias found in ImageNet. http://0xab.com/

Jan 31, 2020 • 36min
Visualization and Interpretability
Enrico Bertini is an associate professor focused on data visualization and ML interpretability, and co-host of Data Stories. He discusses word cloud design and when simple charts like bar charts are better. He covers experimental methods for measuring visualization effectiveness. He explores visual tools for inspecting neural networks, surrogate decision trees, and strategies to scale and interact with complex models.


