

The Data Exchange with Ben Lorica
Ben Lorica
A series of informal conversations with thought leaders, researchers, practitioners, and writers on a wide range of topics in technology, science, and of course big data, data science, artificial intelligence, and related applications. Anchored by Ben Lorica (@BigData), the Data Exchange also features a roundup of the most important stories from the worlds of data, machine learning and AI. Detailed show notes for each episode can be found on https://thedataexchange.media/ The Data Exchange podcast is a production of Gradient Flow [https://gradientflow.com/].
Episodes
Mentioned books

Oct 20, 2022 • 42min
A new storage engine for vectors
Ram Sriharsha is VP of Engineering and R&D at Pinecone, a startup that offers a fully managed vector database (not just an index). We discuss Pinecone’s new proprietary storage engine, which was first described around the time we recorded this conversation.Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.Detailed show notes can be found on The Data Exchange web site.

Oct 13, 2022 • 42min
Project Lightspeed: Next-generation Spark Streaming
Karthik Ramasamy, is the Head of Streaming at Databricks. He has extensive experience in streaming, having led teams at Twitter (Apache Heron), Splunk, and Streamlio (Apache Pulsar).Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.Detailed show notes can be found on The Data Exchange web site.

Oct 6, 2022 • 35min
The Unreasonable Effectiveness of Speech Data
Piotr Żelasko is Head of Research at Meaning, a startup building an AI platform using speech technologies. He has years of experience in speech technologies, both as a researcher and as a software engineer. We recorded this episode on the week of the release of Whisper, deep learning model (from OpenAI) that approaches human level robustness and accuracy on English speech recognition. Our conversation centered on Whisper and speech recognition, but also touched on the new speech data processing tools (Lhotse, k2, Icefall) that we described in our recent post.Download a FREE copy of our recent 2022 Trends Report (Data, Machine Learning, AI): https://gradientflow.com/2022trendsreport/Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.Detailed show notes can be found on The Data Exchange web site.

Sep 29, 2022 • 45min
Machine Learning Integrity
Yaron Singer is the CEO of Robust Intelligence, a company building tools to help manage and mitigate risks associated with machine learning models and applications. Download a FREE copy of our recent 2022 Trends Report (Data, Machine Learning, AI): https://gradientflow.com/2022trendsreport/Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.Detailed show notes can be found on The Data Exchange web site.

Sep 22, 2022 • 39min
Synthetic data technologies can enable more capable and ethical AI
Yashar Behzadi is the CEO & Founder of Synthesis AI, a startup that uses synthetic data technologies to enable teams building AI applications, as well as gaming and metaverse applications.Download a FREE copy of our recent 2022 Trends Report (Data, Machine Learning, AI): https://gradientflow.com/2022trendsreport/Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.Detailed show notes can be found on The Data Exchange web site.

Sep 15, 2022 • 36min
Confidential Computing for Machine Learning
Sadegh Riazi is CEO and co-founder of CipherMode Labs, a startup building tools that enable data and machine learning teams to build and deploy models directly on encrypted data. CipherMode’s new open source project enables teams to develop and deploy machine learning algorithms using familiar tools, and thus opens up the possibility of using sensitive data in different scenarios both within an organization, and in cooperation with other organizations.Download a FREE copy of our recent 2022 Trends Report (Data, Machine Learning, AI): https://gradientflow.com/2022trendsreport/Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.Detailed show notes can be found on The Data Exchange web site.

Sep 8, 2022 • 42min
Applied NLP Research at Primer
John Bohannon is a Senior Director of Data Science and Head of Research at Primer AI, an end-to-end machine intelligence solution for textual data. We discussed their process of translating ML research into ML products, through the lens of several use cases.Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.Detailed show notes can be found on The Data Exchange web site.

Sep 1, 2022 • 31min
Using SQL to Retrieve Data from APIs and Web Services
Jon Udell is community lead for Steampipe, an open-source tool that populates a database table with data retrieved from APIs. They use Postgres, which means that data is easy to explore and retrieve using SQL. Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.Detailed show notes can be found on The Data Exchange web site.

Aug 25, 2022 • 40min
Machine Learning for Time Series Intelligence
Aadyot Bhatnagar, is a Senior Research Engineer at Salesforce, and co-creator of Merlion an open source framework for applying machine learning on time series data. Merlion supports a wide range of time series learning tasks including forecasting, anomaly detection, and change point detection. Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.Detailed show notes can be found on The Data Exchange web site.

Aug 18, 2022 • 39min
Unleashing the power of large language models
Maarten Grootendorst, is a data scientist at IKNL, and more importantly, he’s the author of two open source libraries that I’ve come to love: BERTopic (topic modeling with transformers and c-TF-IDF) and PolyFuzz (fuzzy string matching). Both these projects bring the power of transformers and other leading edge models, and package them with simple APIs, clear documentation, and visualization tools.Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.Detailed show notes can be found on The Data Exchange web site.


