

The Data Exchange with Ben Lorica
Ben Lorica
A series of informal conversations with thought leaders, researchers, practitioners, and writers on a wide range of topics in technology, science, and of course big data, data science, artificial intelligence, and related applications. Anchored by Ben Lorica (@BigData), the Data Exchange also features a roundup of the most important stories from the worlds of data, machine learning, and AI. Detailed show notes for each episode can be found on the Data Exchange web site (https://thedataexchange.media/). The Data Exchange podcast is a production of Gradient Flow [https://gradientflow.com/].
Episodes

Dec 3, 2020 • 38min
Improving the robustness of natural language applications
In this episode of the Data Exchange I speak with Jack Morris, a member of Google’s AI Residency program. He is co-creator of TextAttack, an open source framework for adversarial attacks, data augmentation, and adversarial training in NLP (paper, code). Subscribe: Apple • Android • Spotify • Stitcher • Google • RSS. Download the 2020 NLP Survey Report and learn how companies are using and implementing natural language technologies. Detailed show notes can be found on The Data Exchange web site. Subscribe to The Gradient Flow Newsletter.

Nov 26, 2020 • 44min
End-to-end deep learning models for speech applications
In this episode of the Data Exchange I speak with Yishay Carmiel, an AI Leader at Avaya, a company focused on digital communications. He has long been immersed in speech technologies and conversational applications and I have frequently used him as a resource to understand the latest in speech systems. We previously co-wrote an article that listed out recommendations for teams building speech applications. We also had a previous conversation on the impact of deep learning and big data on speech technologies.

Nov 19, 2020 • 46min
Securing machine learning applications
In this episode of the Data Exchange I speak with Ram Shankar, a Berkman Klein Center affiliate, and a researcher and engineer who works at the intersection of Machine Learning and Security. This episode is focused on the current state of tools and techniques for securing machine learning applications.

Nov 12, 2020 • 30min
Testing Natural Language Models
In this episode of the Data Exchange I speak with Marco Ribeiro, Senior Researcher at Microsoft Research, and lead author of the award-winning paper “Beyond Accuracy: Behavioral Testing of NLP models with CheckList”. As machine learning gains importance across many application domains and industries, there is a growing need to formalize how ML models get built, deployed, and used. MLOps is an emerging set of practices focused on productionizing the machine learning lifecycle that draws ideas from CI/CD. But even before we talk about deploying a model to production, how do we inject more rigor into the model development process?

Nov 5, 2020 • 33min
Detecting Fake News
In this episode of the Data Exchange I speak with Xinyi Zhou, a graduate student in Computer and Information Science at Syracuse University. Xinyi and her advisor (Reza Zafarani) recently wrote a comprehensive survey paper entitled “A Survey of Fake News: Fundamental Theories, Detection Methods, and Opportunities”. They set out to organize the many different methods and perspectives used to detect fake news. Their paper is a great resource for anyone wanting to understand the strengths and limitations of various state-of-the-art techniques, and to get a feel for where the research community might be headed in the near future.

Oct 29, 2020 • 43min
The Computational Limits of Deep Learning
In this episode of the Data Exchange I speak with Neil Thompson, Research Scientist at the Computer Science and Artificial Intelligence Lab (CSAIL) and the Initiative on the Digital Economy, both at MIT. I wanted Neil on the podcast to discuss a recent paper he co-wrote entitled “The Computational Limits of Deep Learning” (summary version here). This paper provides estimates of the amount of computation, economic costs, and environmental impact that come with increasingly large and more accurate deep learning models.

Oct 22, 2020 • 47min
Making deep learning accessible
In this episode of the Data Exchange I speak with Piero Molino, creator of Ludwig, a toolbox that allows users to train and test deep learning models through a declarative interface. Piero created Ludwig while serving as a Senior Research Scientist at Uber AI. He originally created Ludwig for his personal use and it slowly garnered users within Uber. When it was open sourced in early 2019, the project immediately found a receptive audience at the conferences I was chairing at the time.

Oct 15, 2020 • 50min
Building and deploying knowledge graphs
In this episode of the Data Exchange I speak with Mayank Kejriwal, a Research Assistant Professor in the Department of Industrial and Systems Engineering, and a Research Lead at the USC Information Sciences Institute. The focus of our conversation is knowledge graphs: collections of linked entities (objects, events, concepts) that are used in many AI applications. For example, Google uses a knowledge graph to enhance its search engine results with infoboxes that appear in some search results. Other areas where knowledge graphs are common include e-commerce, healthcare, and financial services.

Oct 8, 2020 • 37min
Financial Time Series Forecasting with Deep Learning
In this episode of the Data Exchange I speak with Murat Özbayoğlu, Chair of Artificial Intelligence Engineering at TOBB University of Economics and Technology in Ankara, Turkey. I wanted Murat on to discuss two survey papers he and his colleagues wrote on the use of deep learning in finance. I’ve long been fascinated with finance and trading. My first job after I left academia was as the lead quant in a hedge fund, and ever since, I’ve tried to stay abreast of what tools and techniques quants and data scientists in finance are using. Forecasting in this setting usually means price prediction or price movement (trend) prediction. The outputs of forecasting models are used to inform investment decisions. What makes finance particularly challenging is that many people are using the same underlying data (time series of prices/values), and thus, as Murat notes, many firms use alternative data sources (such as text) as potential sources of additional signal.

Oct 1, 2020 • 50min
A programming language for scientific machine learning and differentiable programming
In this episode of the Data Exchange I speak with Viral Shah, co-founder and CEO of Julia Computing. Along with his Julia language co-creators, Viral was awarded the 2019 Wilkinson Prize for outstanding contributions in the field of numerical software. I first tweeted about Julia at the beginning of March 2012 after seeing Jeff Bezanson give a talk at Stanford. I’ve dabbled with it here and there, but have never used it for a major project. Over the past few years, Julia has continued to add packages at a steady pace, and the package manager is impressive and solid. We spent much of the podcast discussing the state of Julia, Julia 1.5, and the Julia ecosystem and community.


