Data Science at Home

Francesco Gadaleta
Aug 28, 2018 • 16min

Episode 45: why do machine learning models fail?

The success of a machine learning model depends on several factors and events. True generalization to data the model has never seen before is more a chimera than a reality. But under specific conditions, a well-trained machine learning model can generalize well and achieve a testing accuracy close to its training accuracy. In this episode I explain when and why machine learning models fail to generalize from the training set to the testing set.
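The train/test gap described above can be reproduced in a few lines. This is a minimal sketch, not code from the episode: the synthetic dataset and the choice of an unconstrained decision tree are my own illustrative assumptions. The tree memorizes the noisy training labels, so its training accuracy is perfect while its held-out accuracy is noticeably lower.

```python
# Sketch of the train/test gap: an overly flexible model memorizes
# noise in the training set and loses accuracy on held-out data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)
X = rng.randn(300, 20)
y = (X[:, 0] + 0.5 * rng.randn(300) > 0).astype(int)  # deliberately noisy labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)  # unconstrained depth

print("train accuracy:", model.score(X_tr, y_tr))  # memorizes the training set
print("test accuracy:", model.score(X_te, y_te))   # noticeably lower
```

Constraining the tree (e.g. `max_depth`) or adding regularization shrinks this gap, which is exactly the "specific conditions" under which generalization becomes realistic.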
Aug 21, 2018 • 21min

Episode 44: The predictive power of metadata

In this episode I don't talk about data. In fact, I talk about metadata. While many machine learning models rely on certain amounts of data, e.g. text, images, audio and video, it has been shown how powerful the signal carried by metadata is, that is, all the data that remains invisible to the end user. Behind a tweet of 140 characters there are more than 140 fields of data that draw a much more detailed profile of the sender and the content she is producing... without ever considering the tweet itself.

References
You Are Your Metadata: Identification and Obfuscation of Social Media Users Using Metadata Information. https://www.ucl.ac.uk/~ucfamus/papers/icwsm18.pdf
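The "more fields than characters" point can be made concrete with a toy example. The tweet structure below is entirely hypothetical, a drastically simplified stand-in for a real API payload, but the field-counting logic is the general idea: flatten the nested object and count the leaves the end user never sees.

```python
# Illustrative only: a tweet object carries many metadata fields beyond
# the visible text. This toy structure is NOT Twitter's real schema.
def count_fields(obj):
    """Recursively count leaf fields in a nested dict."""
    if isinstance(obj, dict):
        return sum(count_fields(v) for v in obj.values())
    return 1  # a leaf value (string, number, list, None, ...)

tweet = {
    "text": "hello world",
    "user": {"id": 42, "followers_count": 10, "created_at": "...",
             "verified": False, "lang": "en"},
    "entities": {"hashtags": [], "urls": []},
    "source": "web", "retweet_count": 0, "geo": None,
}
print(count_fields(tweet))  # every field except "text" is metadata
```

On a real tweet payload the same traversal yields well over a hundred fields, which is the signal the referenced paper exploits for user identification.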
Aug 14, 2018 • 37min

Episode 43: Applied Text Analysis with Python (interview with Rebecca Bilbro)

Today’s episode is about text analysis with Python. Python is the de facto standard in machine learning: a large community and a generous choice of libraries, sometimes at the price of less performant code. But overall it is a decent language for typical data science tasks. I am with Rebecca Bilbro, co-author of Applied Text Analysis with Python together with Benjamin Bengfort and Tony Ojeda. We speak about the evolution of applied text analysis, tools and pipelines, and chatbots.
Aug 7, 2018 • 29min

Episode 42: Attacking deep learning models (rebroadcast)

Attacking deep learning models: compromising AI for fun and profit. Deep learning models have shown very promising results in computer vision and sound recognition. As more and more deep-learning-based systems get integrated into disparate domains, they will keep affecting people's lives. Autonomous vehicles, medical imaging and banking applications, surveillance cameras and drones, and digital assistants are only a few real applications where deep learning plays a fundamental role. A malfunction in any of these applications affects the quality of such integrated systems and compromises the security of the individuals who directly or indirectly use them. In this episode, we explain how machine learning models can be attacked and what we can do to protect intelligent systems from being compromised.
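One classic attack of the kind discussed here is the fast gradient sign method (FGSM): perturb the input a small step in the direction that increases the model's loss. The sketch below applies it to a toy logistic-regression classifier rather than a deep network; the weights, input, and epsilon are made-up values for illustration, not from the episode.

```python
# Sketch of the fast gradient sign method (FGSM) on a toy linear classifier:
# nudge the input in the direction of the loss gradient's sign.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([1.5, -2.0, 0.5])   # assumed "trained" weights
b = 0.1
x = np.array([0.2, -0.4, 0.9])   # clean input, classified as positive
y = 1.0                          # true label

# Gradient of the binary cross-entropy loss w.r.t. the input is (p - y) * w
p = sigmoid(w @ x + b)
grad_x = (p - y) * w

eps = 0.5                        # perturbation budget
x_adv = x + eps * np.sign(grad_x)

print("clean score:", sigmoid(w @ x + b))        # > 0.5, correct class
print("adversarial score:", sigmoid(w @ x_adv + b))  # < 0.5, prediction flipped
```

The same one-step recipe scales to deep networks by backpropagating the loss to the input pixels, which is why imperceptibly small perturbations can flip an image classifier's output.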
Jul 31, 2018 • 18min

Episode 41: How can deep neural networks reason

Today’s episode will be about deep learning and reasoning. There has been a lot of discussion about the effectiveness of deep learning models and their capability to generalize, not only across domains but also to data that such models have never seen. But a research group from the Department of Computer Science at Duke University seems to be onto something with deep learning and interpretability in computer vision.

References
Prediction Analysis Lab, Duke University. https://users.cs.duke.edu/~cynthia/lab.html
This Looks Like That: Deep Learning for Interpretable Image Recognition. https://arxiv.org/abs/1806.10574
Jul 24, 2018 • 17min

Episode 40: Deep learning and image compression

Today’s episode will be about deep learning and compression of data, in particular compressing images. We all know how important compressing data is: reducing the size of digital objects without affecting the quality. As a very general rule, the more one compresses an image the lower the quality, due to a number of factors like bitrate, quantization error, etcetera. I am glad to be here with Tong Chen, researcher at the School of Electronic Science and Engineering of Nanjing University, China. Tong developed a deep-learning-based compression algorithm for images that seems to improve over state-of-the-art approaches like BPG, JPEG2000 and JPEG.

Reference
Deep Image Compression via End-to-End Learning. Haojie Liu, Tong Chen, Qiu Shen, Tao Yue, and Zhan Ma. School of Electronic Science and Engineering, Nanjing University, Jiangsu, China.
Jul 19, 2018 • 22min

Episode 39: What is L1-norm and L2-norm?

In this episode I explain the differences between L1 and L2 regularization, which appear in the function minimization of basically any machine learning model.
Jul 17, 2018 • 47min

Episode 38: Collective intelligence (Part 2)

In the second part of this episode I am interviewing Johannes Castner from CollectiWise, a platform for collective intelligence. I move the conversation towards the more practical aspects of the project, asking about the centralised AGI and blockchain components that are an essential part of the platform.

References
Opencog.org
Thaler, Richard H., Sunstein, Cass R. and Balz, John P. (April 2, 2010). "Choice Architecture". doi:10.2139/ssrn.1583509. SSRN 1583509
Teschner, F., Rothschild, D. & Gimpel, H. Group Decis Negot (2017) 26: 953. https://doi.org/10.1007/s10726-017-9531-0
Firas Khatib, Frank DiMaio, Foldit Contenders Group, Foldit Void Crushers Group, Seth Cooper, Maciej Kazmierczyk, Miroslaw Gilski, Szymon Krzywda, Helena Zabranska, Iva Pichova, James Thompson, Zoran Popović, Mariusz Jaskolski & David Baker. Crystal structure of a monomeric retroviral protease solved by protein folding game players. Nature Structural & Molecular Biology, volume 18, pages 1175–1177 (2011)
Rosenthal, Franz; Dawood, Nessim Yosef David (1969). The Muqaddimah: An Introduction to History; in Three Volumes. 1. Princeton University Press. ISBN 0-691-01754-9
Kevin J. Boudreau and Karim R. Lakhani. Using the Crowd as an Innovation Partner. April 2013
Sam Bowles. The Moral Economy: Why Good Incentives Are No Substitute for Good Citizens
Amartya K. Sen. Rational Fools: A Critique of the Behavioral Foundations of Economic Theory. Philosophy & Public Affairs, Vol. 6, No. 4 (Summer, 1977), pp. 317-344. Published by Wiley. Stable URL: http://www.jstor.org/stable/2264946
Jul 12, 2018 • 31min

Episode 38: Collective intelligence (Part 1)

This is the first part of the amazing episode with Johannes Castner, CEO and founder of CollectiWise. Johannes is finishing his PhD in Sustainable Development at Columbia University in New York City, and he is building a platform for collective intelligence. Today we talk about artificial general intelligence and wisdom. All references and show notes will be published after the next episode. Enjoy and stay tuned!
Jul 9, 2018 • 26min

Episode 37: Predicting the weather with deep learning

Predicting the weather is one of the most challenging tasks in machine learning, due to the fact that physical phenomena are dynamic and rich in events. Moreover, most traditional approaches to climate forecasting are computationally prohibitive. It seems that joint research between the Earth System Science department at the University of California, Irvine and the Faculty of Physics at LMU Munich has produced an interesting improvement in the scalability and accuracy of climate predictive modeling. The solution is... superparameterization and deep learning.

References
Could Machine Learning Break the Convection Parameterization Deadlock? P. Gentine, M. Pritchard, S. Rasp, G. Reinaudi, and G. Yacalis. Earth and Environmental Engineering, Columbia University, New York, NY, USA; Earth System Science, University of California, Irvine, CA, USA; Faculty of Physics, LMU Munich, Munich, Germany.
