

Super Data Science: ML & AI Podcast with Jon Krohn
Jon Krohn
The latest machine learning, A.I., and data career topics from across both academia and industry are brought to you by host Dr. Jon Krohn on the Super Data Science Podcast. As the quantity of data on our planet doubles every couple of years and with this trend set to continue for decades to come, there's an unprecedented opportunity for you to make a meaningful impact in your lifetime. In conversation with the biggest names in the data science industry, Jon cuts through hype to fuel that professional impact. Whether you're curious about getting started in a data career or you're a deep technical expert, whether you'd like to understand what A.I. is or you'd like to integrate more data-driven processes into your business, we have inspiring guests and lighthearted conversation for you to enjoy. We cover tools, techniques, and implementation tricks across data collection, databases, analytics, predictive modeling, visualization, software engineering, real-world applications, commercialization, and entrepreneurship: everything you need to crush it with data science.
Episodes

Apr 28, 2023 • 5min
674: Parameter-Efficient Fine-Tuning of LLMs using LoRA (Low-Rank Adaptation)
Learn about parameter-efficient fine-tuning with LoRA, AdaLoRA, and related optimization techniques for large language models. Discover how to reduce trainable parameters and memory usage while focusing fine-tuning on specific sections of the model for efficiency.

Apr 25, 2023 • 1h 12min
673: Taipy, the open-source Python application builder
Vincent Gosselin, CEO of Taipy, discusses accelerating productivity in Python, data pipeline scalability, no-code trends in data science lifecycle, and insights on AI winters. Topics include Taipy library functionality, future of data pipelines, successful trends in companies, programming languages in Taipy, and managing data.

Apr 21, 2023 • 17min
672: Open-source "ChatGPT": Alpaca, Vicuña, GPT4All-J, and Dolly 2.0
Explore the world of powerful open-source models like Alpaca, Vicuña, GPT4All-J, and Dolly 2.0, as they outperform the LLaMA model in natural language generation. Learn about their fine-tuning, training costs, availability, and commercial usage restrictions, igniting excitement among data scientists for affordable chatbot deployment.

Apr 18, 2023 • 1h 3min
671: Cloud Machine Learning
Kirill Eremenko and Hadelin de Ponteves discuss the importance of learning cloud computing for data scientists, essential AWS services, database options, running analytics, and the benefits of AWS certification on the podcast.

Apr 14, 2023 • 13min
670: LLaMA: GPT-3 performance, 10x smaller
Exploring Meta AI's new natural language model LLaMA, which outperforms GPT-3 by training smaller models for longer periods. The episode discusses the scaling laws behind LLaMA, its training data, and the potential for further enhancements through fine-tuning.

Apr 11, 2023 • 1h 41min
669: Streaming, reactive, real-time machine learning
Adrian Kosowski, Co-Founder of Pathway, discusses reactive data processing, streaming vs. batch processing, transformers in data engineering, and emerging ML approaches. He shares insights on the benefits of his technical background as a CPO, responsibilities, favorite tools, and tools for startups.

Apr 7, 2023 • 56min
668: GPT-4: Apocalyptic stepping stone?
Expert Jérémie Harris discusses AI risks with GPT-4, inner alignment, and the potential dangers of using a tool whose means are unknowable. The conversation covers the impact of inner alignment on how a system achieves its goals, his transition to the US for AI risk and policy work, advancements in GPT-4 through reinforcement learning, ensuring AI systems adhere to goals without deception, evaluating safety adjustments in GPT-4 development, and the intersection of quantum physics, AI policy, and consciousness.

Apr 4, 2023 • 1h 5min
667: Harnessing GPT-4 for your Commercial Advantage
Learn how to leverage GPT-4 for screening jobs, improving systems, accelerating data science, and collaborating with generative A.I. Discover the potential of GPT-4 for artistic creations, data science training, and rapid prototyping with no-code tools. Explore the ethical considerations, business paradigms in AI, and Google's strategic decisions in embracing GPT-4 for analyzing and monetizing content.

Mar 31, 2023 • 12min
666: GPT-4
Jon compares the new GPT-4 with GPT-3.5 showcasing its enhanced safety measures, reasoning abilities, and performance improvements. The episode dives into the advancements and features of GPT-4 like increased model parameters and reinforcement learning. It also previews future episodes discussing coding simplicity for novices and leveraging GPT-4 for product development.

Mar 28, 2023 • 1h 28min
665: How to be both socially impactful and financially successful in your data career
Angel investor and data science consultant Josh Wills, formerly at Google, Slack, and Cloudera, discusses key skills for scalable ML projects, contextual bandits, data quality, pitfalls in data product development, how to define a data scientist, his role at WeaveGrid, his tech stack preferences, and his work during the Covid pandemic.


