DataNation - Podcast for Data Engineers, Analysts and Scientists

Alex Merced Podcasts
7 snips
Mar 18, 2024

51 – Open Data Standards (Apache Iceberg, Apache Parquet, Apache Arrow, Apache Ibis, Apache Substrait)

Explore the benefits of open data standards such as Apache Arrow and Apache Iceberg: more efficient data transfer with Apache Arrow Flight and ADBC, better scan planning in data catalogs with the Apache Iceberg spec and Apache Ibis, standardized data frameworks and SQL query processing with Apache Substrait, and the value of standardized open data formats and systems for innovation and efficiency.
Feb 21, 2024

50 – Thinking about the flow of Streaming/Real-Time Data

Alex reflects on the development of real-time data pipelines.
Feb 2, 2024

48 – Understanding how Lakehouse Table Formats are Implemented in your Favorite Tools

Alex Merced discusses how Lakehouse Table Formats like Apache Iceberg, Apache Hudi, and Delta Lake are implemented in your favorite tools. The podcast explores Java libraries, file structures, metadata tables, and file slices. It also covers implementing the formats in different languages, query performance, and the differences between the Apache Iceberg, Apache Hudi, and Delta Lake formats.
5 snips
Jan 21, 2024

47 – Understanding your cloud costs (Storage, Egress, Compute, Serverless, etc.)

Exploring cloud costs, distributed file systems, object storage, and tiered storage models. Understanding cost-effective cloud service models and navigating cloud costs. Emphasizing the importance of optimizing data handling for cost efficiency.
Jan 20, 2024

Bonus: New Youtube Channel, State of the Data Lakehouse

Find all my data resources below: https://bio.alexmerced.com/data
Listen to the State of the Data Lakehouse Podcast here: https://em360tech.com/podcast/dremio-state-data-lakehouse?utm_source=podcasts&utm_medium=podcast&utm_content=content&utm_campaign=alexmercedcontent&utm_term=iceberg+lakehouse+nessie
Jan 9, 2024

2024 Preview – Data/Web Content

youtube.com/@alexmercedcoder youtube.com/@alexmerceddata twitter.com/alexmercedcoder twitter.com/amdatalakehouse
Dec 8, 2023

46 – Apache Iceberg vs Delta Lake: Understanding the Table Format Debate

Exploring the differences between Apache Iceberg and Delta Lake, including their requirements and structure. The podcast also dives into the importance of open source projects, with a focus on Apache Iceberg. Additionally, robust tools for AI and ML workflows, such as Dremio, are discussed.
Nov 1, 2023

45 – BI Dashboard Acceleration (Extracts, Cubes and Reflections)

Alex Merced discusses different techniques to speed up BI Dashboard performance.
Oct 30, 2023

Call for Speakers – Subsurface 2024 (Live in NYC May 2024)

Submit your talks here: https://www.dremio.com/subsurface/
Oct 19, 2023

44 – Multi-Table Versioning and why Abstractions Matter

There is a reason the Git-for-Data paradigm of Nessie catalogs is so essential: not only for the versioning features it provides, but also for the level of abstraction at which it provides them. In this episode, I discuss this more.
