How AI Is Built

#010 Building Robust AI and Data Systems, Data Architecture, Data Quality, Data Storage

May 31, 2024
Data architect Anjan Banerjee discusses building complex AI and data systems and explains data architecture through Lego analogies. Topics include selecting data tools, orchestrating pipelines with Airflow, using AI for data processing, and comparing Snowflake with Databricks. The episode also covers automating data integration to build comprehensive customer views.
ADVICE

Begin Design With Source And Use Case

  • Start by identifying your source systems and their update frequency before choosing tools.
  • Select storage and formats based on the use case: block storage for images, a time-series database for sensor data, open table formats for ML.
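
The two bullets above can be sketched as a small planning step: list each source with its update frequency, then look up a storage family by use case. This is a minimal sketch, not something taken from the episode. The names `SourceSystem`, `pick_storage`, and `plan_storage`, the category keys, and the example sources are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical lookup: data kind -> storage family named in the advice above.
STORAGE_BY_DATA_KIND = {
    "images": "block storage",
    "sensor_data": "time-series database",
    "ml_training": "open table format",
}


@dataclass(frozen=True)
class SourceSystem:
    """One upstream system, described before any tool is chosen."""
    name: str
    data_kind: str        # one of the keys in STORAGE_BY_DATA_KIND
    updates_per_day: int  # how often the source changes


def pick_storage(data_kind: str) -> str:
    """Return the storage family for a data kind, or fail loudly if unknown."""
    try:
        return STORAGE_BY_DATA_KIND[data_kind]
    except KeyError:
        raise ValueError(f"no guidance for data kind: {data_kind!r}") from None


def plan_storage(sources: list) -> dict:
    """Map each source name to a storage family."""
    return {s.name: pick_storage(s.data_kind) for s in sources}


if __name__ == "__main__":
    # Illustrative sources only; names and frequencies are made up.
    sources = [
        SourceSystem("factory_sensors", "sensor_data", updates_per_day=86_400),
        SourceSystem("product_photos", "images", updates_per_day=1),
    ]
    for name, storage in plan_storage(sources).items():
        print(f"{name}: {storage}")
```

Raising an error for unknown data kinds is a deliberate choice in this sketch: it forces the use case to be named before a storage decision is made.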
ADVICE

Match Tools To Team Skills

  • Prefer specialized tools for each task but match them to your team's skills and scale.
  • Use UI-based ETL for less technical teams and Airflow/dbt for teams with Python and data engineering expertise.
ADVICE

Choose Native Serverless Or Agnostic SaaS

  • Use cloud-native serverless orchestrators (e.g., AWS Step Functions, Google Cloud Run) when you are tied to a single cloud; they make scaling simpler.
  • For multi-cloud, choose platform-agnostic SaaS (Astronomer, Prefect, Airbyte) to avoid lock-in.
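
The rule in this snip reduces to counting the clouds a workload runs on. The sketch below is illustrative only: the function name `pick_orchestration` and the returned labels are assumptions, and the single-cloud versus multi-cloud split is the only criterion modeled here.

```python
from typing import Iterable


def pick_orchestration(clouds: Iterable[str]) -> str:
    """Suggest an orchestration style from the set of clouds in use.

    One cloud       -> cloud-native serverless orchestrator
    Several clouds  -> platform-agnostic SaaS, to avoid lock-in
    """
    # Normalize names so "AWS" and "aws " count as the same provider.
    distinct = {c.strip().lower() for c in clouds if c.strip()}
    if not distinct:
        raise ValueError("at least one cloud provider is required")
    if len(distinct) == 1:
        return "cloud-native serverless"
    return "platform-agnostic SaaS"


if __name__ == "__main__":
    print(pick_orchestration(["aws"]))
    print(pick_orchestration(["aws", "gcp"]))
```

In practice the decision would also weigh team skills and scale, as the earlier snips advise; this sketch leaves those out on purpose.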