How AI Is Built

#002 AI Powered Data Transformation, Combining gen & trad AI, Semantic Validation

Apr 12, 2024
Antonio Bustamante, a serial entrepreneur, talks about building bem.ai, a data tool for AI and software. Topics include challenges of integrating semi-structured data, using LLMs in data transformation, reliability of data infrastructure, and interoperability layers for systems.
ANECDOTE

Price Sheets Broken By Pumpkin Decorations

  • Antonio recounts parsing farm price sheets, where the parser broke whenever customers added holiday decorations or changed the formatting.
  • He says they even asked users to stop adding pumpkins, since any such formatting change broke the parsing.
INSIGHT

LLMs Are Part, Not The Whole

  • Antonio says LLMs are only ~30% of the solution and must be combined with validation and infrastructure.
  • He emphasizes building fault-tolerant systems that keep LLM strengths and discard their weaknesses.
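
One way to picture the "LLM plus validation" idea is a thin wrapper that treats the model's output as untrusted and only accepts it after a deterministic check. This is a minimal sketch, not bem.ai's implementation: `PriceRow`, `validate`, `extract_with_validation`, and the `extractor` callable (a stand-in for any LLM call) are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PriceRow:
    """Hypothetical target schema for one line of a price sheet."""
    product: str
    unit_price: float


def validate(raw: dict) -> Optional[PriceRow]:
    """Deterministic check: return a typed row, or None if the output is unusable."""
    product = raw.get("product")
    price = raw.get("unit_price")
    if not isinstance(product, str) or not product.strip():
        return None
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(price, bool) or not isinstance(price, (int, float)) or price <= 0:
        return None
    return PriceRow(product.strip(), float(price))


def extract_with_validation(
    text: str,
    extractor: Callable[[str], dict],  # assumption: wraps an LLM call, returns a dict
    max_attempts: int = 3,
) -> Optional[PriceRow]:
    """Retry the extractor until its output passes validation, else give up."""
    for _ in range(max_attempts):
        try:
            raw = extractor(text)
        except Exception:
            continue  # a failed call counts as a failed attempt
        if not isinstance(raw, dict):
            continue
        row = validate(raw)
        if row is not None:
            return row
    return None  # caller decides the fallback, e.g. route to human review
```

Returning `None` instead of raising leaves the fallback decision (manual review, a rule-based parser, a dead-letter queue) to the surrounding infrastructure, which is where most of the non-LLM work would sit in this framing.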
INSIGHT

Preserve Geometry For LLM Inputs

  • Antonio stresses feeding LLMs documents that preserve geometry and semantic co-location, not just raw OCR text.
  • He argues LLMs need structured context (tables, spatial cues) to interpret complex documents accurately.
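
To make "preserving geometry" concrete, the sketch below turns OCR word boxes into text whose columns still line up, so that a price stays visually under its header. It is an illustrative approach under stated assumptions, not the method described in the episode: the `Word` type, the `render_layout` function, and the pixel-to-character scale factors are all hypothetical, and the row grouping is deliberately crude.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Word:
    """Hypothetical OCR token: text plus the top-left corner of its box, in pixels."""
    text: str
    x: float
    y: float


def render_layout(words: List[Word], char_width: float = 10.0, line_height: float = 20.0) -> str:
    """Place each word on a character grid so columns survive as whitespace."""
    rows: Dict[int, List[Word]] = {}
    for w in words:
        # Words whose tops fall in the same line-height band share a text line.
        rows.setdefault(round(w.y / line_height), []).append(w)

    lines = []
    for key in sorted(rows):
        line = ""
        for w in sorted(rows[key], key=lambda word: word.x):
            col = int(w.x // char_width)
            if line and len(line) >= col:
                line += " "  # overlapping boxes: keep at least one separator
            else:
                line = line.ljust(col)  # pad with spaces up to the target column
            line += w.text
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    sheet = [
        Word("Item", 0, 0), Word("Price", 200, 2),
        Word("Pumpkin", 0, 20), Word("4.50", 200, 21),
    ]
    print(render_layout(sheet))
```

Compared with concatenating the raw OCR tokens, the rendered text keeps "4.50" in the same column as "Price", which is the kind of spatial cue the bullets above refer to.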