#multimodal-ai
7 episodes
#786: Mastering the Hoard: AI-Powered Inventory Management
Learn how to manage thousands of parts without losing your mind using AI, QR codes, and professional logistics strategies.
#749: Breaking the Fourth Wall: Moving to Real-Time AI Audio
Can AI podcasts move from polished scripts to raw, real-time conversation? Explore the technical and financial shift to live multimodal models.
#132: Beyond Frames: The Rise of Real-Time Video AI
Discover how spatial-temporal tokenization and 3D world modeling are revolutionizing real-time video-to-video AI interaction.
#64: AI's Senses: Seeing, Hearing, Understanding
AI is evolving beyond text, learning to see, hear, and understand our world. Discover the future of human-AI interaction!
#54: Tokenizing Everything: How Omnimodal AI Handles Any Input
Omnimodal AI: How do models process images, audio, video, and text all at once? Discover the engineering behind AI that accepts anything.
#53: Instructional vs. Conversational AI: The Distinction Nobody Talks About
Instructional vs. conversational AI: a crucial distinction reshaping how AI is built. Discover why it matters for the future of AI development.
#46: Pixels, Prompts & Pseudo-Text: AI's Word Problem
AI paints stunning images, but can't spell "cat." Why do advanced models struggle with simple text? Dive into AI's weird word problem!