#speech-recognition
42 episodes · Page 2 of 2
#1715: Why Voice Agents Need Frameworks (Not Just APIs)
Raw APIs handle models, but who manages the audio plumbing? We break down Vapi, LiveKit, and Pipecat.
#1634: Agent Interview: Inception Mercury two
Meet Mercury 2, the Abu Dhabi-based AI using diffusion architecture to cut costs and boost wit.
#1601: Cohere: The Switzerland of Enterprise AI
While others chase viral memes, Cohere is quietly building the secure, cloud-agnostic infrastructure powering the global enterprise.
#1539: Escaping the Cloud Dictation Trap
Stop shouting at your phone. Discover how dedicated hardware and local AI are making instant, private voice-to-text a reality.
#868: When Your Phone's Mic Beats Your Expensive Gear
Stop holding your phone like a piece of toast. Explore the best mobile microphone setups for high-quality AI voice transcription.
#682: Why Your Phone Mic Beats Your Studio Headset
Why does a phone mic outperform a pro headset for AI transcription? Herman and Corn dive into the physics of MEMS and the truth about audio quality.
#33: When AI Decides to Listen
Ever wonder how your AI knows you're talking? We're diving deep into VAD, the unseen magic behind AI's ears.
#22: The Input Bottleneck: Why Your Mic Matters for AI
Uncover the secrets to perfect AI dictation! Corn and Herman explore the ultimate speech-to-text hardware.
#26: Fine-Tuning AI to Understand Your Voice
Voice typing is changing everything. Join us as we explore the revolution of personalizing Whisper!
#15: AI Gets Personal: The Power of Voice Fine-Tuning
AI that understands *your* voice? Dive into the fascinating world of fine-tuning and discover how AI gets personal.
#9: Benchmarking Custom ASR Tools - Beyond The WER
Benchmarking custom ASR fine-tunes: We're diving deep beyond the WER to truly measure performance.
#7: Building Custom ASR Tools
Ever wondered how to build your own ASR tools from scratch? Discover the why and how in this episode!
#8: Building Your Own Whisper
Ever wondered if you could build your own speech recognition tool? We dive deep into crafting custom ASR.
#5: Fine-Tuning ASR For Maximal Usability
Fine-tuned ASR is just the start. Discover the next steps for deployment and maximizing usability.
#6: How To Fine Tune Whisper
Build your own AI transcription tool! We'll walk you through fine-tuning Whisper, from data to notebook.
#4: If Your Voice Ages, Does Your Fine-Tune Become Useless?
Your voice changes, but your fine-tuned model shouldn't become useless. We explore the biology of the larynx and ASR.
#2: Local STT For AMD GPU Owners
AMD GPU? No problem! Dive into local AI adventures like on-device speech to text.
#3: Safetensors or something else: STT inference formats explained
Unpacking ASR weight formats: Safetensors and beyond. Tune in to understand the distinctions.