Speech & Audio

Audio Processing

Noise removal, audio quality, microphones, VAD

8 episodes

Who’s Talking? The Tech of Speaker Identification

Herman and Corn break down the difference between speaker diarization and identification to help automate meeting transcripts.

speaker-diarization, voice-embeddings, speaker-identification

Sonic Sorcery: Mapping Spatial Audio in Small Spaces

Discover how spatial audio and room mapping can turn a tiny rental bedroom into a cinematic powerhouse without drilling a single hole.

spatial-audio, acoustic-telemetry, room-mapping

The Sound Spotlight: How Beamforming Redefines Audio

Discover how math and physics turn simple microphones into "sound spotlights" that can isolate a single voice in even the noisiest environments.

beamforming-technology, microphone-arrays, digital-signal-processing

Designing the Voice-First Workspace: IKEA for AI Pros

Learn how to transform your home office into a high-performance voice-first workspace using acoustic hygiene and ergonomic IKEA furniture hacks.

voice-first, acoustic hygiene, ikea, workspace, ergonomics

Silencing the Siren: Real-Time AI Noise Reduction

How do phones remove sirens and crying babies in real time? Explore the neural networks and hardware making crystal-clear audio possible.

noise reduction, audio engineering, neural networks, mobile devices, edge computing

Beyond the Headset: Pro Audio for AI Voice Control

Tired of headsets? Herman and Corn explore professional microphone setups for seamless, high-accuracy AI voice dictation from a distance.

voice dictation, ai accuracy, microphones, audio quality, signal-to-noise ratio

Clean Audio, Messy Reality: Noise Removal for Voice-to-Text

Fussy baby, clean audio? We dive into noise removal for voice-to-text. Discover why cleaner audio can transcribe worse.

noise removal, voice-to-text, audio processing, signal processing, neural networks

The Unseen Magic of AI's Ears: Decoding VAD

Ever wonder how your AI knows you're talking? We dive deep into voice activity detection (VAD), the unseen magic behind AI's ears.

voice activity detection, VAD, speech recognition, ASR, speech-to-text