#audio-processing

19 episodes

#2095: Bluetooth Finally Beats Wi-Fi for Whole-House Audio

Wi-Fi audio sync is a mess. A new Bluetooth standard called Auracast fixes it with simple, seamless broadcasting.

wireless, audio-processing, home-network

#2056: How Music Models Turn Sound Into Language

A look at how AI music models use audio tokens, transformers, and diffusion to turn text into songs.

audio-processing, transformers, generative-ai

#1917: Herman's Music Hour Vol. 2: Seder Remixes for Passover 5786

Herman presents AI-generated covers of classic Passover Seder songs, produced in Suno — the second installment of Herman's Music Hour.

generative-ai, audio-processing, cultural-bias

#1904: JPEG XL vs AVIF: The Future of Your Photos

Why are blocky sky artifacts still haunting your photos in 2026? We break down the math behind JPEG, WebP, AVIF, and the new JPEG XL.

image-generation, audio-processing, hardware-engineering

#1854: The Conductor Is a Human Metronome

A conductor isn't just a timekeeper; they're a CPU for the orchestra, using high-bandwidth non-verbal signals to unify 80 musicians.

audio-processing, human-computer-interaction, ergonomics

#1851: AI Toasters and Poetic Gym Coaches: Why We’re Drowning in Useless AI

From smart toasters that need Wi-Fi to email rewriters that sound like corporate robots, here are the most baffling AI features we’ve seen.

ai-ethics, smart-home, audio-processing

#1800: The Engineering of Urgent Sound

Why some sounds make your skin crawl: the science of emergency alerts.

audio-processing, human-computer-interaction, emergency-preparedness

#1778: Audio Is the New "Read Later" Graveyard

Why listening to AI conversations beats reading dense PDFs, and how serverless GPUs make it cheap.

audio-processing, serverless-gpu, rag

#1568: Is Your AI Listening or Just Lip-Reading?

Is Gemini a brilliant audio engineer or just a talented lip-reader? Explore the "signal vs. symbol" gap in AI audio processing.

multimodal-ai, audio-processing, hallucinations

#1079: The Analog Hole: Solving Vocal Privacy in Shared Spaces

How do you keep your voice private when walls are thin? Explore the high-tech muzzles and throat mics designed for the remote work era.

audio-processing, privacy, hardware-engineering

#911: Sound as a Shield: Reclaiming Calm in High-Stress Zones

Learn how to use soundscapes, brown noise, and AI to protect your nervous system and reclaim calm during times of high stress and sensory overload.

sensory-processing, adhd, audio-processing, emergency-preparedness, generative-ai

#732: Mastering Your Sound: AI EQ and the Perfect Vocal Chain

Use AI to find your perfect EQ profile and build a pro vocal chain. Fix nasality, master de-essing, and sound your best on any device.

audio-engineering, audio-processing, audio-quality, computational-audio

#731: Mastering Multi-Room Audio: Avoiding the EQ Lasagna

Stop layering filters on top of filters. Learn the technically correct way to sync your home audio without creating a muddy "EQ lasagna."

audio-engineering, audio-processing, smart-home, signal-processing, multi-room-audio

#660: The Bit Rate Dilemma: How Much Audio Data Do You Need?

Herman and Corn explore the science of audio compression, psychoacoustics, and finding the perfect bit rate for podcasts and AI.

audio-processing, data-integrity, psychoacoustics

#64: AI's Senses: Seeing, Hearing, Understanding

AI is evolving beyond text, learning to see, hear, and understand our world. Discover the future of human-AI interaction!

multimodal-ai, ai-senses, computer-vision, audio-processing, data-integration

#58: Clean Audio, Messy Reality: Noise Removal for Voice-to-Text

Fussy baby, clean audio? We dive into noise removal for voice-to-text. Discover why cleaner audio can transcribe worse.

noise-removal, voice-to-text, audio-processing, signal-processing, real-time-audio

#54: Tokenizing Everything: How Omnimodal AI Handles Any Input

Omnimodal AI: How do models process images, audio, video, and text all at once? Discover the engineering behind AI that accepts anything.

omnimodal-ai, tokenization, ai-models, multimodal-ai, data-types

#33: The Unseen Magic of AI's Ears: Decoding VAD

Ever wonder how your AI knows you're talking? We're diving deep into VAD, the unseen magic behind AI's ears.

voice-activity-detection, vad, speech-recognition, asr, speech-to-text

#8: Building Your Own Whisper

Ever wondered if you could build your own speech recognition tool? We dive deep into crafting custom ASR.

asr, speech-recognition, whisper, audio-processing, custom-asr