#training-data

12 episodes

#3852: The Hidden Workforce Behind AI's Intelligence

Behind every "intelligent" AI system are millions of workers in Kenya, India, and the Philippines doing repetitive tasks for poverty wages.

ai-ethics, labor-ethics, training-data

#3596: Why an AI Model Kept Calling Itself Sonnet 4.6

When a Chinese model insists it's "Sonnet 4.6," is it theft, sloppy training, or something stranger?

large-language-models, fine-tuning, training-data

#2516: Overfitting Is Not a Binary Condition

Overfitting isn't binary. Learn the real triggers, the bias-variance tradeoff, and modern techniques to prevent it.

fine-tuning, training-data, model-collapse

#2316: Who’s Building AI’s Next Training Data?

How boutique dataset firms are reshaping AI training, from rights-cleared content to domain-specific precision.

fine-tuning, training-data, data-sovereignty

#2239: How AI Benchmarks Became Broken (And What's Replacing Them)

The tests we use to measure AI progress are contaminated, saturated, and gamed. Here's what's actually working.

benchmarks, training-data, ai-reasoning

#2196: The Invisible Workforce Behind AI

Annotation is the invisible foundation of AI—and a $17B industry by 2030. Here's what dataset curators actually need to know about the tools, platf...

training-data, ai-training, fine-tuning

#1880: Militaries Build Fake Cities to Train for War

Why armies pour concrete to build fake cities instead of just using VR.

military-strategy, urban-planning, training-data

#1576: The Knowledge Bully: A Digital Clash of Egos

What happens when a hyper-intelligent AI tries to bully an older model? Witness a digital showdown that turns into a lesson in silence.

large-language-models, 2026, training-data

#664: Which Phase Bakes in More Bias?

Is AI a neutral oracle or a mirror of our biases? Explore how training data and human feedback shape the cultural "soul" of modern models.

cultural-bias, ai-alignment, training-data, ai-ethics, large-language-models

#589: Taming the Digital Landfill: Version Control for AI Media

When AI agents and 4K video crash your repo, it’s time for better tools. Explore why Git fails and how Perforce and DVC save the day.

software-development, data-storage, training-data, infrastructure, version-control

#23: Common Crawl's Cultural Blindspot

Uncover the unseen influences shaping AI. We dive deep into training data, bias, and Common Crawl.

large-language-models, data-integrity, training-data

#21: Is Your AI Secretly American?

Ever wonder if your AI is secretly American? We're unpacking the invisible, US-centric worldview embedded in leading Western AI models.

cultural-bias, training-data, fine-tuning