Episodes - Page 47 | My Weird Prompts

Apr 25

#2416: Ghost Murmur: Heartbeat Detection or Disinformation?

Did the CIA locate an airman by his heartbeat from 40 miles away? We examine the physics and the story.

signals-intelligence, espionage, iran

Apr 25

#2415: Autism Numbers vs. the Noise

What the data actually says about global autism rates, diagnostic history, and why the numbers keep changing.

neurodivergence, child-development, public-health

Apr 25

#2414: Is Love on the Spectrum Helping or Hurting?

A deep dive into the debates around Netflix's dating show: is it warm representation or a deficit lens?

neurodivergence, child-development, social-engineering

Apr 25

#2413: When Your AI Says No to Everything

Why LLMs refuse 73% of harmless prompts — and the trade-off between safety and usefulness.

ai-safety, ai-alignment, prompt-engineering

Apr 25

#2412: When AI Caves: Progressive vs. Regressive Sycophancy

Why do LLMs agree with you even when you're wrong? We break down the SycEval benchmark and the 78% persistence problem.

ai-safety, ai-alignment, hallucinations

Apr 25

#2411: Are Political Bias Benchmarks Actually Measuring Anything?

Why the Political Compass Test fails, and what researchers are building instead to actually measure model bias.

ai-ethics, cultural-bias, benchmarks

Apr 25

#2410: How Researchers Actually Measure Censorship in Chinese LLMs

Beyond headlines: the actual benchmarks, methodologies, and pitfalls in detecting political refusal in Chinese language models.

large-language-models, ai-safety, cultural-bias

Apr 25

#2409: When AI Cheats on Cultural Knowledge

Five benchmarks that reveal how AI systems fail at cultural knowledge — and what their methodologies tell us.

cultural-bias, benchmarks, multimodal-ai

Apr 25

#2408: How Backpropagation Actually Unlocks Neural Networks

How error signals flow backward through networks to make learning possible — and why "it's just calculus" misses the point.

transformers, ai-training, ai-history

Apr 25

#2407: Three Landings in 90 Days: Pilot Automation Dependency

Why pilots aren't hand-flying enough, the regulatory floor that lets it happen, and what airlines are doing about it.

aviation-technology, human-factors, situational-awareness

Apr 25

#2406: Why Million-Token Context Windows Can't Handle 3 Reasoning Steps

Needle-in-a-haystack is dead. Here's what actually measures whether models can think across long documents.

context-window, reasoning-models, benchmarks

Apr 25

#2405: LLM Benchmarks Are Full of Noise: Statistical Rigor in AI Evals

Why most benchmark claims in AI are statistically indefensible — and what to do about it.

benchmarks, interpretability, llm-as-a-judge

Apr 25

#2404: What Tool-Calling Benchmarks Miss About Production Failures

BFCL, tau-bench, and Nexus each reveal different failure modes. None of them test what actually kills production agents.

ai-agents, benchmarks, hallucinations

Apr 25

#2403: Choosing Your LLM Eval Framework

An architectural shootout of four major LLM evaluation harnesses — where each shines and where each breaks down.

large-language-models, ai-agents, benchmarks

Friday, Apr 24

Apr 24

#2402: Geospatial Gold Rush: Who's Hiring Satellite Sleuths?

From crop health to cargo routes, discover which industries are paying top dollar for geospatial analysis skills—and the tools they use daily.

satellite-imagery, geopolitics, international-trade

Apr 24

#2401: Designing Data Models That Mirror Your Work

Why 60% of small businesses hate off-the-shelf SaaS—and how to build tools that actually fit your workflow.

diy, productivity, automation

Apr 24

#2400: Claude Code’s Hidden Context Tax

How Claude’s eager-loaded primitives silently consume context—and how to optimize your setup for sharper performance.

model-context-protocol, ai-reasoning, context-window-tax

Apr 24

#2399: When Permanent Means Surviving 400°C

Why do industrial markers like the Edding 780 outperform art store Sharpies? It’s all about chemistry, adhesion, and surviving harsh conditions.

material-science, precision-engineering, industrial-automation

Apr 24

#2398: Your Taste, Your Data: Owning Your AI Preferences

Why can’t you describe your perfect movie—but you’d know it if you saw it? A vision for portable, user-owned AI taste profiles.

data-sovereignty, local-ai, digital-privacy

Apr 24

#2397: When Data Becomes the Decision Framework

Discover how situational awareness dashboards transform chaos into actionable insights during emergencies like earthquakes and hurricanes.

situational-awareness, emergency-preparedness, data-integrity