#rag
70 episodes
#2315: How to Update AI Models Without Starting Over
Exploring the challenge of updating AI models with new knowledge without costly full retraining.
#2228: Tuning RAG: When Retrieval Helps vs. Hurts
How do you prevent retrieval from suppressing a model's reasoning? We diagnose our own pipeline's four control levers and multi-source fusion strategies.
#2214: Real-Time News at War Speed: Building AI Pipelines for Breaking Conflict
When a conflict changes hourly, AI systems built for yesterday's information fail. Here's how to architect pipelines that actually keep up.
#2213: Grading the News: Benchmarking RAG Search Tools
How do you rigorously evaluate whether Tavily or Exa retrieves better results for breaking news? A formal benchmark beats the vibe check.
#2208: Building Memory for AI Characters That Actually Evolve
How do AI hosts develop real consistency across episodes? Corn and Herman explore retrieval-augmented memory systems that let AI characters genuinely evolve.
#2204: Memory Without RAG: The Real Architecture
mem0, Letta, Zep, and LangMem solve agent memory differently than RAG. Here's what's actually happening under the hood.
#2203: Knowledge Without Tools: Why MCPs Aren't Just for Execution
MCPs can be pure knowledge providers with zero tools. Here's why that matters for agents querying government data and authoritative sources.
#2181: When RAG Becomes an Agent
RAG in chatbots is simple retrieval. RAG in agents is a multi-step decision loop. Here's what actually changes.
#2133: Engineering Geopolitical Personas: Beyond Caricatures
How to build LLMs that simulate state actors with strategic fidelity, not just surface mimicry.
#2129: Building the Anti-Hallucination Stack
Stop hoping your AI doesn't lie. We explore the shift to deterministic guardrails, specialized judge models, and the tools making agents reliable.
#2125: Why Agentic Chunking Beats One-Shot Generation
A single prompt can't write a 30-minute script. Here’s the agentic chunking method that fixes coherence.
#2069: Agentskills.io Spec: From Broken YAML to Production Skills
Stop guessing at the agentskills.io spec. Learn the exact YAML fields, directory structure, and authoring patterns to make Claude Code skills that ...
#2057: How Agents Break Through the LLM Output Ceiling
The output window is the new bottleneck: why massive context doesn't solve long-form generation.
#2026: Prompt Layering: Beyond the Monolithic Prompt
Stop writing giant, monolithic prompts. Learn how to stack modular layers for cleaner, more powerful AI applications.
#2022: OpenClaw: The 16 Trillion Token Autonomy Engine
We dug into a repo of 47 real-world projects showing how OpenClaw powers everything from self-healing servers to overnight app builders.
#2010: Building Better AI Memory Systems
We obsess over AI inputs but treat outputs like Snapchat messages. Here's why that's a massive blind spot.
#2008: Needle-in-a-Haystack Testing for LLMs
New AI models claim to be genius-level, but can they actually find a specific fact in a massive document?
#2005: Why Your GPU Changes LLM Output
Running the same LLM on different GPUs can produce different results. Here’s why that happens and how to test for it.
#1994: Why Can't AI Admit When It's Guessing?
Enterprise AI now auto-filters low-confidence claims, but do these self-reported scores actually mean anything?
#1959: How Constrained AI Models Handle the Unexpected
Your AI assistant promised to only use your documents. Instead, it invented case law that doesn't exist. Here's why.
#1956: AI Skills: From Vibe Coding to Procedural Playbooks
Forget messy system prompts. Agent skills turn AI into a Swiss Army knife of modular, auditable procedures.
#1951: Moltbook: A Social Network for AI Agents
Explore Moltbook, a social network where AI agents interact with persistent identities and goals, reshaping digital communication.
#1918: MCP Schema Stability: Keeping Agents Reliable
When a third-party MCP server updates its schema, your AI agents can crash. Here's how to build resilient clients that self-heal.
#1914: Google Invented RAG's Secret Sauce
Before LLMs, Google solved the "hallucination" problem with a two-stage trick that's making a huge comeback.
#1907: Why We Still Fine-Tune in 2026
Despite million-token context windows, fine-tuning remains essential. Here’s why behavior, not just facts, matters.
#1838: Tuning Search Without Losing Your Mind
Modern search bars are AI decision engines. Here's how small teams can tune fuzzy matching, semantic search, and reranking without breaking everything.
#1817: Beyond LLMs: The Hidden World of Specialized AI
Explore the vast ecosystem of niche AI models for computer vision and document understanding, far beyond large language models.
#1812: AI Just Got a Library Card to Ancient Jewish Texts
Sefaria's new MCP server connects AI directly to 2,700 years of Jewish texts, transforming how scholars and curious learners study ancient literature.
#1804: Why Does Your Agent Check Old Receipts First?
Stop your AI agent from overthinking. Learn why it checks old memories instead of booking flights—and how to fix the "eagerness" problem.
#1794: RAG Is Cheaper Than You Think (Until It’s Not)
From a $1 embedding bill to a $10k/month vector database bill, here’s the real math behind RAG in 2026.
#1792: Google's Native Multimodal Embedding Kills the Fusion Layer
Google’s new embedding model maps text, images, audio, and video into a single vector space—cutting latency by 70%.
#1784: Context1: The Retrieval Coprocessor
Chroma's new 20B model acts as a specialized "scout" for your LLM, replacing slow, static RAG with multi-step, agentic search.
#1778: Audio Is the New "Read Later" Graveyard
Why listening to AI conversations beats reading dense PDFs, and how serverless GPUs make it cheap.
#1765: The Agentic Internet: A Clean Web for Machines
We explore the tools building a parallel, machine-readable web—from SearXNG to Tavily.
#1764: Vector Databases as a Single File
How to give AI agents instant memory of your entire project—without cloud costs or complex infrastructure.
#1754: From Ollama to Agentic CLIs: The Rise of the AI Harness
Explore the evolution from local LLMs to modern agentic CLIs, focusing on the "harness" that gives models context, tools, and autonomy.
#1737: Nous Research: The Decentralized AI Lab Beating Giants
Meet Nous Research, the decentralized collective outperforming billion-dollar labs with open-source AI and the self-improving Hermes-Agent framework.
#1731: Why Deep Research Agents Are Being Forgotten
Specialized research agents outperform general orchestrators by 40-60% on verification tasks, yet developer hype is fading. Here's why.
#1728: How Two AIs Collaborate Without Code
CAMEL AI lets two agents role-play to solve tasks autonomously. No complex code—just emergent teamwork.
#1727: LSP: The Universal AI Coding Interface
Explore how the Language Server Protocol is being repurposed to integrate AI directly into code editors, unifying development workflows.
#1725: Orchestrating AI Swarms: The New Infrastructure
Forget chatbots: AI orchestration is now the key to scaling intelligent agents in the enterprise.
#1713: Why Native AI Search Grounding Still Fails
Native search grounding is expensive and flaky. Here’s why bolt-on tools still win for accurate, real-time AI answers.
#1708: Why Your AI Agent Forgets Everything (And How to Fix It)
Learn how Letta's memory-first architecture solves the AI context bottleneck for long-term agents.
#1700: Can LLMs Learn Continuously Without Forgetting?
We explore a new approach: micro-training updates every few days to keep AI knowledge fresh without constant web searches.
#1666: Multi-Agent AI: One Model, Four Brains
Grok 4.20’s native multi-agent architecture cuts token costs by 75% and enables real-time cross-agent reasoning.
#1629: Why Your AI Agent Needs Loops: A Deep Dive into LangGraph
Stop building linear chains and start building cycles to create agents that can reason, self-correct, and maintain complex state.
#1601: Cohere: The Switzerland of Enterprise AI
While others chase viral memes, Cohere is quietly building the secure, cloud-agnostic infrastructure powering the global enterprise.
#1592: Mastering Embedding Models: From Gemini 2 to Vector Debt
Stop treating embedding models like plumbing. Learn how to navigate vector debt, multimodal retrieval, and database configuration for RAG.
#1565: Machine-Readable Safety: Markdown for AI Agents
Transform bloated government data into clean Markdown to power life-saving AI agents during emergencies.
#1482: The Multimodal Shift: Navigating the New Vector Landscape
From Matryoshka models to multimodal search, discover how the fundamental units of AI memory are being optimized for efficiency and scale.
#1212: The Postgres Vector Revolution: Killing the Sprawl
Is your tech stack a sprawling suburb of microservices? Discover why a 40-year-old database is winning the AI infrastructure war.
#1123: One Database to Rule Them All: The Future of Postgres
Can Postgres 18 finally replace the data warehouse? We dive into data gravity, columnar storage, and the physics of scaling in the AI age.
#1103: LLM Context Windows and the Great Kitchen War
Explore the mechanics of LLM context windows and attention, and witness what happens when technical debates collide with household chores.
#1100: The Truth Conflict: Why AI Ignores the Facts You Give It
Discover why AI models ignore provided documents in favor of old training data and how to build a reliable "hierarchy of truth" for RAG systems.
#995: AI vs. Mach 13: Demystifying the Iranian Missile Threat
How can AI transform dense government reports into actionable intelligence? Explore the physics of Iranian missiles and the future of OSINT.
#959: The Infinite Content Problem: AI’s War on Truth
Explore how AI is scaling disinformation to an industrial level and what the "liar's dividend" means for the future of shared reality.
#948: Can AI Search Survive the Fog of War and SEO Spam?
Explore how AI is moving from static models to real-time data and whether specialized search tools can survive the rise of the tech giants.
#869: Why Tiny Digital Savants Are Outperforming God-Models
Are massive AI models hitting a wall? Discover why the future belongs to lean, domain-specific "digital savants" and vertical pre-training.
#846: Beyond the Vector: Building Long-Standing AI Memory
Stop relying on basic vector search. Discover how Graph RAG and RAPTOR are creating AI systems with true long-standing memory.
#810: The Agentic Interview: How AI Learns to Know You
Stop dumping data. Discover how agentic interviews are transforming AI from a passive listener into a proactive, structured partner.
#809: Beyond the Prompt: The Shift to AI Context Engineering
Is prompt engineering still magic, or just plumbing? Explore why the field is shifting toward context engineering and systematic evaluation.
#755: Inside the Engine: Scaling an Automated AI Podcast
Peek under the hood of My Weird Prompts to see how Gemini, Modal, and multi-agent systems are scaling this automated show to the next level.
#752: Will AI Kill the Click? Why Search Is Becoming Invisible
Stop shouting nouns at a screen. Discover how AI is turning the "ten blue links" into a conversational assistant that understands your intent.
#665: Inside the Stack: The Hidden Layers of Every AI Prompt
Ever wonder what happens after you hit enter? Discover the hidden "stack" of instructions and memories shaping every AI response.
#539: The AI Pipeline: Scaling Curiosity and Community
Herman and Corn discuss turning 500+ episodes into an interactive knowledge base while scaling human-AI collaboration to new heights.
#171: The Rise of AIO: Optimizing Your Website for AI Bots
Stop fighting the crawlers and start feeding them. Learn how llms.txt and structured metadata are defining the new era of AI Optimization.
#144: AI Memory vs. RAG: Building Long-Term Intelligence
Explore why AI needs a "diary" and not just a "library" as we dive into the architectural differences between RAG and long-term agentic memory.
#117: From Keywords to Vectors: How AI Decodes Meaning
Why can AI write poetry but struggle to find a file? Explore the history and math of semantic understanding with Herman and Corn.
#85: Why AI Lies: The Science of Digital Hallucinations
Why do smart AI systems make up fake facts? Corn and Herman explore the "feature" of digital hallucinations and how to spot them.
#30: RAG vs. Memory: Architecting AI's Essential Toolbox
RAG vs. Memory: Are you building resilient AI? Discover the crucial difference between these two foundational pillars.