So Daniel sent us this one... he's asking how the subagent-to-orchestrator notification layer works in Claude Code and in agentic systems generally. Specifically, when a user spawns a subagent, how does the main orchestrator know exactly when it finishes so it can notify the user? He wants us to dig into the under-the-hood plumbing: message passing, task lifecycle events, completion callbacks, and how parent-child relationships actually function. He’s looking for the specifics on the Claude Code Task tool versus broader patterns like LangGraph and the Anthropic Agent SDK.
This is such a great prompt because it gets away from the magic of AI and into the actual systems engineering. Most people just see the little spinner that says "Subagent working" and then, boom, a result appears. But the coordination required to make that "boom" happen reliably is where the real complexity lives. And by the way, before we get too deep into the weeds, a quick note that today's episode is actually being powered by Google Gemini 3 Flash.
Gemini writing the script while we talk about Claude's internal organs. I love the cross-pollination. But seriously, Herman, this notification layer feels like one of those things that sounds simple until you actually try to build it. If I tell a subagent to go refactor a three-hundred-line React component, I'm essentially launching a separate process. How does the parent agent stay "attached" to that process without just sitting there staring at a blank wall?
That’s the mystery. It’s the difference between a "fire and forget" script and a managed lifecycle. In these agentic systems, the notification layer is the nervous system. If it’s laggy, the user experience falls apart. If it’s brittle, you get "zombie agents" that finish their work but never check back in with the boss.
And if you’re building a multi-agent workflow, understanding this plumbing is basically the only way to debug it when things inevitably go sideways. You need to know if the subagent actually failed, or if the notification of its success just got lost in the mail.
Precisely. It's about moving from "it works by magic" to "I can trace the execution path." We’re looking at a shift from generative chat to actual distributed computing where the model is just one component of a larger state machine.
So, let's stop staring at the spinner and actually crack open the casing. Where does the handshake start?
It starts with the "Spawn" primitive. In the world of Claude Code, that’s the Task tool.
Right, I’ve seen that in the logs. It’s often labeled as TodoV2 or just AgentTool. It’s like the orchestrator reaches into its toolbox and pulls out a smaller, more specialized version of itself.
It’s exactly that. But it isn't just a function call. When the orchestrator invokes that Task tool, it's providing a prompt and essentially spinning up a completely isolated LLM session. This subagent has its own clean context. It doesn't know about the five hundred lines of banter you just had with the main agent unless the orchestrator explicitly hands over those notes.
Which makes sense for efficiency, but it creates a massive coordination overhead. You’ve got two separate brains now. How do they stay in sync?
That’s where the lifecycle management comes in. The orchestrator doesn't just wait; it tracks state. It knows the subagent is in a "Created" state, then "Executing." The "notification" piece—the part Daniel is asking about—is actually the termination of the subagent’s internal loop. When that subagent decides it's done, it has to package its final result into a very specific structure called a ResultMessage.
So it’s not just a casual "Hey, I'm done." It’s a formal report.
It’s a full-on audit log. It includes the output, the token cost—because someone has to pay for those thoughts—and a status indicator. The orchestrator’s tool-use observation is what actually catches this return value.
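To pin that "formal report" down, here's a rough sketch of what such a result structure might look like. The field names here are illustrative stand-ins, not the SDK's actual schema.

```python
from dataclasses import dataclass

# Hypothetical shape of a subagent's final report back to the orchestrator.
# Field names are illustrative; the real SDK's schema may differ.
@dataclass
class ResultMessage:
    output: str          # the subagent's final answer
    status: str          # "success" or "error"
    input_tokens: int    # cost accounting: what the parent pays for
    output_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens

report = ResultMessage(
    output="Refactored auth module across 3 files.",
    status="success",
    input_tokens=21_000,   # spawn overhead plus the task prompt itself
    output_tokens=1_800,
)
print(report.status, report.total_tokens)  # success 22800
```

The point is that the parent never inspects the subagent's transcript—it only ever sees this one structured block.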
So the orchestrator is basically "blocking" on that tool call? Like, it can't move on until the subagent returns that ResultMessage?
In the current Anthropic Agent SDK implementation, yes, it’s largely a synchronous flow at the logic level. The orchestrator calls the tool, the subagent runs its own prompt-tool-observe loop, and only when that loop terminates does the orchestrator receive the "notification" in the form of the tool's return value.
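A toy sketch of that synchronous spawn-and-wait flow, with hypothetical function names (this simulates the loop rather than calling any real SDK):

```python
# Minimal sketch of the synchronous spawn-and-wait pattern described above.
# run_subagent and the loop internals are hypothetical stand-ins, not the
# actual Claude Code implementation.
def run_subagent(prompt: str) -> dict:
    """Run an isolated prompt-tool-observe loop to completion."""
    steps = []
    done = False
    while not done:
        # In a real agent each iteration would be an LLM call followed by
        # a tool execution; here we simulate three steps of work.
        steps.append(f"step-{len(steps) + 1}")
        done = len(steps) >= 3
    # The "notification" is simply this structured return value.
    return {"status": "success", "output": f"finished after {len(steps)} steps"}

def orchestrator(task: str) -> dict:
    # The parent blocks right here: no other work happens until the
    # subagent's loop terminates and hands back its result.
    result = run_subagent(task)
    return result

print(orchestrator("refactor the auth module"))
```

The "notification layer" in this model is nothing more exotic than the function returning.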
That feels a bit restrictive for complex tasks. What if I want the subagent to give me a heads-up halfway through? Like, "Hey, I found the bug, now I’m just writing the test."
That gets into the world of streaming updates versus final return values. While the subagent is working, it can emit what are called StreamEvents. These are like "heartbeats" or progress pings. But here's the kicker: the orchestrator usually hides those from the final context. It might show them to you, the human, in the CLI as a progress bar, but it doesn't necessarily "process" them as finished work until that final message arrives.
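That split between heartbeats and the final result can be sketched with a generator—the event names here are illustrative, not the SDK's actual types:

```python
from dataclasses import dataclass
from typing import Iterator, Union

# Hypothetical event types: progress "heartbeats" versus the one final
# result the orchestrator actually acts on.
@dataclass
class StreamEvent:
    note: str

@dataclass
class FinalResult:
    output: str

def subagent() -> Iterator[Union[StreamEvent, FinalResult]]:
    yield StreamEvent("reading files")
    yield StreamEvent("found the bug")
    yield FinalResult("patched null check in auth.ts")

ui_log, parent_context = [], []
for event in subagent():
    if isinstance(event, StreamEvent):
        ui_log.append(event.note)            # shown to the human (spinner text)
    else:
        parent_context.append(event.output)  # the only thing the parent ingests

print(ui_log)          # progress pings, never fed back into the parent
print(parent_context)  # exactly one polished result
```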
So the orchestrator is playing gatekeeper. It sees the messy "thinking" of the subagent but only tells the rest of the system the polished version.
It’s "hiding the kitchen," as we've talked about in other contexts. But it's more than that. It's about maintaining a clean state. If the orchestrator allowed every single thought from a subagent to leak into its own core memory, it would run out of context window in five minutes.
The "notification tax." Every time you spawn a subagent in Claude Code, you're looking at an overhead of something like twenty thousand tokens just for the system prompts and tool definitions. You better be sure that notification is worth the price of admission.
That’s why the plumbing matters. If you're paying twenty thousand tokens for a "spawn," you need to be damn sure the "completion callback" is robust. You don't want to pay for the overhead and then have the subagent crash without telling the parent why.
This is where we get into the difference between what's documented and what we can actually see in the logs. Anthropic's documentation mentions these lifecycle events, but the actual "handshake"—the code that says "The subagent is finished, now update the CLI spinner to a green checkmark"—that’s all buried in the private Claude Code binary.
We're essentially reverse-engineering the etiquette of AI agents. How do they say "please," "thank you," and "I'm finished with the dishes"?
And more importantly, how do they handle it when the dishes break? If a subagent encounters a terminal error, does that notification bubble up as a failure, or does the orchestrator just get a blank stare?
Usually, it’s a failure message injected back into the parent’s context. The parent then has to be "smart" enough to decide: do I retry, do I ask the human for help, or do I just give up?
It’s a parent-child relationship in the most literal, frustrating sense.
It really is. And as we look at frameworks like LangGraph, we're seeing more explicit ways to handle this, like the Command primitive, which lets a subagent explicitly say "Go back to the parent" or "Update the global state." It’s moving away from just "returning a value" to actually "controlling the flow."
We should probably look at how that compares to the more rigid structure in Claude Code. Because while Claude Code feels very "packaged," the underlying primitives in the SDK are what everyone else is going to be building on.
The Task tool is just the beginning. The real magic—or the real headache—is in how these notifications scale when you have agents spawning agents spawning agents.
Recursive orchestration. A stack of callbacks all the way down. If the great-grandchild agent fails, how does the user ever find out?
That is the technical mystery we’re going to untangle today. We’re moving from the "what" of agents to the "how" of their communication.
Alright, let’s get into the actual Task tool specifics and see how Claude Code handles the "spawn" versus the "notify."
When we talk about this notification layer, we’re really talking about the invisible string between two independent brains. In Claude Code, that string is the Task tool. Think of it as the "spawn" primitive. When the orchestrator decides it’s out of its depth or just needs a specialist, it calls this tool, providing a prompt and an effort level.
And suddenly, a wild subagent appears. But it’s not just a copy of the parent. It’s a clean slate, right? A totally isolated session.
Right, and that isolation is key. It gets its own twenty-thousand-token system prompt and its own tool definitions. The problem is, once that subagent starts its own "prompt-tool-observe" loop, the orchestrator is essentially sitting in the dark. The notification layer is what turns those lights back on. It’s the mechanism that handles the task lifecycle—creation, execution, and finally, termination.
So, what is it actually "notifying" the parent of? Is it just a "hey, I’m done" text message, or is there more metadata attached to that handshake?
It’s a full Return-Message semantic. In the Anthropic Agent SDK, which powers Claude Code, the subagent’s final output is injected back into the orchestrator’s context as a specific result block. It’s not just the text of the answer; it usually includes the token cost, the success status, and any specific data requested. By the way, fun fact—Google Gemini 3 Flash is actually writing our script today, so if we sound extra sharp, you can thank the model.
I knew I felt a bit more "optimized" today. But back to these agents—it sounds like a parent-child relationship where the parent is paying a heavy "tax" in tokens just to get a status update.
It is. And while things like LangGraph use a "Command" object to let a subagent explicitly say "update the global state," Claude Code is a bit more rigid. It’s waiting for that final "ResultMessage" to close the loop.
So, it's less of a conversation and more of a "call me when you've finished the job" situation. But how does that work when the user is watching a progress spinner? How does the CLI know to stay alive while the subagent is deep in the weeds?
That’s the gap between the public SDK and the private binary. The "handshake" that keeps the UI updated is often inferred from the execution state of the Task tool itself. It’s a sophisticated bit of plumbing that makes sure the orchestrator doesn't just time out while the subagent is busy refactoring your entire database.
So, let's walk through a real-world scenario. Say I tell Claude Code, "Hey, refactor this authentication module, it's a mess." The main orchestrator looks at that and thinks, "That's a lot of file I/O and logic checking, I'll spawn a subagent to handle the heavy lifting." It hits that Task tool, right? What happens in those first few milliseconds?
It’s a very clean hand-off. The orchestrator calls the Task tool, which is basically a specialized "spawn" primitive. It passes a prompt like "Refactor the auth module in these three files," and then the subagent is born in its own isolated container. Now, here’s the key: the orchestrator is now in a "blocked" state. In the current Anthropic Agent SDK pattern, this is a synchronous relationship. The parent isn't doing other work; it’s literally waiting for the tool execution to return a value.
Wait, so if the subagent takes three minutes to think about my terrible code, the parent is just... staring at the wall? That seems inefficient. Why not make it asynchronous?
That’s the big tradeoff. If it were asynchronous, the parent could go off and do other things, but then you have a massive state management nightmare. You’d need a complex event bus to handle "Hey, I'm halfway done," or "I found a secondary bug, what do I do?" By keeping it synchronous, the "notification" is built into the tool-use observation itself. The subagent finishes its loop, returns a final ResultMessage, and that message is what "unblocks" the parent. It’s a very stable parent-child hierarchy.
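To see why the synchronous choice is attractive, here's a toy sketch of the asynchronous alternative—the event bus and hand-rolled state reconciliation you'd take on, which a blocking return value gives you for free. Everything here is illustrative:

```python
import queue
import threading

# Toy sketch of the asynchronous alternative: the parent now needs an
# event bus, per-task state tracking, and a draining loop just to learn
# what the synchronous design learns from a plain return value.
bus: "queue.Queue[tuple[str, str]]" = queue.Queue()

def subagent(task_id: str) -> None:
    bus.put((task_id, "halfway done"))
    bus.put((task_id, "complete"))

task_states = {"task-1": "executing"}
threading.Thread(target=subagent, args=("task-1",)).start()

# The parent must keep draining the bus and reconciling state by hand.
while task_states["task-1"] != "complete":
    task_id, update = bus.get()
    task_states[task_id] = update

print(task_states)  # {'task-1': 'complete'}
```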
Okay, but I’ve seen the CLI. I see the little "Subagent working" spinner and sometimes even snippets of what it’s doing. If it’s synchronous and "blocked," how is that data leaking out to my terminal?
That’s where we distinguish between the "completion signal" and "streaming updates." While the orchestrator’s logic is waiting for the final result, the subagent is often emitting StreamEvent types. These are side-channel notifications. They don't change the parent's state, but they allow the UI—the Claude Code CLI—to show you progress. It’s like a construction crew where the manager is waiting for the "Job Done" certificate, but the workers are occasionally shouting updates over the fence so the owner doesn't get nervous.
I like that. But what happens if the worker trips over a wire? If the subagent hits a wall or hallucinates an API that doesn't exist, how does that "partial failure" bubble back up? Does the whole thing just crash?
Not usually. The subagent has its own error-handling loop. If it fails, it returns a ResultMessage that essentially says, "Task failed, here’s why." The orchestrator then receives that as a tool output. It’s not a system crash; it’s a piece of data that says "I couldn't do it." The orchestrator then has to decide: do I try again, do I ask the user for help, or do I try a different tool? That "return-message semantic" is the only way the parent knows the child failed. It’s all about how that final block of data is structured.
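A sketch of that decision point—the failure arrives as data, and the parent picks a recovery strategy. The message shape and retry policy here are illustrative, not the actual Claude Code behavior:

```python
# Sketch of a parent deciding what to do with a failure result.
# The message shape and policy here are illustrative.
def handle_result(result: dict, attempt: int, max_retries: int = 2) -> str:
    if result["status"] == "success":
        return "accept"
    # The failure is a piece of data, not an exception: the parent reads
    # the reason and chooses a recovery strategy.
    if attempt < max_retries:
        return "retry"
    if "needs input" in result.get("reason", ""):
        return "ask_human"
    return "give_up"

failure = {"status": "error", "reason": "called a nonexistent API"}
print(handle_result(failure, attempt=0))  # retry
print(handle_result(failure, attempt=2))  # give_up
```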
That "return-message" as the only signal is kind of brutal, though. It’s a very binary way to run a team. If the child agent crashes or hangs, the parent is just sitting there in the dark. Is that how other frameworks handle it? Because I’ve been looking at LangGraph lately, and it feels like they’re trying to build a much more robust "middle management" layer for these notifications.
You're hitting on a massive shift in the industry. LangGraph actually just rolled out this Command primitive specifically to address this "black box" subagent problem. Instead of just waiting for a final return string, a subagent node in LangGraph can issue a Command(goto="parent") or even a Command(update={...}). It’s essentially a structured "push" notification that can modify the global state of the entire graph while the subagent is still active.
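Here's a simplified imitation of that Command pattern—a subagent node returns a control object instead of a bare string, telling the graph where to go and what shared state to patch. This is a toy built from scratch, not the real langgraph API:

```python
from dataclasses import dataclass, field

# Simplified imitation of LangGraph's Command idea: a node returns a
# control object that both redirects flow and patches global state.
# This is a toy, not the actual langgraph library.
@dataclass
class Command:
    goto: str
    update: dict = field(default_factory=dict)

def subagent_node(state: dict) -> Command:
    # Push a structured update into shared state AND hand control back.
    return Command(goto="parent", update={"bug_found": "auth.ts:42"})

def run_graph(start: str, state: dict) -> dict:
    nodes = {"subagent": subagent_node, "parent": lambda s: Command(goto="END")}
    current = start
    while current != "END":
        cmd = nodes[current](state)
        state.update(cmd.update)   # the "push notification" into shared state
        current = cmd.goto
    return state

print(run_graph("subagent", {}))  # {'bug_found': 'auth.ts:42'}
```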
So it’s less "wait for the report" and more "update the shared spreadsheet in real-time." That seems way more flexible for debugging. If a subagent in LangGraph hits a snag, can it trigger a human-in-the-loop interrupt without killing the whole process?
Not exactly, but it enables that "interrupt and resume" flow. The parent doesn't have to be "blocked" in the same synchronous way Claude Code appears to be. In LangGraph, the notification layer is basically a state machine transition. If a subagent times out or hits an edge case, the framework can catch that event at the graph level and route it to a "failure handler" node. It’s much more like traditional distributed systems programming.
And what about the Anthropic Agent SDK? Since that’s the DNA Claude Code is built on, surely there’s some public documentation on how these "Task" events are supposed to look?
This is where we have to be honest about the "inferred versus documented" split. If you look at the Anthropic Agent SDK docs, they talk a lot about "task lifecycle events"—things like TaskCreate, TaskUpdate, and TaskComplete. But the actual "plumbing"—the JSON-RPC handshake or the internal event bus that tells the CLI to spin that little loading icon—that’s still largely tucked away in the private logic of the Claude Code binary. We can see the Task tool being called in the logs, and we see the ResultMessage coming back, but the "glue" is proprietary.
It’s the "trust me, I’m working" model of orchestration. But that creates a real observability headache, doesn't it? If I’m building a complex system using MCP—the Model Context Protocol—how do these agents talk to each other there? Is it a different animal entirely?
MCP is fascinating because it’s primarily designed for "vertical" communication—agent to tool. But we’re seeing "horizontal" agent-to-agent patterns emerge where a subagent is essentially wrapped as an MCP "Resource." In that world, the notification is just a standard JSON-RPC response. The orchestrator treats the subagent like a database; it sends a request and waits for the response object. It’s clean, but it lacks that deep lifecycle tracking you get in a dedicated framework like LangGraph.
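In that "subagent as resource" world, the notification really is just a JSON-RPC 2.0 response object. A minimal sketch, with a hypothetical method name:

```python
import json

# Sketch of treating a subagent like an MCP-style resource: the parent
# sends a JSON-RPC 2.0 request and gets back a plain response object.
# The "agent/run" method name and its params are illustrative.
def call_subagent(request_json: str) -> str:
    request = json.loads(request_json)
    output = f"handled: {request['params']['task']}"
    response = {"jsonrpc": "2.0", "id": request["id"], "result": {"output": output}}
    return json.dumps(response)

request = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "agent/run",          # hypothetical method name
    "params": {"task": "refactor auth module"},
})
response = json.loads(call_subagent(request))
print(response["result"]["output"])
```

The `id` field is what ties the completion back to the original request—the entire "lifecycle" collapses into request/response matching.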
So we have this spectrum. On one end, you’ve got the highly structured, state-machine approach of LangGraph, and on the other, you’ve got the "spawn and wait" simplicity of Claude Code’s Task tool. But both of them have to deal with the "Context Leak" problem, right? I’ve seen some weirdness where the subagent’s internal "thoughts" start showing up in the parent’s output stream. It’s like the child is talking to themselves and the parent is accidentally repeating it to the user.
That is a classic notification layer failure! It usually happens when the streaming logic isn't properly namespaced. If the orchestrator is just piping "all stdout" to the terminal, and the subagent is emitting its chain-of-thought, the user gets this messy "leak." High-quality systems have to strictly separate the "internal notification" channel from the "user-facing" channel. If you don't, you end up with a UI that looks like a terminal from a 1980s sci-fi movie—just scrolling gibberish that no one asked for.
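The fix for that leak is strict namespacing. A toy sketch: tag every event with a session and channel, and route so that internal chain-of-thought never reaches the user-facing stream (the event shape here is invented for illustration):

```python
# Sketch of namespacing output so a subagent's chain-of-thought never
# leaks into the user-facing stream. Events carry a session id and a
# channel tag; the router keeps internal chatter out of the terminal.
events = [
    {"session": "sub-1", "channel": "internal", "text": "hmm, maybe regex?"},
    {"session": "sub-1", "channel": "user", "text": "Refactoring auth.ts..."},
    {"session": "sub-1", "channel": "internal", "text": "no, walk the AST"},
    {"session": "sub-1", "channel": "user", "text": "Done: 3 files updated."},
]

def route(events):
    user_facing, internal_log = [], []
    for e in events:
        # Strict separation: only the "user" channel reaches the terminal.
        (user_facing if e["channel"] == "user" else internal_log).append(e["text"])
    return user_facing, internal_log

user_facing, internal_log = route(events)
print(user_facing)  # ['Refactoring auth.ts...', 'Done: 3 files updated.']
```

If you're piping raw stdout instead of routing like this, you get exactly the 1980s-terminal gibberish described above.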
Which is exactly why, if you're out there building these multi-agent workflows right now, the first thing you need to do when things go sideways is trace that notification layer. It is almost always the source of those "silent failures" where the subagent finishes its job, but the orchestrator just sits there spinning its wheels because it missed the completion signal. Or worse, it receives a success signal but the payload is empty.
That’s a great point. In these agentic systems, the "handshake" is the weakest link. If you're debugging, don't just look at the LLM prompts; look at the tool-use logs for the Task tool or whatever dispatch primitive you're using. Check the return-message semantics. Is the subagent returning a valid JSON object that the parent actually knows how to parse? If the subagent decides to get creative and returns a conversational "I'm done!" instead of the structured data the orchestrator expects, the whole notification chain breaks down.
It’s like a relay race where the first runner reaches the line but forgets to actually hand off the baton. The second runner is just standing there looking confused. And to avoid that, you really have to design your workflows with explicit completion callbacks. Don't just "hope" the streaming output ends naturally. You want a hard signal—a specific tool call like "task_complete" or a "Command" object in LangGraph—that explicitly tells the parent, "I am exiting now, and here is my final state." This prevents those nasty race conditions where the parent tries to move on before the child has finished writing to the database.
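That "hard signal" discipline can be sketched like so—the parent refuses to advance until it sees an explicit completion event, and a conversational "I'm done!" is treated as a failure, not a handoff. Names are illustrative:

```python
# Sketch of a hard completion signal: the subagent must emit an explicit
# task_complete event carrying its final state, and the parent refuses
# to advance without one. Event names are illustrative.
def parent_waits(events: list) -> dict:
    for event in events:
        if event.get("type") == "task_complete":
            return event["final_state"]
    # Stream ended without the hard signal: fail loudly rather than
    # guessing that trailing text meant "done".
    raise RuntimeError("subagent stream ended without task_complete")

good = [{"type": "text", "data": "writing to db..."},
        {"type": "task_complete", "final_state": {"rows_written": 42}}]
bad = [{"type": "text", "data": "I'm done!"}]  # conversational, not structured

print(parent_waits(good))  # {'rows_written': 42}
try:
    parent_waits(bad)
except RuntimeError as e:
    print("caught:", e)
```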
It’s defensive programming for agents. If you want to see this in the wild, I highly recommend opening up Claude Code and running a complex command with the "verbose" flag or just inspecting the logs. You can actually see the Task tool invocation. Watch how the orchestrator waits, how the subagent session is isolated with that twenty-thousand-token overhead we mentioned, and then look for that final ResultMessage.
Seeing the "Subagent working" spinner in the UI is one thing, but seeing the actual JSON-RPC handshake in the terminal is where the real "aha moment" happens. You start to realize that "agentic" isn't some magic spell—it's just a very sophisticated set of nested loops and callback functions.
And once you see the plumbing, you can start to optimize it. You might realize you're paying that "notification tax" too often for tiny tasks that the main orchestrator could have handled in its own context. It’s all about finding that balance between delegation and overhead.
It’s the classic engineering trade-off. Do you want the clean separation of a subagent, or do you want to save the twenty-thousand-token cover charge? As these things scale, I suspect we’re going to see much more sophisticated "handshake" protocols. Right now, it feels a bit like we’re in the early days of networking where everyone had their own proprietary way of plugging things together.
That is exactly where the Model Context Protocol, or MCP, comes in. While it started mostly as a way for agents to talk to tools and data—what we call vertical communication—it’s clearly the foundation for horizontal, agent-to-agent standards. If every agentic framework agrees on a single JSON-RPC structure for "Task Created," "Task Progress," and "Task Complete," then the plumbing becomes invisible. You could have a Claude Code orchestrator spawning a subagent running a totally different model, and the notification layer would just... work.
A world where agents from different companies can actually coordinate without tripping over each other's return-message semantics? That sounds like a dream, or a nightmare, depending on how much you trust the "agentic" future. But for now, we’re still peeking through the floorboards at the pipes. It’s fascinating to see how Anthropic and LangChain are basically inventing these primitives in real-time.
It’s the wild west of orchestration. And honestly, that’s the fun part. We get to watch the "Command" objects and "Task" tools evolve into the standard library of the future.
Well, if you’re brave enough to go poking around in those logs, let us know what you find. Especially if you catch an orchestrator and a subagent having a "disagreement" about whether a task is actually done.
Thanks to our producer, Hilbert Flumingtop, for keeping our own notification layer running smoothly.
And a big thanks to Modal for providing the GPU credits that keep this whole operation powered up. This has been My Weird Prompts. If you’re enjoying the deep dives, leave us a review on Apple Podcasts—it actually helps the algorithm find more humans who like listening to a sloth and a donkey talk about JSON-RPC.
See you next time.
Take it easy.