Herman, I was looking at a pull request yesterday for a new internal tool we are building, and I had this overwhelming sense of déjà vu. I realized the developer had written the logic for the core function, then wrote a FastAPI endpoint for the web front end, and then, right below it, wrote an almost identical Model Context Protocol tool definition so the A I assistant could actually use it. It felt like watching someone translate a book from English to French, only to immediately translate it back into English for a different reader who speaks the same language. It is this bizarre double maintenance cycle that seems to be the tax we are all paying right now to live in the agentic age. It is not just a minor annoyance; it feels like a fundamental inefficiency in how we are constructing the modern stack.
Herman Poppleberry here, and you have hit on my biggest obsession of the month. It is the dual track problem. We are essentially building mirror worlds. One world is for the legacy human interface, which is basically what a standard web application is, and the other is for the model. Today's prompt from Daniel is about exactly this friction. He is asking about the convergence of Application Programming Interfaces and the Model Context Protocol into a single, unified backend architecture. Daniel is seeing the same thing we are, this trend toward A I first re-architecting where we stop treating the A I as a second class citizen that needs its own special scaffolding and start treating it as the primary consumer of our code. We are moving away from the era where the A I was an afterthought, a little sidecar you bolted on, and toward a world where the code is written for the agent first, and the human interface is almost a secondary projection of that underlying capability.
It is funny because we have spent the last decade shouting about being A P I first. We told everyone to build the backend as a standalone service so the mobile app and the web app could use the same logic. Now it feels like we are being told that being A P I first was just the halfway house. The real destination is being agent first. But Daniel mentioned Google’s Web M C P as a sign that this convergence is already happening at the browser level. Before we get into the heavy architectural stuff, how much of this is just a temporary bridge? Are we building these separate layers because the tools are immature, or is there a fundamental difference in how a human app and an A I agent need to talk to a server? I mean, at the end of the day, they both just want data, right?
That is the multi billion dollar question, Corn. Right now, the bifurcation is pretty stark. On one hand, you have standard H T T P and J S O N patterns, often R E S T or Graph Q L, which are designed for request and response cycles. They are stateless, they are predictable, and they are optimized for a human clicking a button and waiting for a U I to update. On the other hand, you have the Model Context Protocol, which uses J S O N R P C and is designed to be stateful and bidirectional. The reason we are hitting a wall with the current scaffolding overhead is that these two patterns do not naturally overlap. When you build a R E S T A P I, you are thinking about endpoints. You are thinking about nouns and verbs. When you build an M C P tool, you are thinking about capabilities. You are thinking about what the agent can actually achieve. The January twenty twenty-six update to the M C P specification actually introduced much better support for dynamic tool discovery, which was a huge step toward reducing those hard coded manifests we used to have to write. But even with that, the industry benchmarks suggest that maintaining that separate layer adds about fifteen to twenty percent overhead to every single feature you develop. That is a massive drain on velocity.
Fifteen to twenty percent is a massive tax. That is essentially one day out of every work week spent just building the plumbing for the A I to understand what you already built for the human. If I am a C T O, I am looking at that and thinking we need to merge these tracks immediately. I cannot justify paying a twenty percent premium on every feature just to keep the A I in the loop. But I wonder about the technical reality of that. If we move to a unified access point, what does that actually look like in the code? Do we just stop writing R E S T endpoints and start writing everything in M C P? Because that seems like a huge leap for teams that are comfortable with their current toolsets.
I think the shift is more about the source of truth than the specific transport protocol. In a unified backend, your schema becomes the absolute law. We have talked before about schema driven development, but this takes it to a new level. Imagine you define a function in a language like Python or TypeScript. Instead of manually wrapping that in a Fast A P I route and then manually wrapping it in an M C P tool definition, you use a framework that treats the function signature and the documentation strings as the primary interface. The framework then exports a single unified access point. If a browser hits it, it acts like a traditional A P I. If an L L M agent hits it, it presents itself as an M C P tool with all the necessary context and metadata. This is where Google’s Web M C P comes in. It is basically saying that the browser should be able to natively understand these tool definitions. If the browser can talk M C P, then the distinction between a web app and an A I tool starts to disappear. The browser becomes a host for these capabilities.
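To make that concrete, here is a back-of-the-napkin sketch of the single-source-of-truth idea Herman is describing, using nothing but the Python standard library. The function, the type mapping, and the route convention are all invented for illustration; a real framework would do far more, but the shape is the point: one plain function, and both the human-facing route and the agent-facing tool definition are derived from its signature and docstring.

```python
import inspect
import json
from typing import get_type_hints

def get_weather(city: str, units: str = "metric") -> dict:
    """Return the current weather for a city."""
    # Real logic would live here; this stub is for illustration only.
    return {"city": city, "units": units, "temp": 21}

# Minimal Python-type to JSON-Schema-type mapping (illustrative).
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean", dict: "object"}

def describe(fn):
    """Build one tool description from the function itself."""
    sig = inspect.signature(fn)
    hints = get_type_hints(fn)
    props = {
        name: {"type": PY_TO_JSON.get(hints.get(name, str), "string")}
        for name in sig.parameters
    }
    required = [n for n, p in sig.parameters.items()
                if p.default is inspect.Parameter.empty]
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "inputSchema": {"type": "object", "properties": props, "required": required},
    }

def as_route(fn):
    """Project the same function as a REST-ish route path."""
    return "/" + fn.__name__.replace("_", "/")

print(as_route(get_weather))                      # /get/weather
print(json.dumps(describe(get_weather), indent=2))
```

The same `get_weather` function backs both projections, so a change to its signature shows up in the route and the tool schema at once.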
So the developer just writes the logic once, and the transport layer figures out who is asking. That sounds great in theory, but isn't there a semantic mapping problem here? A R E S T endpoint might be called get user slash id, but an agent needs to know that this tool is specifically for retrieving profile information, and it might need to know the constraints or the side effects in a way a web browser does not care about. A human knows that clicking a delete button will delete something because of the U I context. An agent needs that context explicitly defined in the schema. How do we automate the context part without it becoming just another form of manual scaffolding? We are trying to get away from manual work, not just move it to a different file.
This is where it gets really interesting, and honestly, where it gets a bit fuzzy for me too. The semantic mapping is not just about the name of the function. It is about the intent. In a traditional A P I, the documentation is for the human developer. It is a guide on how to use the tool. In an A I first backend, the documentation is part of the execution. It is the instruction set for the model. We are moving away from the manual hacks we discussed back in episode eleven twenty, where we were basically duct taping these things together with custom prompts and hard coded descriptions. Now, we are seeing tools that can auto-generate the M C P manifest directly from the Open A P I or Async A P I definitions. But you are right, a direct one-to-one mapping is often insufficient. An agent needs to know the why, not just the how. We are starting to see the rise of semantic gateways that sit in front of the backend and translate the raw A P I schema into something agent-consumable on the fly, using a small, fast model to bridge that gap. These gateways can look at a standard R E S T endpoint and say, okay, based on the field names and the types, this is clearly a tool for updating user preferences, and here is the context an agent would need to use it safely.
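The mechanical half of what Herman describes, the one-to-one translation from an Open A P I spec into tool entries, can be sketched in a few lines. The spec fragment below is invented, and the output field names only loosely follow the published M C P shape; the point is how little information a bare endpoint carries, which is exactly the gap the semantic layer has to fill.

```python
import json

# Invented fragment of an OpenAPI spec for illustration.
openapi_fragment = {
    "/users/{id}/preferences": {
        "put": {
            "operationId": "updateUserPreferences",
            "summary": "Update a user's saved preferences.",
            "parameters": [
                {"name": "id", "in": "path", "required": True,
                 "schema": {"type": "string"}},
                {"name": "theme", "in": "query", "required": False,
                 "schema": {"type": "string"}},
            ],
        }
    }
}

def openapi_to_tools(spec: dict) -> list:
    """Flatten each (path, method) pair into one agent-consumable tool entry."""
    tools = []
    for path, methods in spec.items():
        for method, op in methods.items():
            props, required = {}, []
            for p in op.get("parameters", []):
                props[p["name"]] = {"type": p["schema"]["type"]}
                if p.get("required"):
                    required.append(p["name"])
            tools.append({
                "name": op["operationId"],
                "description": f"{op['summary']} ({method.upper()} {path})",
                "inputSchema": {"type": "object",
                                "properties": props, "required": required},
            })
    return tools

print(json.dumps(openapi_to_tools(openapi_fragment), indent=2))
```

Notice that the generated description only knows the summary and the verb; the "why" and the side effects Herman mentions are nowhere in the spec, which is what the semantic gateway would have to infer or annotate.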
Wait, so you are saying we might have an A I sitting between the agent and the A P I just to explain how the A P I works? That feels like adding even more layers to the cake. It is like hiring a translator to talk to another translator. Why not just build the A P I to be agent-readable from the jump? If we know the future is agentic, why are we still designing for the twenty-ten era of web development? Why are we still thinking in terms of endpoints at all?
Because we still have humans, Corn! We cannot just break the web for everyone who is not an A I. We have billions of lines of code and millions of users who rely on traditional web interfaces. But I think Daniel’s point about Google’s Web M C P is the missing link here. Web M C P is fascinating because it allows the browser itself to expose tools natively. It bypasses that traditional middleware. If the browser is the one saying, hey, I have these capabilities, and the backend is providing a unified schema that the browser understands, the friction starts to evaporate. We are moving toward a world where the transport layer is agnostic. Whether it is a human browser or an L L M agent, they are both hitting the same unified backend that speaks both languages simultaneously. Think of it like a universal translator that is built into the foundation of the building rather than something you have to carry around with you.
I remember we talked about the restart tax in episode ten seventy-six, where the constant need to refresh and re-initialize these connections was killing performance. If we move to this unified architecture, does that solve the plumbing bloat, or does it just hide it? Because if I am still running a stateful J S O N R P C connection for the agent and a stateless H T T P connection for the app, my server is still doing twice the work, even if I only wrote the code once. We are still managing two different types of traffic patterns on the same infrastructure.
The server load is definitely a consideration, but the real cost is the developer cognitive load. The goal of a unified access point is to ensure that when you update your business logic, the change propagates everywhere instantly. Right now, the biggest risk is drift. You update the A P I to add a new field, but you forget to update the M C P tool definition, and suddenly your A I agent is hallucinating parameters that do not exist anymore because its map of the world is out of date. A unified backend eliminates that drift because there is only one map. And as for the performance, we are seeing a shift toward agent gateways that handle the stateful nature of M C P so your core backend can stay relatively lean. These gateways are becoming the new load balancers. They handle the context, they handle the session state, and they handle the rate limiting, which is a huge issue for agents. They act as a buffer between the high-frequency, high-intent world of the agent and the more traditional world of the backend server.
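The drift problem Herman describes is easy to guard against mechanically once the function is the source of truth. Here is a hypothetical continuous-integration-style check, with invented names, that regenerates the parameter list from the live function and diffs it against a shipped manifest:

```python
import inspect

def update_profile(user_id: str, display_name: str, bio: str = "") -> dict:
    """Update a user's public profile."""
    return {"ok": True}

# A stale hand-written manifest: someone added "bio" to the function
# but forgot to update this file. (Both are invented for illustration.)
shipped_manifest = {"name": "update_profile",
                    "params": ["user_id", "display_name"]}

def current_params(fn) -> list:
    """Read the real parameter list off the live function."""
    return list(inspect.signature(fn).parameters)

def drifted(fn, manifest: dict) -> list:
    """Return every parameter on which the manifest and function disagree."""
    live, shipped = current_params(fn), manifest["params"]
    return sorted(set(live) ^ set(shipped))

print(drifted(update_profile, shipped_manifest))  # ['bio']
```

In a unified backend this check becomes unnecessary, because there is no second artifact to fall out of date; until then, failing the build on a non-empty diff is a cheap insurance policy.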
Let's talk about that rate limiting and security for a second. This is where I get worried. If I open up a unified access point, a human user might click a button once every few seconds. An A I agent might try to call that same tool fifty times in a second to explore a problem space or brute force a solution. If we merge the architectures, don't we run the risk of an agentic attack or just plain old resource exhaustion that we would have caught if we kept the tracks separate? How do you secure a unified backend when the two types of users have such radically different behavior patterns?
That is where the agent gateway becomes essential. You cannot use traditional rate limiting for agents. If you limit an agent to five requests per minute, you basically break its ability to think. It needs to be able to iterate quickly. You need context aware routing. The gateway needs to understand that these fifty calls are part of a single logical task and manage the resource allocation accordingly. It is a completely different way of thinking about traffic. In the old world, we looked at I P addresses and tokens. In the new world, we are looking at session intents. We are also seeing a shift in observability. Traditional logging just tells you that a request happened. It gives you a status code and a timestamp. Agentic logging needs to tell you why the request happened and what the agent was trying to achieve. It needs to capture the chain of thought that led to that specific tool call. If you have a unified backend, you can correlate those two things much more easily. You can see that a human user asked a question, which triggered an agentic sub-task, which called three different tools, all within the same unified observability pipeline.
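The "session intent" idea can be sketched as a tiny gateway that budgets tool calls per logical task instead of per request. The class, the budget numbers, and the task identifiers are all made up; the point is that one agent task bursting fifty calls is admitted as a single unit of work rather than punished as fifty unrelated clients.

```python
from collections import defaultdict

class AgentGateway:
    """Toy gateway: rate limit by task budget, not by request frequency."""

    def __init__(self, calls_per_task: int = 100):
        self.calls_per_task = calls_per_task
        self.usage = defaultdict(int)  # task_id -> calls consumed so far

    def allow(self, task_id: str) -> bool:
        """Admit a tool call if the owning task still has budget left."""
        if self.usage[task_id] >= self.calls_per_task:
            return False
        self.usage[task_id] += 1
        return True

gw = AgentGateway(calls_per_task=50)
burst = [gw.allow("task-abc") for _ in range(50)]  # one task, fifty rapid calls
print(all(burst), gw.allow("task-abc"))            # the fifty-first is refused
```

A real gateway would also expire budgets, weigh expensive tools more heavily, and tie the task identifier back to the observability pipeline Herman mentions, but the unit of accounting, the task rather than the request, is the shift.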
It sounds like we are witnessing the death of the A P I gateway as we know it. If the future is this unified access point, then the old school gateways that just check for a valid header and pass the J S O N along are basically dinosaurs. They are not smart enough for the agentic internet. They are like security guards who only check your I D but don't care that you are carrying a sledgehammer into the building. But what about all the legacy systems? Most of the world is still running on R E S T A P I s that were written five or ten years ago. Can we actually retrofit those into this unified M C P world, or are we looking at a decade of rewriting everything from scratch?
You do not necessarily have to rewrite the core logic, but you do have to add a translation layer. There are already some really cool open source projects that can crawl a legacy Open A P I spec and spin up a standards compliant M C P server in minutes. It is not as good as a native agent-first backend, but it gets you eighty percent of the way there. The real challenge is the twenty percent of cases where the A P I requires a specific sequence of actions that a human understands implicitly but an agent gets lost in. For example, a multi-step checkout process that relies on session cookies and specific redirects. An agent might struggle with that if it is just looking at raw endpoints. That is where we see the need for those semantic mapping files I mentioned earlier. But to Daniel's question, I do think the standalone M C P layer is a temporary phase. Within a few years, if your backend framework does not automatically handle tool exposure as a first class feature, it will be considered obsolete. We will look back at manually writing M C P manifests the same way we look back at manually writing W S D L files for S O A P. It will just be seen as unnecessary busywork.
It is a bit like how we used to have separate mobile sites, like m dot website dot com, and then we realized that was a terrible idea and moved to responsive design. We are in the m dot era of A I. We have the main site for humans and then we have this weird, stripped down sidecar for the agents. Eventually, responsive backend design will just be the standard. You build one backend that responds appropriately to whatever is calling it. If a mobile app calls it, it gets the data it needs for the U I. If an agent calls it, it gets the tool definition and the context it needs to execute a task.
I love that analogy. Responsive backend design is the perfect way to frame it. And just like responsive web design required us to think in terms of grids and flexible elements rather than fixed pixels, this requires us to think in terms of capabilities and intents rather than fixed endpoints. We have to stop thinking about what the data looks like and start thinking about what the data can do. What I find wild is how quickly the tooling is catching up. I was looking at a framework last week that uses Pydantic models to define the data structures and then uses those same models to generate the tool definitions for Anthropic's Claude, the function calls for Gemini, and the M C P manifests for everything else. It is all happening in the background. The developer just writes a clean, well typed function with a good docstring, and the framework handles the rest. That is the future of development.
So if I am a developer listening to this and I am currently in the middle of a sprint, building out some new features, what should I actually do? Should I stop building my M C P layer and wait for these unified frameworks to mature, or is there a way to start moving toward this unified architecture right now without throwing away my current work? I don't want to build something today that I have to delete in six months.
Do not stop, but definitely change your approach. The biggest piece of advice is to be schema first. If your schema is not the absolute source of truth for both your A P I and your M C P tools, you are building technical debt that will be very painful to pay off later. Use tools that support dual exporting. If you are in the Python ecosystem, look for Pydantic to M C P converters. If you are in TypeScript, look at how you can leverage your TypeBox or Zod schemas to generate these manifests. The goal is to eliminate any manual editing of tool descriptions. If you find yourself typing a description of a function in a J S O N file that is separate from the function itself, you are doing it wrong. That description should be in the code, as close to the logic as possible, and the system should pull it out automatically. This ensures that your documentation and your implementation never diverge.
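Herman's "description lives in the code" rule can be shown with a plain dataclass standing in for a Pydantic model; the exporter here is a hypothetical stand-in for the dual-export converters he mentions, not a real library. The class's docstring and field types are the only inputs, so the manifest can never diverge from the code.

```python
import inspect
import json
from dataclasses import MISSING, dataclass, fields
from typing import get_type_hints

@dataclass
class CreateInvoice:
    """Create and send an invoice to a customer."""
    customer_id: str
    amount_cents: int
    currency: str = "USD"

# Minimal type mapping for the sketch.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def export_tool(model: type) -> dict:
    """Derive a tool definition straight from the class; nothing hand-written."""
    hints = get_type_hints(model)
    props = {f.name: {"type": JSON_TYPES[hints[f.name]]} for f in fields(model)}
    required = [f.name for f in fields(model)
                if f.default is MISSING and f.default_factory is MISSING]
    return {
        "name": model.__name__,
        "description": inspect.getdoc(model),
        "inputSchema": {"type": "object", "properties": props, "required": required},
    }

print(json.dumps(export_tool(CreateInvoice), indent=2))
```

Fields with defaults fall out of the required list automatically, which is exactly the kind of detail that rots first when the description lives in a separate J S O N file.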
It also seems like a good time to start looking at those agent gateways. Even if you are still running separate tracks, putting a gateway in front can help you start collecting the kind of data you will need when you eventually unify them. You need to see how agents are actually interacting with your tools versus how humans are using your app. Are they getting stuck? Are they calling tools in an order you didn't expect? That data is gold for when you finally sit down to design that unified access point. It gives you a roadmap for what capabilities are actually important to the agents.
And do not ignore the January twenty twenty-six update to the M C P spec. The dynamic discovery features are a game changer. It means you do not have to know every tool an agent might need upfront. Your server can say, here is what I can do right now, based on the current context of the conversation. It can even suggest tools that the agent didn't know existed. That kind of fluidity is impossible in the old R E S T model where everything is static. It is why the convergence is inevitable. The old model is just too rigid for the way models actually work. We are moving from a world of static maps to a world of dynamic navigation.
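The dynamic-discovery idea reduces to the server computing its tool list from session state every time it is asked, rather than shipping a static manifest. The context flags and tool names below are invented for illustration and are not taken from the M C P specification:

```python
# Each tool declares the context it needs before it is usable.
TOOLS = {
    "search_catalog": {"needs": set()},
    "add_to_cart":    {"needs": {"authenticated"}},
    "checkout":       {"needs": {"authenticated", "cart_nonempty"}},
}

def list_tools(context: set) -> list:
    """Advertise only the tools whose requirements the session satisfies."""
    return [name for name, meta in TOOLS.items() if meta["needs"] <= context]

print(list_tools(set()))                                  # anonymous session
print(list_tools({"authenticated"}))                      # signed in
print(list_tools({"authenticated", "cart_nonempty"}))     # ready to check out
```

As the session accumulates state, the advertised surface grows, which is the "dynamic navigation" framing: the agent never sees a tool it cannot yet use, instead of a static map full of dead ends.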
It feels like we are moving toward a universal interface. We spent decades trying to get computers to talk to each other using very strict, very brittle protocols. We had to agree on every single bit and byte. Now, we are basically saying, let's just give them a general description of what is possible and let the A I figure out the best way to navigate it. It is a much more organic way of building software, but it also feels a bit terrifying because we are giving up that granular control over exactly how our A P I s are consumed. We are trusting the agent to be a good citizen of our system.
It is a trade off, for sure. You give up the predictability of a fixed endpoint, but you gain the flexibility of a system that can adapt to needs you did not even anticipate when you wrote the code. That is the heart of the agentic internet. We are not just building tools anymore; we are building environments for agents to inhabit. And in that world, a unified backend isn't just a convenience; it is a requirement for survival. If your system is too hard for an agent to understand, if it requires too much custom scaffolding or has too much friction, the agent will just go somewhere else. It will find a competitor whose tools are easier to use.
It is the new S E O. If you aren't agent-readable, you don't exist. You can have the best service in the world, the best data, the best logic, but if the agent that is tasked with finding a solution for a user can't figure out how to call your A P I, you are invisible. That is a pretty powerful incentive for companies to get this right. It is not just about developer efficiency; it is about market reach in an age where agents are the primary interface for discovery and execution.
It really is. We are going to look back at this period where we maintained separate A P I and M C P layers as this incredibly quaint, inefficient era. It will be like explaining to someone why we used to have to carry a separate camera, a separate phone, and a separate device for our music. It all converges eventually. The technology always moves toward the point of least friction. The unified backend is the logical conclusion of the last twenty years of web development.
Well, I think we have thoroughly explored the plumbing. It is not the most glamorous part of the A I revolution, but it is definitely the most important if we want any of this stuff to actually work at scale. We have to get the foundations right before we can build the skyscrapers. Daniel, thanks for the prompt. It really forced us to look at where the actual work is happening in the trenches right now.
It was a great one. I could talk about schema mapping and agentic gateways for another three hours, but I think we have covered the essentials. The shift from manual hacks to standardized, unified backends is the big story of twenty twenty-six. It is the year the plumbing finally caught up to the vision.
Before we wrap up, I want to give a big thanks to our producer, Hilbert Flumingtop, for keeping the gears turning behind the scenes and making sure we don't wander too far off into the weeds. And a huge thank you to Modal for providing the G P U credits that power this show. They are doing some incredible work in the serverless infrastructure space, which, funnily enough, is exactly the kind of environment where these unified agentic backends are going to thrive. They provide the flexibility and scale that this new architecture demands.
If you found this useful and you want to dive deeper into our archive, you can find all eleven hundred and eighty-seven previous episodes at myweirdprompts dot com. We have covered everything from the early days of the Model Context Protocol to the latest developments in agentic gateways and beyond.
And if you want to make sure you never miss an episode, search for My Weird Prompts on Telegram. We post there as soon as a new show drops, along with links to the papers and tools we discuss. This has been My Weird Prompts. We will be back soon with more of your questions and our deep dives into the weird, wonderful world of A I.
Catch you next time.
Goodbye.