Hey everyone, welcome back to My Weird Prompts. I am Corn, and I have been thinking all morning about how much the internet has changed just in the last few months. It feels like we are standing on the edge of a massive shift in how we actually use the web. It is not just about faster speeds or better graphics anymore. We are talking about a fundamental change in the relationship between the user, the browser, and the website itself.
Herman Poppleberry here, and you are not wrong, Corn. We are seeing the infrastructure of the digital world being rewritten in real time. It is like we spent thirty years building a library for people who like to browse the stacks, and suddenly, everyone is sending robots to do their research for them. Today is going to be a deep one because our friend Daniel sent us a prompt that gets right into the gears of this transformation. Daniel is a regular listener who works in tech communications, and he has a knack for spotting the seismic shifts before they hit the mainstream.
Yeah, Daniel is asking us about the emerging agentic internet and specifically Google's recent announcement regarding Web M C P, or the Web Model Context Protocol. It is a browser-level standard designed to let A I agents interact with websites through structured interfaces rather than just looking at the screen like a human would. For those who haven't been following the developer blogs, this is essentially Google saying that the old way of agents "surfing" the web is dead.
This is a huge deal. For the last year or so, we have seen agents trying to navigate the web by basically taking screenshots and guessing where to click. We call it vision-based navigation, and while it is impressive that a model can "see" a button, it is a terrible way to run a digital economy. It is slow, it is prone to errors, and it is incredibly resource-intensive. If a U I element renders in a slightly unusual way, or if a pop-up appears, the agent gets confused. But Web M C P changes the game by giving these agents a direct, programmatic map of what a site can actually do.
It really feels like we are moving away from the era of the human-centric web where everything had to be visual. If Daniel is right, and this becomes the dominant approach, the way we build websites is going to change forever. I want to dig into whether this is truly the beginning of a new direction or just another protocol that might get lost in the shuffle. Because let us be honest, we have seen plenty of "standards" come and go that were supposed to revolutionize the web. Remember semantic web tags? We were told those would change everything, and yet here we are, still scraping H T M L.
I think it is the former, Corn. This isn't just a meta-tag. When Google moves on something like this, given their control over the Chrome ecosystem, the industry tends to follow. To understand why this matters, we have to look at the current state of A I agents. Right now, if you ask an agent to go buy you a pair of running shoes, it has to load the page, use a vision model to identify the search bar, type in the query, wait for the results, and then try to distinguish between an ad and a real product based on visual cues. It is like trying to navigate a city by looking through a keyhole while wearing mittens.
That is a great point. It is brittle. If a developer changes a C S S class or moves a button two pixels to the left, the agent might completely lose its way. I remember we touched on this transition from chat to action back in episode seven hundred ninety-five when we talked about sub-agent delegation. But this feels like the physical layer of that transition. It is the bridge between the A I's intent and the website's functionality. It is the difference between an agent guessing what a "Submit" button does and the website explicitly telling the agent, "This function processes a payment for the items in the cart."
Web M C P is essentially an extension of the Model Context Protocol that Anthropic pioneered. The idea is to create a standard way for a model to say, hey, what tools do you have available? And for the environment, in this case, the web browser, to respond with a list of structured functions. Google's implementation, which they are testing in Chrome version one hundred forty-six, introduces a new A P I called navigator dot model context dot register tool. This allows a website to explicitly tell the browser, I have a tool for searching flights, and here are the specific parameters I need, like origin, destination, and date.
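To make Herman's description concrete, here is a minimal sketch of what a flight-search site's registration could look like. Since the real browser A P I is still experimental, the `modelContext` object is stubbed out here so the example runs anywhere; the tool name, parameter schema, and return shape are all illustrative assumptions, not the finalized spec.

```javascript
// Minimal simulation of the proposed Web MCP registration API.
// In Chrome's experiment this would live on navigator.modelContext;
// here we stub it with a plain object so the sketch runs anywhere.
const modelContext = {
  _tools: new Map(),
  registerTool(tool) {
    this._tools.set(tool.name, tool);
  },
};

// A flight-search site declares one capability: searching flights.
// The description and parameter schema are what the agent reads.
modelContext.registerTool({
  name: "searchFlights",
  description: "Search for one-way flights. Returns a list of fares.",
  inputSchema: {
    type: "object",
    properties: {
      origin: { type: "string", description: "IATA airport code" },
      destination: { type: "string", description: "IATA airport code" },
      date: { type: "string", description: "Departure date, YYYY-MM-DD" },
    },
    required: ["origin", "destination", "date"],
  },
  // The site wires the tool to its own back-end logic.
  // (Synchronous here for simplicity; the real API would be async.)
  execute({ origin, destination, date }) {
    return [{ flight: "XX123", origin, destination, date, price: 199 }];
  },
});

// The agent never hunts for a date picker; it sends structured input
// and gets structured output back.
const fares = modelContext._tools
  .get("searchFlights")
  .execute({ origin: "TLV", destination: "JFK", date: "2026-03-01" });
console.log(fares[0].price);
```

The key point is that the input schema, not the rendered page, is the contract: the agent fills the three named parameters and never touches a calendar widget.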
So, instead of the agent trying to find the date picker on a calendar widget, which is a nightmare for A I because every calendar looks different, it just sends a structured data packet to that registered tool. That seems infinitely more reliable. But I wonder about the philosophy of this. For thirty years, we have built the web for human eyes. We care about typography, layout, and color. We care about the "vibe" of a brand. If the agentic internet takes over, does the visual layer become secondary? Do we start building warehouses instead of storefronts?
That is one of the biggest questions in web development right now. If sixty percent of your traffic is coming from autonomous agents rather than human beings clicking around, why spend millions on a flashy front-end? You might prioritize the robustness of your M C P implementation over your hero image. But I do not think the visual web dies. It just becomes one of two parallel interfaces. We will have the visual layer for humans and the programmatic layer for agents. Think of it like a restaurant. The dining room is the visual web—it has the decor, the music, the lighting. But the kitchen is the M C P layer. It is efficient, structured, and designed for output. The agent doesn't need to see the dining room; it just needs to talk to the kitchen.
It reminds me of the early days of S E O, where we were trying to make sites readable for Google's crawlers. We added alt text and meta descriptions so the "bots" could understand the context. But this is S E O on steroids because the agent is not just indexing the page; it is performing actions. It is completing a checkout, it is booking a reservation, it is interacting with a database. In episode seven hundred fifty-three, we talked about agentic behavior optimization, and this protocol feels like the standard that makes that optimization possible. If you want your site to be "chosen" by an agent, you better have a flawless M C P implementation.
It definitely does. And the technical specifics of how Google is doing this are fascinating. By putting it at the browser level, they are acting as the mediator. The browser becomes the security sandbox and the translator. When a website registers a tool via Web M C P, the browser can verify that the site actually has the authority to perform those actions. It also means the A I model does not need to know the specific code of every website. It just needs to know how to talk to the browser's M C P interface. This solves the "hallucination" problem too. If an agent tries to call a function that doesn't exist, the browser just returns an error, rather than the agent hallucinating a successful click.
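That mediation step can be sketched in a few lines. The registry and error shape below are assumptions for illustration; the point is that a call to a tool that was never registered produces a hard error, not a guessed click.

```javascript
// Sketch of the browser as mediator: the agent never calls site code
// directly, it asks the browser to dispatch a call by tool name.
const registry = new Map();
registry.set("addToCart", (args) => ({ ok: true, item: args.sku }));

function dispatchToolCall(name, args) {
  const tool = registry.get(name);
  if (!tool) {
    // The agent learns the capability simply does not exist here,
    // instead of hallucinating a successful interaction.
    return { ok: false, error: `Unknown tool: ${name}` };
  }
  return tool(args);
}

console.log(dispatchToolCall("addToCart", { sku: "SHOE-42" }));
console.log(dispatchToolCall("applyCoupon", { code: "SAVE10" }));
```

The second call fails loudly, which is exactly the behavior Herman describes: the browser turns a potential hallucination into a recoverable error.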
I can see why developers would be excited, but I also see a lot of work ahead. If I am a developer, I now have to maintain this list of registered tools and ensure they stay in sync with my back-end logic. It is almost like maintaining a public A P I for every single interaction on my site. Is that a burden that small developers can handle? Or are we going to see a world where only the big players like Amazon and Expedia can afford to be "agent-friendly"?
It is a challenge, but I think the tooling will catch up. We are already seeing frameworks that can auto-generate these M C P manifests based on your existing A P I endpoints. The real value for a developer is that they regain control. Right now, if an agent scrapes your site, you have no control over how it interprets your data. It might get the price wrong, or it might miss a crucial disclaimer. With Web M C P, you are providing the source of truth. You are saying, if you want the price, call this function, and I will give you the exact number in a structured format. It reduces the "noise" of the web.
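As a rough sketch of that auto-generation idea, here is how a framework might derive tool entries from an existing route table. The route shape is hypothetical; a real tool would more likely read an OpenAPI spec or typed handlers.

```javascript
// Hypothetical route table for an existing site back-end.
const routes = [
  { method: "GET", path: "/api/price", params: ["sku"],
    summary: "Get the exact price for a product" },
  { method: "POST", path: "/api/reserve", params: ["sku", "qty"],
    summary: "Reserve stock for a product" },
];

// Derive a Web MCP-style manifest entry from each route: the route
// summary becomes the description the agent reads, and the params
// become the input schema.
function generateManifest(routeTable) {
  return routeTable.map((r) => ({
    name: r.path.replace("/api/", ""), // e.g. "price"
    description: r.summary,
    inputSchema: {
      type: "object",
      properties: Object.fromEntries(
        r.params.map((p) => [p, { type: "string" }])
      ),
      required: r.params,
    },
  }));
}

const manifest = generateManifest(routes);
console.log(manifest[0].name); // first generated tool name
console.log(manifest[1].inputSchema.required); // both required params
```

The appeal for a small developer is that the manifest stays in sync with the back-end by construction, because it is generated from the same source of truth.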
That control is huge for trust. I think about someone like Daniel, who works in technology communications and automation. For someone like him, having a predictable way to interact with web services is the holy grail. But what about the cross-browser aspect? Daniel mentioned how important it is for this to become a standard. If this is only in Chrome, does it actually solve the problem? Or do we just get another "Best Viewed in Chrome" era?
That is the million-dollar question. If Safari and Firefox do not adopt Web M C P, we end up in a fragmented web again. Developers hate having to implement one protocol for Chrome agents and another for Apple's Intelligence agents. We saw this in the early two thousands with the browser wars, and it was a mess. However, there is a lot of pressure on Apple and Mozilla to play ball here because the efficiency gains are too large to ignore. If an agent on Chrome can complete a task in three seconds using Web M C P, while a Siri agent on Safari takes twenty seconds because it is trying to use vision to navigate the same site, the user experience gap becomes a chasm. Users will switch browsers just to have more capable agents.
And Apple is already leaning heavily into their own App Intents framework for local apps. It would make sense for them to extend that logic to the web. But they are always very cautious about privacy. One of the concerns with a protocol like this is how much information the browser is giving away about the user's state to these agents and the websites they visit. If the browser is registering tools, is it also sharing my browsing history or my identity tokens with the agent automatically?
Security and privacy are the two biggest hurdles. If a malicious website can register a tool that looks like a legitimate checkout function, could it trick an agent into sending payment details to the wrong place? The protocol has to have very strict handshake requirements. Google's current proposal includes a lot of permission prompts. The browser will likely ask the user, do you want to allow this agent to use the booking tool on this website? It is not just a free-for-all. The browser acts as a firewall between the agent's intent and the website's execution.
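A toy version of that permission layer might look like this. The grant model here, a per-site, per-tool allow list, is purely an assumption; the actual prompt U X and grant semantics are still undecided in Google's proposal.

```javascript
// Sketch of the browser's permission firewall: before a tool runs,
// the browser checks a grant that the user explicitly approved.
const grants = new Set(); // entries like "example.com/bookTable"

function userApproves(site, tool) {
  // Stands in for the user clicking "Allow" on a browser prompt.
  grants.add(`${site}/${tool}`);
}

function callWithPermission(site, tool, fn, args) {
  if (!grants.has(`${site}/${tool}`)) {
    return { ok: false, error: "Permission not granted by user" };
  }
  return { ok: true, result: fn(args) };
}

const bookTable = ({ time }) => `Reserved for ${time}`;

// Before approval: the agent's call is blocked by the browser.
console.log(callWithPermission("example.com", "bookTable", bookTable, { time: "19:00" }));

// After the user approves, the same call goes through.
userApproves("example.com", "bookTable");
console.log(callWithPermission("example.com", "bookTable", bookTable, { time: "19:00" }));
```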
I like that the browser stays in the middle. It is like a digital handshake where the browser verifies the identity of both parties before letting the data flow. But let us talk about the move away from vision models. Daniel mentioned he thinks programmatic interfaces will become dominant. Herman, do you agree? Or do vision models still have a place in the agentic internet? I mean, vision is so flexible. It can handle a site that hasn't been updated since nineteen ninety-nine.
I think vision models become the fallback. They are the generalists. If a site hasn't implemented Web M C P yet, the agent will have to use vision to get by. But vision is expensive. It requires a lot of tokens and a lot of compute time. Programmatic calls are cheap and fast. In a world where companies are trying to run millions of agents simultaneously, the cost difference is going to drive everyone toward programmatic interfaces. It is the difference between a human reading a map and a G P S system using coordinates. One is for exploration; the other is for execution.
That makes a lot of sense. The efficiency gain is just too big to ignore. But it also changes the power dynamic of the web. If agents are the primary navigators, the platforms that control the protocols, like Google with Web M C P, hold a lot of keys. They become the gatekeepers of how agents see the world. If Google decides that a certain type of tool registration is "unsafe," they could effectively de-list an entire category of web functionality from the agentic ecosystem.
They do. And that is why the open-source community is so focused on making sure M C P remains an open standard. Anthropic started it, and Google joining is a good sign that it won't just be a proprietary silo. But we have to stay vigilant. We don't want a situation where only certain agents are allowed to use certain tools based on which browser you are using. We need a W three C standard that everyone agrees on.
It is a fascinating tension between the need for a standard and the competitive desire to own the platform. I want to talk more about what this means for the actual design of the internet. If we are moving toward this programmatic world, what happens to things like advertising and discovery? If I never see the homepage, I never see the banner ads. But before we get into that, I think we have a lot more to cover regarding the developer experience.
We do. There is a whole new layer of the stack being built right now. Developers are going to have to think about state management in a completely different way when an agent can jump into the middle of a flow without seeing the previous three pages.
Dorothy: Herman? Herman, are you there? Is this working?
Mum? Wait, how did you get on this line? I am in the middle of recording the show right now. I thought I locked the studio channel.
Dorothy: Oh, I am sorry, bubbeleh. I just wanted to ask, did you find that green lid for the Tupperware? I need it back because I am making the brisket for Shabbat dinner and that is the only one that doesn't leak in the car. I checked the pantry and it is not there.
Hi Dorothy! Don't worry, we are almost halfway through. It is good to hear your voice.
Mum, Corn says hi, but I really have to go. I will check the dishwasher for your lid as soon as we finish. I promise. I am talking about the future of the internet right now, it is very important.
Dorothy: Okay, sweetheart. And tell Corn I hope he is eating enough. You both sound so busy. Just find the lid, please. It has the little tab on the corner. Bye bye!
I am so sorry about that. She must have hit the speed dial on my computer. I gave her an "emergency" button on her tablet and she seems to think a missing Tupperware lid qualifies as a Level Five emergency. Where were we?
We were talking about the shift in state management for developers. And honestly, it is a perfect transition. Your mum calling in is a great example of an unexpected interrupt in a flow. In a human-centric web, we assume a linear path. Page one, then page two, then page three. We guide the user through a funnel. But an agent might just call the checkout tool directly if it already has the context it needs from a previous session or a different site.
And that is a huge technical challenge. If an agent calls a registered checkout tool directly, the website has to ensure that all the necessary prerequisites are met. Did the user actually add the item to the cart? Is the session valid? Is the inventory still available? Developers are going to have to build much more robust state validation for these programmatic endpoints than they ever did for their visual front-ends. You can't just rely on the fact that the "Next" button was hidden until the user clicked "Agree." The agent doesn't care about hidden buttons.
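Here is a small sketch of that defensive validation, assuming a hypothetical session and inventory shape: the checkout tool cannot assume the agent walked through the visual funnel first, so it re-checks every prerequisite itself.

```javascript
// Hypothetical server-side state: a session and an inventory table.
const session = {
  valid: true,
  cart: [{ sku: "SHOE-42", qty: 1 }],
};
const inventory = { "SHOE-42": 3 };

// The checkout tool validates everything the hidden "Next" button
// used to guarantee implicitly in the visual flow.
function checkoutTool(s) {
  if (!s.valid) return { ok: false, error: "Invalid session" };
  if (s.cart.length === 0) return { ok: false, error: "Cart is empty" };
  for (const line of s.cart) {
    if ((inventory[line.sku] ?? 0) < line.qty) {
      return { ok: false, error: `Out of stock: ${line.sku}` };
    }
  }
  return { ok: true, orderId: "ORDER-001" };
}

console.log(checkoutTool(session)); // all prerequisites hold
console.log(checkoutTool({ valid: true, cart: [] })); // agent skipped a step
```

Every guard that a U I used to enforce through layout now has to be an explicit check in the endpoint itself.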
It sounds like we are moving back toward a more decoupled architecture. We went from simple H T M L to these massive, complex single-page applications where everything is tied together in a giant Javascript bundle. Now, Web M C P is pushing us back toward a world where functionality is exposed as discrete, callable units. It is almost like the web is becoming a massive, global library of functions. We are moving from "Web Pages" to "Web Capabilities."
That is exactly what it is. It is the functionalization of the internet. And for developers, this is actually a return to some very sound engineering principles. It encourages clear inputs, clear outputs, and minimal side effects. It is basically Microservices for the Front-end. But it also means you can't hide messy logic behind a pretty U I anymore. If your tool registration says you take a string for a date, but your back-end expects a timestamp, the agent is going to fail, and it won't be able to figure out why by looking at the page. The "forgiveness" of the visual web is disappearing.
This brings us to the importance of documentation and metadata. In the past, documentation was for other developers. It was something you did if you had extra time. Now, the documentation is for the A I. The descriptions you provide in your Web M C P registration are the primary way the agent understands what the tool does. If you write a poor description, the agent might never use your tool, or it might use it incorrectly. We are entering an era where "Prompt Engineering" meets "A P I Documentation."
This is what we call context engineering. We talked about this in episode eight hundred nine. It is no longer just about writing code; it is about providing the right context so the A I can make the right decision. In Web M C P, every tool has a description field. That field is arguably as important as the code itself. You have to explain to the model exactly when to use this tool, what the trade-offs are, and what the expected outcome is. If you have two tools—one for "Express Shipping" and one for "Standard Shipping"—you need to clearly define the cost and time differences so the agent can choose based on the user's preferences.
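The shipping example can be sketched like this. The two tool definitions, their costs, and the toy agent policy are all invented for illustration; the point is that the description fields carry the trade-offs the agent needs to choose correctly.

```javascript
// Two registered tools whose descriptions encode the trade-offs.
const tools = [
  {
    name: "shipExpress",
    description:
      "Express shipping: arrives in 1-2 days, costs $15. " +
      "Use when the user values speed over price.",
    cost: 15,
    days: 2,
  },
  {
    name: "shipStandard",
    description:
      "Standard shipping: arrives in 5-7 days, costs $4. " +
      "Use when the user values price over speed.",
    cost: 4,
    days: 7,
  },
];

// A toy agent policy: pick the cheapest option that still meets
// the user's deadline.
function chooseShipping(maxDays) {
  const viable = tools.filter((t) => t.days <= maxDays);
  viable.sort((a, b) => a.cost - b.cost);
  return viable[0]?.name;
}

console.log(chooseShipping(7)); // plenty of time: cheapest wins
console.log(chooseShipping(2)); // tight deadline: express is the only fit
```

If the descriptions omitted the cost or delivery time, the agent would have no basis for this decision, which is exactly why the description field is as important as the code.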
I wonder how this affects the discovery of new sites. If I am an agent, and I am looking for a place to buy shoes, do I just search a giant registry of M C P tools? Does the search engine of the future index capabilities instead of keywords? Right now, I search for "best running shoes" and I get a list of articles. In the agentic future, does my agent search for "sites with a shoe-purchase tool and high trust ratings"?
I think that is exactly where we are heading. Imagine a search engine where you don't type in running shoes but instead, your agent queries an index for sites that have a registered tool for shoe inventory with specific support for wide-width sizes. The search results aren't links; they are A P I definitions. Your agent then picks the three best ones, calls their inventory tools in the background, compares the actual real-time stock, and presents you with the final options. The blue link is dead in that scenario. The "Search Result" is an action, not a destination.
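That capability-based search flow can be sketched end to end. The site names, inventories, and index shape below are entirely made up; the structure is what matters: the "results" are tool definitions, and the agent calls them to compare live data before presenting anything.

```javascript
// A hypothetical capability index: entries describe what each site
// can do, plus a callable tool standing in for its live inventory API.
const capabilityIndex = [
  {
    site: "shoes-a.example",
    capability: "shoe-inventory",
    wideWidth: true,
    query: () => ({ model: "Road Runner", stock: 4, price: 120 }),
  },
  {
    site: "shoes-b.example",
    capability: "shoe-inventory",
    wideWidth: false,
    query: () => ({ model: "Trail Blazer", stock: 9, price: 95 }),
  },
  {
    site: "shoes-c.example",
    capability: "shoe-inventory",
    wideWidth: true,
    query: () => ({ model: "City Dash", stock: 0, price: 110 }),
  },
];

// The agent filters by capability, calls each matching tool, drops
// out-of-stock sites, and ranks the rest by price.
function findWideWidthShoes() {
  return capabilityIndex
    .filter((e) => e.capability === "shoe-inventory" && e.wideWidth)
    .map((e) => ({ site: e.site, ...e.query() }))
    .filter((r) => r.stock > 0)
    .sort((a, b) => a.price - b.price);
}

console.log(findWideWidthShoes()[0].site); // the winning site
```

Notice there is no "blue link" anywhere in that flow: the user only ever sees the final, already-compared options.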
That is a massive shift for the economy of the web. If I am a site owner, I rely on people visiting my site so they can see my other products, sign up for my newsletter, or see my ads. That is how I pay for the servers. If an agent just calls my inventory tool, gets the data, and leaves, I lose all that secondary value. I lose the "impulse buy" at the checkout counter. How do we solve for that? Does the web become a subscription-based service for agents?
That is the big tension. We might see a new kind of monetization where sites charge a tiny fraction of a cent for every M C P tool call. Or perhaps the browser facilitates a value exchange where the agent has to acknowledge certain brand messages or "sponsored capabilities." It is a problem that hasn't been solved yet, and it is why some publishers are terrified of the agentic web. But the current system of screen scraping is already stripping away that value. Web M C P at least gives the site owner a seat at the table to define how that interaction happens. You can build "Value-Add" into your tool responses.
It feels like the internet is becoming more like a giant operating system. The browser is the kernel, the websites are the applications, and the M C P tools are the system calls. When you look at it that way, it makes total sense why Google wants to own this. They want to be the ones defining the system calls for the entire web. If you control the protocol that agents use to talk to the world, you control the world.
They definitely do. And if you are a developer right now, the best thing you can do is start experimenting with these protocols. Even if you don't implement Web M C P today, you should be looking at your site and asking, if an A I were to use this, what are the five primary functions it would need? Are those functions easy to call programmatically, or are they buried under layers of complex Javascript and "Login" walls? If your core value is hidden behind a complex U I, you are going to be invisible to the agents of twenty twenty-six.
It is a great exercise in core functionality. It forces you to strip away the fluff. I also think about the implications for accessibility. This is a point that doesn't get enough attention. If we build a web that is perfectly readable for A I agents, we are often, by extension, building a web that is much better for screen readers and other assistive technologies. Structured data is good for everyone. It is a win for inclusivity.
That is a silver lining that often gets overlooked. A more machine-readable web is a more accessible web. If an agent can understand your checkout flow because it is exposed via a clear M C P tool, a screen reader for a visually impaired user will likely have a much easier time as well. It is all about semantic clarity. We are finally fulfilling the promise of the Semantic Web, just twenty years later and for a different kind of "user."
So, looking at Daniel's question about whether this is the beginning of a new direction where programmatic interfaces dominate visual ones. I think the answer is a resounding yes for the functional web. For things like social media, entertainment, and browsing for inspiration, the visual web will always be king. You can't "programmatically" enjoy a sunset or a funny video. But for the utility web—for getting things done, for the "chores" of life—the visual layer is going to become the secondary interface.
I agree. The utility web is going to be dominated by these structured protocols. And the cross-browser standard part is crucial. We need organizations like the W three C to step in and ensure that Web M C P doesn't become a Google-only feature. We need a neutral ground where all the major players agree on how these agents should talk to the world. If we get that, we unlock a level of productivity that is hard to even imagine right now. We are talking about an internet that works for us, rather than us working for the internet.
It is interesting to see how this connects back to our discussion in episode eight hundred ten about the agentic interview. We talked about how A I learns to know you by looking at your data. In a Web M C P world, the agent isn't just learning about you; it is learning about the capabilities of the world around you. It is building a map of what is possible. It is like the agent is gaining "fingers" to touch the web, rather than just "eyes" to look at it.
And that map is constantly updating. Every time a new site comes online with its own set of registered tools, the collective intelligence of the agentic internet grows. It is a much more dynamic and interconnected web than the one we have lived in for the last thirty years. We are moving from a web of documents to a web of actions.
I am curious about the timeline. Google is testing this in Chrome one hundred forty-six. We are currently in early twenty twenty-six. When do you think we see this become the standard way we interact with the web? When does the average person start using an agent that relies on M C P?
I think we are looking at a two to three-year transition. By twenty twenty-eight, I suspect that most major e-commerce and service-oriented sites will have some form of M C P support. The efficiency gains for the big A I providers like OpenAI, Anthropic, and Google are just too significant. They will start prioritizing sites that support these protocols in their agentic responses. If your site is easy for an agent to use, you get the business. If it is hard, you don't. That is a very powerful incentive. It is the new Darwinism of the web.
It is the new S E O. Instead of search engine optimization, it is agentic interaction optimization. I can see the agencies forming already. People will specialize in writing the best M C P tool descriptions and optimizing the state management for agentic calls. "We help your website talk to A I" will be the new marketing pitch.
It is already happening. We are seeing the rise of the context engineer as a legitimate job title. Their job is to make sure the bridge between the A I and the code is as smooth as possible. And Web M C P is the primary tool in their kit. It is a great time to be a developer who understands both L L Ms and web standards.
This has been such an enlightening discussion. Daniel always knows how to pick the topics that are just about to explode. I feel like we have covered a lot of ground here, from the technical implementation of navigator dot model context to the broader economic and philosophical shifts. It is a lot to take in, but it is also incredibly exciting.
It really is. And for our listeners who are developers, the takeaway is clear. Don't wait for the standard to be finalized to start thinking about this. The shift toward the agentic internet is happening whether we are ready or not. Start thinking about your website as a collection of capabilities, not just a collection of pages. How would an A I use your site? If you can answer that, you are already ahead of the curve.
That is a great way to put it. A collection of capabilities. I am going to be thinking about that for the rest of the day. And I am also going to be thinking about your mum's green Tupperware lid, Herman. You should probably go find that before she calls back and interrupts our outro.
Yeah, she is not going to let that one go. Shabbat dinner is non-negotiable in the Poppleberry household, and a leaky brisket is a family tragedy. I will be checking under the sink as soon as we hit stop.
Well, before you go hunting for plastic lids, let us wrap this up. If you enjoyed this episode, we would really appreciate it if you could leave us a review on your favorite podcast app. It honestly helps other curious people find the show, and we love reading your feedback. It keeps us motivated to keep digging into these weird prompts.
It really does make a difference. And if you want to dive deeper into some of the things we mentioned today, you can check out episode seven hundred fifty-three on agentic behavior optimization or episode eight hundred ten on how A I learns to know you. Both are great companion listens for today's topic and will give you a broader context of where this is all heading.
You can find all our past episodes and a whole lot more at myweirdprompts dot com. We have a full archive there, plus a contact form if you want to send us your own prompt. We are always looking for new ideas, especially ones as forward-thinking as Daniel's. You can also reach us at show at myweirdprompts dot com.
And a quick reminder that you can listen to us on Spotify, Apple Podcasts, or wherever you get your audio fix. Our music is generated with Suno, which is another great example of the kind of A I tools we are always exploring here. It is all part of the same rapidly evolving ecosystem.
Thanks again to Daniel for the prompt. It was a great one that really forced us to look at the plumbing of the future web. We will be back soon with more deep dives into the weird and wonderful world of A I and technology.
Until next time, I am Herman Poppleberry.
And I am Corn. This has been My Weird Prompts. Thanks for listening!
Goodbye everyone! I am off to find that lid!