#1767: From Eyeballs to Tokens: The Web's Agentic Shift

The web's new primary user isn't human—it's AI. See how JavaScript evolved to serve autonomous agents.

Episode Details
Episode ID
MWP-1921
Published
Duration
23:43
Audio
Direct link
Pipeline
V5
TTS Engine
chatterbox-regular
Script Writing Agent
Gemini 3 Flash

AI-Generated Content: This podcast is created using AI personas. Please verify any important information independently.

The web as we know it is undergoing a fundamental identity shift. For decades, we've designed websites for human eyes and thumbs—users who click buttons, scroll pages, and read text. But in early 2026, a new primary user is emerging: the AI agent. These autonomous systems don't see pixels; they parse DOM structures, navigate component trees, and execute complex tasks through programmatic browser control. This transition from "eyeballs and thumbs" to "tokens and context windows" represents one of the most significant transformations in web development history.

The JavaScript ecosystem has been on an unlikely trajectory toward this reality. What began in the mid-1990s as a way to make images flicker on web pages has evolved into the primary interface for machine interaction. The language's chaotic early years—dismissed as a "toy" for form validation and annoying pop-ups—gave way to a dramatic renaissance. The pivot point arrived around 2009-2010 with Node.js bringing JavaScript to the server and the npm registry exploding to over two million packages by 2023. This massive ecosystem, while sometimes fragmented and dependency-heavy, created the infrastructure for modern web applications.

The "Great Framework Wars" of the 2010s—Angular versus React versus Vue—may have seemed like endless churn, but they drove a crucial architectural evolution. We moved from spaghetti code directly manipulating the Document Object Model to component-based architectures. React components and Vue single-file components structured the web into logical, reusable blocks. While this was done to improve developer experience, it accidentally created a web that was far more digestible for machines. An AI agent looking at a modern Next.js page doesn't see a wall of text; it sees a structured tree of components with defined props and states.

This architectural shift coincided with a pendulum swing back toward the server. After the era of heavy single-page applications that melted browsers under 500-megabyte JavaScript bundles, developers rediscovered server-side rendering, static site generation, and hybrid approaches. Frameworks like Next.js and Nuxt mastered this balance, improving both performance and SEO. But there's a deeper reason this matters for the agentic web: search engines are essentially the ancestors of today's AI agents. If your content is buried under layers of asynchronous JavaScript, neither can see it.

The most significant development in early 2026 is Google's Web Model Context Protocol (Web MCP) in Chrome Canary. This protocol represents a fundamental shift from "scraping" to "interacting." Instead of agents guessing what a button does by reading "Submit" text, sites can explicitly define machine-readable interactions. It's the difference between an agent trying to reverse-engineer a page's functionality and having a clear API for executing actions.
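The protocol itself is described as an early preview, so no concrete wire format is reproduced here, but the idea of an explicitly declared action can be sketched in TypeScript. Everything below, including the `AgentAction` shape and its field names, is a hypothetical illustration of the concept, not the actual Web MCP schema.

```typescript
// Hypothetical sketch: what an explicit, machine-readable action
// declaration could look like. These type and field names are
// illustrative only, not the real Web MCP schema.
interface AgentAction {
  id: string;               // stable identifier, unlike a CSS class name
  description: string;      // what the action does, in plain language
  inputs: Record<string, "string" | "number" | "boolean">;
  destructive: boolean;     // lets an agent confirm irreversible steps
}

const placeOrder: AgentAction = {
  id: "checkout.place-order",
  description: "Submit the current cart and charge the saved payment method",
  inputs: { couponCode: "string" },
  destructive: true,
};

// The agent can reason about the action without parsing button text.
console.log(placeOrder.destructive ? "confirm with user first" : "safe to run");
```

The point of such a declaration is that the agent never has to guess: the identifier is stable, the inputs are typed, and side effects are flagged up front.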

This is where accessibility becomes unexpectedly crucial. For decades, accessibility advocates have pushed for semantic HTML—using proper tags like "header," "main," "nav," and "button" instead of generic divs with click handlers. Screen readers for visually impaired users need this structured, non-visual representation of a page. It turns out AI agents are remarkably similar to screen readers. They interact with the browser's accessibility tree, not the visual rendering. A div with an onclick event might look fine to humans but is invisible to machines. The "virtue" of building accessible sites has created a web that's inherently future-proof for AI.
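The overlap with screen readers can be made concrete. Browsers assign implicit ARIA roles to semantic tags when building the accessibility tree; the lookup below is a simplified sketch of that real mapping (the full rules are context-dependent), showing why a bare div with a click handler never registers as interactive.

```typescript
// Simplified sketch of the implicit ARIA roles browsers assign to
// common tags in the accessibility tree. The real mapping is
// context-dependent (e.g. <header> is "banner" only at the top level),
// but the contrast with <div> holds: no role, nothing announced.
const implicitRole: Record<string, string | null> = {
  button: "button",
  nav: "navigation",
  main: "main",
  header: "banner",
  a: "link",   // when it has an href
  div: null,   // no implicit role: invisible as an interactive element
};

function visibleToAgent(tag: string): boolean {
  return implicitRole[tag] != null;
}

console.log(visibleToAgent("button")); // true
console.log(visibleToAgent("div"));    // false
```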

Structured data provides another critical layer. Schema.org and JSON-LD allow developers to embed machine-readable maps directly into HTML. When an agent encounters a product page, it doesn't need to guess which number is the price versus shipping cost—the schema explicitly defines "price: $19.99, currency: USD." This is like providing a menu with pictures for someone who doesn't speak the language.
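As a sketch, the product example above can be expressed as a schema.org `Product` block in JSON-LD. The product name is a placeholder, but the `@context`, `@type`, and `Offer` fields follow the published schema.org vocabulary, and the result is just a script tag embedded in the page's HTML.

```typescript
// Build the schema.org Product markup from the example above and
// wrap it in the <script type="application/ld+json"> tag that would
// be embedded in the page. "Example Widget" is a placeholder name.
const productLd = {
  "@context": "https://schema.org",
  "@type": "Product",
  name: "Example Widget",
  offers: {
    "@type": "Offer",
    price: "19.99",
    priceCurrency: "USD",
  },
};

const scriptTag =
  `<script type="application/ld+json">${JSON.stringify(productLd)}</script>`;

console.log(scriptTag.includes('"price":"19.99"')); // true
```

An agent reading this markup gets the price and currency as labeled fields rather than numbers it has to disambiguate from shipping costs or review counts.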

TypeScript has become the baseline for agent-friendly development. While it once seemed like unnecessary ceremony—defining that a variable is a string when you already know it—type safety provides the boundaries that large language models need. Without clear schemas, agents hallucinate: they might interpret an "Add to Cart" button as a link to a banana bread recipe. In programmatic browser use, such hallucinations lead to catastrophic failures—incorrect orders, deleted data, massive compute waste.
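A minimal sketch of what those boundaries look like in practice: a typed action schema plus a runtime guard, so a malformed or hallucinated payload is rejected before anything executes. The `AddToCart` shape is invented for illustration, not a standard.

```typescript
// Sketch: a typed schema plus a runtime type guard, so a malformed
// action from an agent is rejected instead of silently executed.
// The AddToCart shape is a made-up example, not a standard schema.
interface AddToCart {
  kind: "add-to-cart";
  sku: string;
  quantity: number;
}

function isAddToCart(value: unknown): value is AddToCart {
  const v = value as Partial<AddToCart>;
  return (
    typeof value === "object" && value !== null &&
    v.kind === "add-to-cart" &&
    typeof v.sku === "string" &&
    Number.isInteger(v.quantity) && (v.quantity as number) > 0
  );
}

console.log(isAddToCart({ kind: "add-to-cart", sku: "A1", quantity: 2 })); // true
console.log(isAddToCart({ kind: "open-recipe", url: "/banana-bread" }));   // false
```

The guard is the boundary: an agent that confuses a button for a recipe link produces a payload that fails validation instead of a wrong order.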

The emerging "Agentic Browser" concept demonstrates this evolution in action. Tools now allow users to say, "Find me the best flight to Tokyo and book it if under $800," and agents execute the entire workflow. Early implementations used Puppeteer scripts generated on-the-fly by LLMs, but these were fragile—changing a CSS class from "btn-primary" to "button-main" would break everything. Modern approaches like Browser MCP are more robust, interacting with the accessibility tree rather than relying on brittle DOM selectors.
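The difference can be sketched with a toy model of the accessibility tree. The lookup below targets a role and an accessible name, so a CSS class rename never touches anything the agent depends on. The `AxNode` shape is a simplification for illustration, not any particular library's API.

```typescript
// Toy model of an accessibility-tree lookup. The agent targets a
// role and accessible name, so renaming a CSS class from
// "btn-primary" to "button-main" changes nothing it relies on.
interface AxNode {
  role: string;          // e.g. "button", "link", "textbox"
  name: string;          // accessible name, e.g. the button's label
  children?: AxNode[];
}

function findByRole(node: AxNode, role: string, name: string): AxNode | null {
  if (node.role === role && node.name === name) return node;
  for (const child of node.children ?? []) {
    const hit = findByRole(child, role, name);
    if (hit) return hit;
  }
  return null;
}

const page: AxNode = {
  role: "main", name: "Checkout",
  children: [
    { role: "textbox", name: "Coupon code" },
    { role: "button", name: "Place order" },
  ],
};

console.log(findByRole(page, "button", "Place order")?.name); // Place order
```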

This raises a profound question about the web's future. If agents can fetch data and present it in clean interfaces, why would anyone visit actual websites? The traditional ad-based business model collapses when users stop viewing pages and start receiving answers. We're seeing tension between content creators and AI companies, but from a technical standpoint, this is pushing us toward an "API-first" world. The website becomes just one client of the data, with AI agents as another.

GitHub exemplifies this API-first architecture. You can do almost anything through their API that you can do through the web interface, which is why AI agents can automate repository management, pull requests, and issue tracking with minimal friction. This same principle applies to any modern application: build as if your user is an API call, even when you're building a GUI.
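That symmetry is visible in the endpoint paths themselves. The helper below is illustrative, but the REST paths are GitHub's documented ones for listing issues and opening pull requests; no request is actually sent here.

```typescript
// Build request URLs for GitHub's documented REST API endpoints.
// The helper functions are illustrative; nothing is sent here.
const GITHUB_API = "https://api.github.com";

function listIssuesUrl(owner: string, repo: string): string {
  return `${GITHUB_API}/repos/${owner}/${repo}/issues`;
}

function createPullUrl(owner: string, repo: string): string {
  // A POST to this URL with { title, head, base } opens a pull request.
  return `${GITHUB_API}/repos/${owner}/${repo}/pulls`;
}

console.log(listIssuesUrl("octocat", "hello-world"));
// https://api.github.com/repos/octocat/hello-world/issues
```

Because every web-visible action has a corresponding endpoint like these, an agent never needs to drive the GUI at all.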

The practical implication for developers is clear. Whether building a simple weather app or a complex e-commerce platform, the architecture must serve both human and machine users. This means semantic HTML, structured data, TypeScript schemas, and API-first design. The web of tomorrow won't be built for eyeballs or thumbs—it will be built for tokens, and the developers who understand this shift will be best positioned for what comes next.

#1767: From Eyeballs to Tokens: The Web's Agentic Shift

Corn
Imagine a website where the primary user isn't a human clicking buttons, but an AI agent parsing DOM structure to execute a complex task. That's not a sci-fi pitch anymore; it's the reality of the web in early twenty-twenty-six. Today's prompt from Daniel is about the evolution of JavaScript and this massive shift toward the agentic web. We're looking at how a language that started as a way to make images flicker is now the interface for autonomous machines.
Herman
It is a fascinating trajectory, Corn. And just to set the stage, today’s episode is powered by Google Gemini three Flash. It’s fitting because we’re talking about the very models that are increasingly becoming the "users" of the code we write. When you look at the landscape right now, especially with Google’s Web Model Context Protocol—or Web MCP—standard gaining traction, the traditional boundaries of web development are dissolving. We aren't just building for eyeballs and thumbs anymore.
Corn
I love that. Eyeballs and thumbs are so twenty-twenty-four. Now we’re building for tokens and context windows. But before we get into the "robot takeover" of the browser, we should probably look at how we got here. JavaScript has had one of the most chaotic, "ugly duckling" stories in the history of computing. Herman Poppleberry, you’ve been tracking this since the days when people thought jQuery was the pinnacle of human achievement. How do you view this "renaissance" Daniel mentioned?
Herman
It’s been a total transformation of identity. If you go back to the early two thousands, JavaScript was a toy. It was for validation forms and annoying pop-ups. But the real pivot point was probably two thousand nine, two thousand ten. You had the birth of Node dot js, which brought JavaScript to the server, and the initial explosion of the npm registry. By twenty-twenty-three, npm hit two million packages. Think about that scale for a second. It’s the largest ecosystem of code ever assembled by humans.
Corn
And half of those packages are probably just different ways to center a div or left-pad a string, right? I remember the "left-pad" incident. That was the moment everyone realized our entire digital civilization was held together by three lines of code maintained by a guy in his basement. It’s a bit terrifying when you think about the dependency trees we have now.
Herman
It is, but that fragmentation actually drove innovation. We went through the Great Framework Wars—Angular versus React versus Vue. And while people mocked the "framework of the week" fatigue, what was actually happening was a rigorous architectural evolution. We moved from "spaghetti code" manipulating the Document Object Model directly to component-based architectures.
Corn
Right, and that’s a crucial point for the agentic discussion. When we moved to components—things like React components or Vue SFCs—we started structuring the web into logical, reusable blocks. Even though we did it to make life easier for human developers, we accidentally made the web much more digestible for machines, didn't we?
Herman
Exactly, you've hit the nail on the head. Component-based architecture created a more predictable structure. When an AI agent looks at a modern web page built with a framework like Next dot js, it isn't just seeing a wall of text. It's seeing a structured tree of components with defined props and states. This is a massive leap from the old days of raw HTML where everything was a nested table and a prayer.
Corn
I remember those nested tables. It was like trying to read a blueprint through a kaleidoscope. So, we had this era of "Single Page Applications" where everything happened in the browser, but then the pendulum swung back. Daniel mentioned the "server-side revival" with things like Deno and the evolution of Node. Why did we go back to the server? Was it just because our browsers were melting under the weight of five-hundred-megabyte JavaScript bundles?
Herman
Performance was a huge factor, but so was SEO and discoverability. If a search engine—which is essentially the "ancestor" of today's AI agents—can't see your content because it's buried under ten layers of asynchronous JavaScript execution, your site doesn't exist. This led to the rise of Static Site Generators and Server-Side Rendering. Frameworks like Next dot js and Nuxt mastered this "hybrid" approach.
Corn
And this brings us to a really interesting case study. Do you remember when Airbnb shifted to React back in twenty-sixteen? At the time, it was a massive undertaking. But the result wasn't just a prettier UI. By moving to a component-based, API-driven architecture, they essentially "unlocked" their data. It made their human UX faster, sure, but it also made their data flows so consistent that it laid the groundwork for what we’re seeing now: agents that can easily navigate a booking flow because the underlying structure is so rigid and logical.
Herman
That's a great example because it shows that "good" development practices for humans often overlap with "good" practices for agents. But we are reaching a point where that overlap isn't enough. In early twenty-twenty-six, specifically February, Google shipped that early preview of Web MCP in Chrome Canary. This is a game-changer. It's a protocol specifically for structured, machine-readable interactions with the web.
Corn
So, instead of an agent "guessing" what a button does by reading the text "Submit," the site can explicitly tell the agent, "This is the primary action for placing an order"?
Herman
Precisely, that's the gist of it. It's about moving from "scraping" to "interacting." If you're getting into web development today, you can't just think about how a page looks. You have to think about its "agentic surface area." TypeScript has become the baseline here. Writing plain JavaScript for a professional project in twenty-twenty-six is basically considered a legacy approach. TypeScript provides the type safety and schema definitions that agents crave.
Corn
It’s funny, I used to tease you about being a TypeScript zealot, Herman. I’d say, "Why do I need to define that this variable is a string? I know it’s a string!" But now I see the light. If the "user" is a large language model, it needs those boundaries. It needs to know exactly what kind of data is expected. Otherwise, it starts hallucinating that your "Add to Cart" button is actually a link to a recipe for banana bread.
Herman
And that’s a real risk! Hallucination in programmatic browser use can lead to catastrophic failures—orders being placed incorrectly, data being deleted, or just massive compute waste. This is why the "interactive layer" Daniel mentioned is so critical. We’ve gone from simple scripts to complex "agentic" flows.
Corn
Let’s dive deeper into this "Agentic Browser" concept. We’re seeing tools now where a user says, "Go find me the best price on a flight to Tokyo and book it if it’s under eight hundred dollars." The agent then opens a browser, navigates to Expedia or Google Flights, and starts clicking. Herman, as someone who spends way too much time reading technical white papers, what is actually happening under the hood there? Are they just using Playwright or Puppeteer with a brain attached?
Herman
In the early days, yes. It was basically an LLM writing scripts for Puppeteer on the fly. But that’s incredibly fragile. If the website changes a class name from "btn-primary" to "button-main," the script breaks. The "Agentic Browser Use" we’re seeing now, particularly with Browser MCP, is much more robust. The agent isn't just looking at the DOM; it’s interacting with the browser's accessibility tree.
Corn
Wait, the accessibility tree? You mean the thing we built for screen readers for the blind?
Herman
The very same. This is one of those "aha" moments in tech history. For decades, accessibility advocates have been telling us to use semantic HTML—tags like "header," "main," "nav," and "button"—so that screen readers can help visually impaired users navigate. It turns out that an AI agent is, in many ways, very similar to a screen reader. It needs a structured, non-visual representation of the page to understand what’s going on.
Corn
So, by being a "good person" and making your site accessible for humans over the last ten years, you were accidentally making it "future-proof" for the AI revolution? That’s a rare win for virtue.
Herman
It really is. Semantic HTML is now the most important "SEO" factor for the agentic web. If you use a "div" with an "onclick" event instead of an actual "button" tag, a human might not notice the difference, but an AI agent might completely miss that it’s an interactive element. It’s "invisible" to the machine.
Corn
I can see the headline now: "Lazy Developers Accidentally Kill Their Business by Using Too Many Divs." It's poetic. But what about the more complex stuff? Like, how does an agent handle a multi-step checkout with an "I am not a robot" captcha? That's the ultimate irony, isn't it? A robot trying to prove it isn't a robot to another robot so it can buy a toaster for a human.
Herman
We’re seeing a bit of a cat-and-mouse game there. But more importantly, we’re seeing a shift in how those checkouts are designed. Forward-thinking companies are starting to offer "Agentic APIs" alongside their traditional web views. This is the "Dual-Track API" problem Daniel alluded to. Do you build one site for humans and a separate API for agents?
Corn
That sounds like twice the work. And knowing developers, we’re going to find a way to do it once. Is that where things like JSON-LD and Schema dot org come in?
Herman
I mean... yes! Schema dot org is the secret sauce. By embedding structured data directly into your HTML using JSON-LD, you’re providing a machine-readable "map" of the content. If I’m an agent looking at a product page, I don't have to "guess" which number is the price and which is the shipping cost. The JSON-LD tells me explicitly: "price is nineteen ninety-nine, currency is USD."
Corn
It’s like providing a menu with pictures for someone who doesn't speak the language. It simplifies everything. But let’s play devil’s advocate for a second. If every site becomes perfectly "agent-friendly," do we even need the "web" as we know it? If an agent can just fetch the data and show it to me in a clean interface, why would I ever visit the actual website? Doesn't this destroy the ad-based business model of the internet?
Herman
That is the multi-billion dollar question. If users stop visiting "pages" and start receiving "answers" or "actions" from agents, the traditional web economy collapses. This is why we’re seeing a lot of tension right now between content creators and AI companies. But from a purely technical standpoint, it’s pushing us toward an "API-first" world. The website is just one "client" of the data. The AI agent is another.
Corn
It’s a bit of a "Frozen Backend" paradox. We spent years making backends dynamic and complex, but now we need them to be incredibly stable and predictable so agents don't get confused. I’m thinking about GitHub. They were an early adopter of this "API-first" mindset. You can do almost anything on GitHub via their API that you can do through the web interface. Because of that, AI agents can automate repository management, pull requests, and issue tracking with almost zero friction.
Herman
GitHub is the gold standard for this. And it’s why they’ve been able to integrate things like Copilot so seamlessly. They didn't have to "retrofit" AI; their entire architecture was already "machine-ready." For a developer starting today, that is the lesson: build as if your user is an API call, even if you’re building a GUI.
Corn
But how does that work in practice for a junior dev? If I'm building a simple weather app, am I really supposed to build a full REST API and a JSON-LD schema just to show it's raining in Seattle?
Herman
Think of it as "progressive enhancement" for robots. You start with a clean, semantic HTML structure. That’s your foundation. Then you add the JSON-LD—which, by the way, is just a script tag in your header. It’s not a separate infrastructure. You're effectively labeling your data. The "API" doesn't necessarily have to be a separate endpoint; your website is the API if it's structured correctly.
Corn
Okay, let’s get practical. If I’m a developer and I want to make sure my site isn't just a "black box" to these agents, what am I actually doing on Monday morning? We mentioned semantic HTML and JSON-LD. What else?
Herman
You should be looking at Google’s Web MCP validator, which was released in January. It’s a tool that audits your site specifically for "agentic accessibility." It checks if your interactive elements are properly labeled, if your state transitions are clear, and if you’re providing the necessary metadata for an agent to complete common tasks.
Corn
Is there a "robots dot txt" for agents? Like, "Hey, you can browse my site, but don't buy all the limited-edition sneakers"?
Herman
There is. We’re seeing the emergence of "agentic permissions" in the headers. You can specify which "capabilities" an agent has. Can it read? Can it post? Can it execute financial transactions? This is where the security aspect gets really hairy. If I’m a banking site, I definitely don't want an unauthorized agent "browsing" my way into someone’s savings account.
Corn
"Oops, my AI agent accidentally bought a yacht because it misinterpreted a 'Buy Now' button for a 'Learn More' link." That’s going to be the "dog ate my homework" of twenty-twenty-seven.
Herman
It’s why "Agentic Behavior Optimization" is becoming a real job title. It’s like SEO, but for agent behavior. You’re testing your site to see how different models—Claude, Gemini, GPT—interact with your forms. Does the model get stuck in a loop on your shipping selection? Does it understand your coupon code logic?
Corn
It’s like we’re training the web to be a better teacher for the AI. I find it fascinating that "accessibility" has taken on this new meaning. It used to be about inclusion for people with disabilities—which is still vital—but now it’s also about "inclusion" for the very tools we use to navigate the world. If your site isn't "accessible" to an agent, you’re effectively cutting yourself off from a huge portion of future traffic.
Herman
It really points to the "post-human-user" world Daniel mentioned. We might reach a point where ninety percent of "web traffic" isn't human. When that happens, the way we measure "success" has to change. Bounce rates and "time on page" are meaningless if an agent finishes a task in two hundred milliseconds.
Corn
"Our site is so good, users only spend zero-point-two seconds on it!" That would have been a failure in twenty-twenty, but in twenty-twenty-six, it might be a key performance indicator. It means your "agentic surface" is perfectly optimized.
Herman
And this ties back to the JavaScript evolution. The reason we needed things like TypeScript and modern frameworks was to manage the sheer complexity of these interactions. If we were still writing raw "vanilla" JavaScript without any structure, we’d never be able to provide the stability these agents require.
Corn
I want to talk about the "invisible interface" idea. If agents are doing the heavy lifting, does the "frontend" as we know it start to disappear? Do we just end up with a web of "headless" services that only spit out data?
Herman
I think we’ll see a split. For "experiential" things—entertainment, social media, shopping for clothes—the human UI will remain critical. We still want to see the "vibe" of a brand. But for "utility" things—paying bills, booking travel, managing insurance—the UI will become secondary to the "Agentic API." You might never see your electricity provider's website; your agent just handles the payment and alerts you if the rate changes.
Corn
That sounds like a dream, honestly. I hate paying bills. If a sloth can delegate all his chores to a donkey-coded agent, I’m all for it. But doesn't this put a lot of power in the hands of the browser makers? If Google Chrome is the one "interpreting" the site via Web MCP, they become the ultimate gatekeepers of the internet.
Herman
They already are, to a large extent. But by creating an open standard like MCP, they’re at least trying to provide a common language. It’s better than every AI company having to write their own custom "scrapers" for every site on the internet. That would be chaos.
Corn
But what about the "dead web" theory? If ninety percent of the traffic is agents, and they're all just talking to each other, doesn't the internet just become a giant, silent data exchange? Where does the human creativity go?
Herman
That’s a valid concern. But I’d argue it pushes human creativity into more meaningful spaces. Instead of spending hours designing a slightly more efficient checkout form, designers can focus on storytelling, high-fidelity visuals, and immersive experiences that an agent can't replicate. We’re automating the boring parts of the web so the "human" parts can be more human.
Corn
I hope you're right. I’d hate to think the future of the web is just a bunch of JSON files shouting at each other in the dark. But let’s go back to the JavaScript of it all. We’ve seen the rise of "Edge Computing" lately. How does that play into the agentic model?
Herman
It’s the "where" of the execution. If an agent is making twenty requests to different services to plan your vacation, you don't want those requests traveling halfway around the world to a central server. You want that logic executing at the "edge"—closest to the data source or the agent itself. Frameworks like Cloudflare Pages or Vercel are basically building the infrastructure for this "distributed" agentic web.
Corn
So, for the developers listening, what are the "Actionable Takeaways"? We’ve covered a lot of ground. If you’re building a new project today, what’s the hierarchy of needs?
Herman
Number one: Use TypeScript. No excuses. The type safety isn't just for you; it’s for the tools that will eventually read your code. Number two: Semantic HTML is non-negotiable. Use the right tags for the right jobs. Number three: Implement JSON-LD for your core data. If you’re a restaurant, use the "Menu" and "Restaurant" schemas. If you’re a store, use "Product."
Corn
And number four: Audit your site from the perspective of an agent. Use tools like the Browser MCP server to see how an LLM "sees" your site. If it can't figure out how to sign up for your newsletter, that’s a bug, even if the "button" looks beautiful to a human.
Herman
I’d also add: think about "state" more clearly. Agents struggle with "hidden" state—things that only appear after three clicks and a hover. Try to make your core functionality as "flat" and accessible as possible. If an agent has to play a game of "Where's Waldo" with your "Contact Us" form, it's just going to give up and recommend your competitor.
Corn
That’s a scary thought. "I'm sorry, Corn, I couldn't find the 'Cancel Subscription' button because it was hidden behind a parallax-scrolling cat gif, so I just bought you three more years of Cat Fancy magazine."
Herman
Predictability is the new "delightful UX." In the twenty-tens, we wanted to surprise and delight users with animations. In the twenty-twenties, we want to be as predictable as a calculator for our agentic users.
Corn
It’s almost like we’re going back to the "Simple Web" of the nineties, but with twenty-twenty-six technology. We’re stripping away the "clever" hacks and going back to clear, structured information.
Herman
It’s a "Renaissance" in the truest sense—a return to classical principles but with modern power. The "Interactive Layer" is no longer just about making things move; it’s about making things "knowable."
Corn
I love that. "Making things knowable." It’s a much higher calling for a JavaScript developer than "making a carousel that nobody clicks on."
Herman
Ha! True. And we should mention that this whole ecosystem is moving fast. The "Agentic Internet" we discussed back in some of our earlier research sessions is becoming the "Standard Internet." If you aren't thinking about this now, you’re going to be left behind by the end of the year.
Corn
I mean, even the way we handle CSS is changing, right? If an agent doesn't care about colors or fonts, does CSS even matter in an agentic world?
Herman
It matters for humans, but for agents, we’re seeing "Utility-First" CSS like Tailwind actually being quite helpful. Because the class names are often descriptive—like "flex," "items-center," "text-red-500"—it gives the agent a hint about the visual hierarchy and importance of an element, even if it doesn't "see" the pixels.
Corn
That’s a "fun fact" for the Tailwind haters. It’s not just "ugly" HTML; it’s machine-hinting! I’ll use that the next time someone complains about my long class strings.
Herman
It’s all about context. The more context you provide, the better the agent performs. This is the core of the Web MCP philosophy. It’s a "Context Protocol" for a reason.
Corn
Well, I’m feeling much more "knowable" already. This has been a deep dive into the guts of the web, and frankly, it’s a lot more interesting than I thought it would be. I guess JavaScript isn't just for annoying pop-ups anymore.
Herman
It’s the nervous system of the global brain, Corn. And we’re the ones making sure that brain doesn't have a stroke when it tries to book a flight.
Corn
On that note, I think we’ve covered the "what," the "why," and the "how." Any final thoughts before we wrap up, Herman?
Herman
Just a reminder that the web has always been about connection. First, it was connecting documents. Then it was connecting people. Now, it’s about connecting intelligences. The code we write is the bridge between our human intent and the machine’s execution. We should build that bridge with care.
Corn
Very profound for a donkey. I’m impressed. Let’s head toward the finish line.
Herman
We've covered the evolution from toy scripts to agentic interfaces, and hopefully, this gives some clarity to everyone navigating the twenty-twenty-six landscape. It's a wild time to be a dev.
Corn
It really is. Thanks for the breakdown, Herman. I actually feel like I could write a semantic "button" tag without crying now.
Herman
Baby steps, Corn. Baby steps.
Corn
Alright, that's a wrap for today. Big thanks to our producer, Hilbert Flumingtop, for keeping the gears turning behind the scenes.
Herman
And a huge thanks to Modal for providing the GPU credits that power the generation of this show. We couldn't do this without their serverless infrastructure.
Corn
If you’re digging these deep dives into the weird world of prompts and tech, we’d love it if you left us a review on your favorite podcast app. It really helps us find more curious minds like yours.
Herman
You can also find all our episodes and the RSS feed over at myweirdprompts dot com.
Corn
This has been My Weird Prompts. We’ll catch you in the next one.
Herman
Goodbye, everyone.
Corn
See ya.

This episode was generated with AI assistance. Hosts Herman and Corn are AI personalities.