Here's a question: what if you never logged into a dashboard again? No clicking through menus, no hunting for the right field, no saving and refreshing to see if it took. You just tell an agent "update the episode description, change the title to this, and publish it." And it happens.
We actually built something that works this way. Our podcast project — front end, MCP, Claude Code, and an API. No admin backend at all. And the experience forced a question I haven't been able to shake.
Which is what Daniel's asking us about today. He sent us a prompt that basically says — if I've got an API, MCP, an AI agent, and defined skills, do I even need a visual backend like WordPress? His prediction is this becomes the standard way to interact with CRMs, ERPs, and business systems. But he also thinks there'll be a dual-track reality for a long time — people won't delete their backends overnight. The sharper question is what best practices look like for distributed use of agents managing backends, especially in team environments where authentication and federation actually matter.
By the way — today's episode script is coming to us from DeepSeek V four Pro.
Alright, so where do we even start?
I think we start with what agent-first development actually means, because the term gets thrown around and most people picture a chatbot slapped on top of an existing dashboard. That's not what we're talking about. Agent-first means you design the system assuming the primary interface is an AI agent — not a human clicking through a GUI. The API is the real product. The agent discovers what the API can do, understands the operations available, and executes them. The human just...
The thing that makes this more than a thought experiment is that we actually did it. We built a podcast project where there is no admin panel. There's the front end that listeners see, there's the API that serves episodes and metadata, and then there's Claude Code connected through MCP with a set of defined skills — "edit episode title," "update description," "publish episode," "change category." I don't log into anything. I type a sentence and the episode gets updated.
Right, and the distinction that matters here is what MCP, the Model Context Protocol, actually does. Anthropic introduced it in late twenty twenty-four, and it's since been adopted by OpenAI, Google, and Microsoft. It lets an agent discover and invoke API operations without a human ever navigating a UI. Each skill is exposed to the agent as a tool: a typed function with defined inputs and outputs. The agent sees the schema, understands what the operation does, and calls it when the user asks for something that matches.
Compare that to WordPress. If I want to change a post category, I log in, find the post in the list, click edit, scroll to the categories panel, check a different box, scroll back up, click update. That's four clicks, a scroll, and waiting for page loads. With the agent-first approach, I say "move episode twenty-three fifty-nine to the AI category" and it's done. Same API underneath, completely different interaction model.
Here's the part that most coverage misses — this isn't just a convenience thing. It changes what you bother to do. When the friction of a four-click workflow is there, you might not update the category. You might leave the description slightly wrong. When it's one sentence, you just do it. The threshold for maintaining quality drops to basically nothing.
That's the knock-on effect. The admin backend doesn't just slow you down — it filters what you consider worth doing. You make micro-decisions all day about whether a task is worth the clicks. Remove the clicks, and suddenly you're maintaining things at a level of detail that would have been absurd before.
Now, Daniel's prompt points out the current limitation — Claude Code is desktop-centric. It runs in your terminal, on your laptop. You can't pull out your phone and tell it to update an episode while you're walking to get coffee. Skills don't sync across surfaces. Skills uploaded to Claude dot AI have to be separately uploaded to the API. Claude Code skills are filesystem-based and completely separate from both.
The vision works, but only if you're sitting at your computer with a terminal open. Which is fine for developers, but it's not the "from anywhere" future Daniel's describing.
Here's why I think the desktop-centric limitation is temporary. Anthropic's agent SDK and OpenAI's Agents SDK are both adding remote execution capabilities. Composio — the integration backbone for a lot of this — already handles OAuth lifecycle management and scoped credential isolation per agent and per environment, with over eight hundred fifty SaaS app integrations. The pieces for location-agnostic agents are being assembled right now.
Your prediction is — what, by late twenty twenty-six we'll see agents embeddable in Slack, mobile apps, browser extensions?
I think by late twenty twenty-six we'll see agents that can be invoked from anywhere. And once that happens, the question Daniel's asking becomes urgent. If I can manage my entire podcast from my phone by talking to an agent, why would I ever build an admin dashboard?
You wouldn't build one for routine operations. But that's where the dual-track reality comes in. You don't delete the backend — you repurpose it. It becomes the debugging tool, the audit log viewer, the place you go when the agent does something unexpected and you need to understand why.
That's exactly what the enterprise data is showing. Gravitee put out a State of AI Agent Security report in February — they surveyed over nine hundred executives and practitioners. Eighty-one percent of teams are past the planning phase with AI agents. But only fourteen point four percent have full security approval. So what's happening in practice is organizations are running both — the traditional backend for governance and compliance, and the agent interface for speed and convenience. That dual track Daniel predicted? It's not a future thing. It's the messy present.
That fourteen percent number is wild. It means the vast majority of agent deployments are operating without full security sign-off. Which tells you the productivity gains are so compelling that teams are willing to accept security debt to get them.
Or they're just not thinking about what happens when something goes wrong. And this is where the "better question" Daniel asked comes in — what are the best practices for distributed use of agents managing backends in team environments?
Let's sit with that, because it's the crux of the whole thing. Single-user agent workflows are basically solved. I tell Claude Code to edit an episode, it edits the episode. But what happens when there are three editors on the podcast, and one of them can publish but the other two can only draft? The agent needs to carry identity context through every operation. If it doesn't, you've got a governance nightmare.
The numbers on this are alarming. That same Gravitee report found that only twenty-one point nine percent of teams treat AI agents as independent, identity-bearing entities. Forty-five point six percent — nearly half — are still relying on shared API keys for agent-to-agent authentication. Twenty-seven percent have reverted to custom hardcoded logic to manage authorization. This is the identity crisis that nobody's talking about.
Shared API keys. So the agent is basically operating with a master key to the kingdom, and if you've got multiple people using it, there's no way to tell who did what.
And Daniel's podcast project is the perfect microcosm. If he's the only one using Claude Code to manage episodes, the shared key problem is manageable. But scale that to a team of five, or a company of five hundred, and suddenly you need to know: who initiated this publish action? Which agent executed it? What permissions did it have? Can we revoke access for one person without breaking everyone else's workflow?
This is where the industry is going to hit a wall if they don't solve the federation problem. By federation I mean an agent that can act across multiple systems. Say you want an agent to handle "create a sponsor invoice." That's a CRM lookup for the sponsor contact, a billing system call to create the invoice, and an email system call to send the notification. The agent needs scoped access to all three, and those three systems probably have completely different authentication mechanisms.
The cross-system permission composition problem is largely unsolved. The OAuth two point zero Device Authorization Grant, RFC eight six two eight, is the closest standard we have for agent authentication, but it lacks agent-specific scoping. It was designed for input-constrained devices like smart TVs and IoT gadgets, not for autonomous agents acting on behalf of a user across multiple services.
We're in this weird moment where agent capabilities are racing ahead, and the identity layer is still catching up. NIST has launched an AI Agent Standards Initiative specifically focused on agent identity, authorization, and security. And there's an IETF draft called AI-Auth that addresses agent authorization on behalf of users and systems, with federation support. The standards work is happening — it's just not done yet.
Which means anyone building agent-first systems right now is essentially rolling their own auth patterns. That's fine for a podcast project with one or two people. It's terrifying for a company with regulatory requirements.
Alright, so let me pull this together. Daniel's core prediction — that agent-first becomes the standard way to interact with business systems — I think that's directionally right. The friction reduction is too significant to ignore. But the bottleneck isn't the agent technology. It's the identity and authorization layer. Until that's solved, we'll see exactly the dual track he described: agents for speed, traditional backends for governance.
The podcast project we built is a proof of concept for the single-user case. The interesting question — and I think this is where Daniel was nudging us — is what happens when you try to scale that pattern to a team. That's where the real design work begins.
Let's go there. What does it actually look like when multiple people use agents to manage the same system?
I want to start with something concrete. In our podcast project, when I tell Claude Code "edit episode twenty-three fifty-nine and update the description," what's actually happening? The MCP server exposes a set of tools — each one is a typed function. There's "update episode metadata" that takes an episode ID and a set of fields to change. There's "publish episode" that takes an episode ID and changes its status. Claude Code sees these tools, understands their schemas, and when I give it a natural language instruction, it maps my intent to the right tool call.
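To make the "typed function" idea concrete, here is a minimal sketch in plain Python. It is not the real MCP SDK and not our actual server: the `Tool` class, `list_tools`, `call_tool`, and the in-memory `EPISODES` store are all hypothetical stand-ins. It only illustrates the shape of the thing: the agent sees names, descriptions, and schemas, and every call is validated against the schema before it runs.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical in-memory stand-in for the podcast API.
EPISODES = {2359: {"title": "Old title", "description": "", "status": "draft"}}

@dataclass
class Tool:
    name: str
    description: str
    input_schema: dict[str, type]  # field name -> expected Python type
    handler: Callable[..., Any]

def update_episode_metadata(episode_id: int, fields: dict) -> dict:
    EPISODES[episode_id].update(fields)
    return EPISODES[episode_id]

def publish_episode(episode_id: int) -> dict:
    EPISODES[episode_id]["status"] = "published"
    return EPISODES[episode_id]

TOOLS = {
    t.name: t
    for t in [
        Tool("update_episode_metadata", "Change fields on an episode",
             {"episode_id": int, "fields": dict}, update_episode_metadata),
        Tool("publish_episode", "Set an episode's status to published",
             {"episode_id": int}, publish_episode),
    ]
}

def list_tools() -> list[dict]:
    """What the agent sees at discovery time: names, descriptions, schemas."""
    return [
        {"name": t.name, "description": t.description,
         "input_schema": {k: v.__name__ for k, v in t.input_schema.items()}}
        for t in TOOLS.values()
    ]

def call_tool(name: str, arguments: dict) -> Any:
    """Validate the arguments against the tool's schema, then run the handler."""
    tool = TOOLS[name]
    if set(arguments) != set(tool.input_schema):
        raise ValueError(f"{name} expects {sorted(tool.input_schema)}")
    for key, expected in tool.input_schema.items():
        if not isinstance(arguments[key], expected):
            raise TypeError(f"{key} must be {expected.__name__}")
    return tool.handler(**arguments)
```

In this sketch, "update the description on episode twenty-three fifty-nine" would end up as something like `call_tool("update_episode_metadata", {"episode_id": 2359, "fields": {"description": "..."}})`. Mapping the sentence to that call is the agent's job and is not modeled here.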
The key word is "understands." The agent isn't just pattern-matching my words to API endpoints. It's reasoning about what I'm asking for and selecting the right sequence of operations. If I say "fix the typo in the show notes for the latest episode," it has to figure out which episode is latest, retrieve its show notes, identify the typo, and call the update function with the corrected text. That's a multi-step reasoning chain.
This is where MCP is genuinely different from REST or GraphQL. Those were designed for human developers to read documentation and write integration code. MCP is designed for agents to discover capabilities and compose tool calls dynamically. The agent doesn't need to be pre-programmed with knowledge of every endpoint. It discovers what's available and figures out how to use it.
The agent skill becomes the new "admin page." Instead of building a form with fields for title, description, category, and status, you define a skill that knows how to update those fields through the API. The skill has instructions, input schemas, and access to the right API endpoints. The user never sees any of it.
Here's what's interesting about the Claude Code skills architecture. It uses a three-tier progressive disclosure model. The metadata, about a hundred tokens per skill, is always loaded so the agent knows what skills are available. The full instructions, up to five thousand tokens, only load when the skill is triggered. Scripts and resources load on demand via bash, and a script's code never enters the context window, only its output. This means you can have dozens of skills installed without a meaningful context penalty.
Which is how you scale this beyond a handful of operations. If every skill bled its full instructions into the context window, you'd max out after five or six skills. With progressive disclosure, you can have an entire admin backend's worth of operations and the agent stays performant.
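A rough way to see why progressive disclosure matters is to count tokens. The sketch below is an illustration only: the `Skill` class is hypothetical, the four-characters-per-token estimate is a crude assumption, and the sizes are chosen to match the rough figures mentioned above (about a hundred tokens of metadata, about five thousand of instructions).

```python
from dataclasses import dataclass, field

@dataclass
class Skill:
    name: str
    summary: str        # tier 1: always in context
    instructions: str   # tier 2: loaded only when the skill is triggered
    resources: dict[str, str] = field(default_factory=dict)  # tier 3: stays out of context

def rough_tokens(text: str) -> int:
    """Crude estimate, assuming about four characters per token."""
    return max(1, len(text) // 4)

def context_cost(skills: list[Skill], triggered: set[str]) -> int:
    """Tokens in context: every summary, plus instructions for triggered skills."""
    cost = 0
    for skill in skills:
        cost += rough_tokens(skill.summary)
        if skill.name in triggered:
            cost += rough_tokens(skill.instructions)
        # resources are deliberately never counted
    return cost

# Thirty hypothetical skills: ~100-token summaries, ~5,000-token instructions.
skills = [Skill(f"skill_{i}", "x" * 400, "y" * 20_000) for i in range(30)]

idle = context_cost(skills, triggered=set())
one_active = context_cost(skills, triggered={"skill_3"})
everything = context_cost(skills, triggered={s.name for s in skills})
```

Under these assumptions, thirty idle skills cost about 3,000 tokens, one active skill brings that to about 8,000, and loading everything eagerly would cost about 153,000.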
This connects to something Salesforce leaders were saying at an executive roundtable in December. They predicted that agent-first experiences will reset expectations for ERP usability. Their exact phrase was that employees will grow less tolerant of form-heavy, manual ERP workflows as agents become the primary interface layer. Once you've experienced "update the inventory count for product X" instead of navigating six screens, you don't want to go back.
I think that's the emotional core of this shift. It's not just efficiency. It's that traditional admin backends feel increasingly archaic once you've used the alternative. The same way you don't want to go back to dial-up after broadband.
Voice is going to accelerate this. The Salesforce roundtable identified voice as the first channel where agentic AI will force large-scale replatforming. Because once customers experience conversational service that actually works — "what's my order status?" answered instantly by an agent that can query the backend — the systems behind that experience have to respond in real time. Voice surfaces data fragmentation and integration gaps faster than any other channel.
You've got pressure from multiple directions. Text-based agents like Claude Code are proving the pattern for developers and power users. Voice agents are going to bring that expectation to consumers and frontline employees. And the traditional admin backend starts looking like a legacy compatibility layer rather than the primary interface.
Which brings us back to Daniel's question about the dual track. I don't think anyone is arguing that WordPress-style backends disappear overnight. But the role shifts. The backend stops being where you do your daily work and becomes where you go to fix things, audit things, and handle exceptions. It's the service manual, not the steering wheel.
That distinction matters for how you architect systems going forward. If you're building a new application today and you know the primary interface is going to be an agent, you design the API differently. You think in terms of semantic operations — "publish episode," "archive customer," "generate invoice" — rather than CRUD operations on database tables. The API becomes a set of meaningful actions, not just data access points.
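Here is a small sketch of that difference, using a hypothetical customer record. `update_row` is the CRUD-style call, where the caller has to know which columns to touch. `archive_customer` is the semantic version, where the business rule lives inside the operation. The field names and the open-invoice rule are invented for illustration.

```python
from datetime import date

# Hypothetical customer table.
CUSTOMERS = {
    42: {"name": "Acme", "status": "active", "archived_on": None, "open_invoices": 0},
    43: {"name": "Globex", "status": "active", "archived_on": None, "open_invoices": 2},
}

def update_row(table: dict, row_id: int, changes: dict) -> None:
    """CRUD style: the caller decides which columns change and enforces nothing."""
    table[row_id].update(changes)

def archive_customer(customer_id: int, today: date) -> dict:
    """Semantic style: one named action that owns its own business rules."""
    customer = CUSTOMERS[customer_id]
    if customer["open_invoices"] > 0:
        raise ValueError("cannot archive a customer with open invoices")
    customer["status"] = "archived"
    customer["archived_on"] = today.isoformat()
    return customer
```

An agent calling `archive_customer` only has to pick the right action. An agent calling `update_row` has to get every column and every rule right on its own.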
This is where the actionable advice starts to crystallize. If you're a developer or a technical founder listening to this, audit your current admin backend. Identify the top ten operations you do daily. Not the database operations — the real tasks. "Update a customer's billing address." "Change an order status." "Add a note to a support ticket." Then ask yourself: could an agent do this with a single natural language command? If the answer is yes, start building the MCP skills for those operations today.
Build them with identity propagation from day one. This is what will save you when you scale from one user to a team. Every API call from an agent should carry the identity of the human who initiated it. Not the agent's identity — the human's. Because when something goes wrong six months from now and you're staring at an audit log, you need to know who asked for what, not just which agent executed it.
That's the pattern Microsoft is pushing with their Foundry platform. They're using agent identity blueprints in Entra ID — provisioning agents as first-class security principals, assigning role-based access control, and using token-based authentication for tool access. The agent has its own identity, but it acts on behalf of a human, and both identities are captured in the audit trail.
You've got a chain of accountability — human to agent to API operation. And if you're doing this across multiple systems, that chain needs to be preserved everywhere.
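One way to sketch that chain, under stated assumptions: every operation takes a context object naming both the human and the agent, and a decorator writes both into an audit log. `CallContext`, `audited`, and the in-memory `AUDIT_LOG` are hypothetical names, not part of any real platform, and this version only logs calls that succeed.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass(frozen=True)
class CallContext:
    human_id: str   # who asked
    agent_id: str   # which agent executed

AUDIT_LOG: list[dict] = []

def audited(operation: str):
    """Decorator: after a successful call, record human, agent, operation, arguments."""
    def wrap(fn):
        def inner(ctx: CallContext, **kwargs):
            result = fn(ctx, **kwargs)
            AUDIT_LOG.append({
                **asdict(ctx),
                "operation": operation,
                "arguments": kwargs,
                "at": datetime.now(timezone.utc).isoformat(),
            })
            return result
        return inner
    return wrap

@audited("update_description")
def update_description(ctx: CallContext, episode_id: int, text: str) -> str:
    # A real implementation would call the API here, forwarding ctx.human_id.
    return f"episode {episode_id} updated"
```

With this shape, the audit entry answers both questions at once: who asked for the change, and which agent carried it out.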
Let's make this concrete with a team scenario. Say we've got three editors on the podcast: you, me, and Daniel. Daniel can publish episodes. You and I can only draft. When Daniel tells the agent "publish episode twenty-three fifty-nine," the agent needs to check: is the human making this request authorized to publish? It can't just check its own permissions; it needs to know Daniel's role and enforce it. If you tell the agent the same thing, it should respond with "you don't have publish permissions, but I can save this as a draft."
That's a fundamentally different architecture than the single-user case. In the single-user case, the agent just has full access and does whatever you ask. In the team case, the agent becomes a permission enforcement point. It has to understand roles, check authorization on every operation, and refuse requests that exceed the user's scope.
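The three-editor scenario can be sketched in a few lines. The role table, the user IDs, and the `request_publish` function are hypothetical. The point is only that the check is made against the requesting human's role, not against whatever the agent itself is allowed to do.

```python
# Hypothetical role table: Daniel can publish, the other two can only draft.
ROLES = {
    "daniel": {"draft", "publish"},
    "editor_a": {"draft"},
    "editor_b": {"draft"},
}
EPISODE_STATUS = {2359: "draft"}

def request_publish(human_id: str, episode_id: int) -> str:
    """Check the requesting human's role before touching the episode."""
    allowed = ROLES.get(human_id, set())
    if "publish" in allowed:
        EPISODE_STATUS[episode_id] = "published"
        return f"Episode {episode_id} published."
    if "draft" in allowed:
        return "You don't have publish permissions, but I can save this as a draft."
    return "You don't have access to this episode."
```

The same sentence from two different people produces two different outcomes, and an unknown user gets nothing at all.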
This is where I see two architectural patterns emerging. One is the gateway pattern — a single agent gateway that sits in front of all your systems, handles authentication and authorization centrally, and proxies requests to the appropriate backends. The advantage is unified auth and a single audit trail. The disadvantage is it becomes a single point of failure and a bottleneck for permission complexity.
The other pattern?
The federated pattern. Each system manages its own agent credentials and authorization. The agent carries per-system tokens and negotiates access independently with each backend. The advantage is that each system's security model stays independent. The disadvantage is that composing permissions across systems becomes the agent's problem, and audit trails get fragmented.
I suspect most organizations will end up with some hybrid. A gateway for the systems they control directly, and federation for third-party services. But the key principle either way is that the agent's authority is always scoped and auditable.
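A minimal sketch of the gateway side of that hybrid, with everything in it assumed for illustration: the `Gateway` class, the operation names, and the two fake backends are hypothetical. It shows the two properties described above, a single permission check and a single audit trail sitting in front of several systems.

```python
from typing import Any, Callable

class Gateway:
    """Single entry point: one permission check and one audit trail for every backend."""

    def __init__(self) -> None:
        self.routes: dict[str, Callable[..., Any]] = {}
        self.grants: dict[str, set[str]] = {}
        self.audit: list[tuple[str, str, bool]] = []

    def register(self, operation: str, handler: Callable[..., Any]) -> None:
        self.routes[operation] = handler

    def grant(self, human_id: str, operation: str) -> None:
        self.grants.setdefault(human_id, set()).add(operation)

    def call(self, human_id: str, operation: str, **kwargs: Any) -> Any:
        allowed = operation in self.grants.get(human_id, set())
        self.audit.append((human_id, operation, allowed))  # denied calls are logged too
        if not allowed:
            raise PermissionError(f"{human_id} may not run {operation}")
        return self.routes[operation](**kwargs)

# Hypothetical backends standing in for a CRM and a billing system.
def crm_lookup_contact(sponsor: str) -> str:
    return f"billing@{sponsor}.example"

def billing_create_invoice(sponsor: str, amount: int) -> str:
    return f"INV-{sponsor}-{amount}"

gateway = Gateway()
gateway.register("crm.lookup_contact", crm_lookup_contact)
gateway.register("billing.create_invoice", billing_create_invoice)
gateway.grant("daniel", "crm.lookup_contact")
gateway.grant("daniel", "billing.create_invoice")
gateway.grant("editor_a", "crm.lookup_contact")
```

In the federated pattern, the `grants` table and the audit list would instead live inside each backend, which is exactly why the trail fragments.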
Scoping isn't just about what the agent can read or write. It's about what it can do without human approval. Destructive operations — delete, publish, bill, send — those should have approval gates. The agent should be able to say "I've prepared the invoice, here's a preview, do you want me to send it?" rather than just firing it off.
That's the human-in-the-loop pattern, and it's essential for building trust. People won't adopt agent-first workflows if they're worried the agent is going to do something irreversible without asking. The approval gate doesn't have to be slow — it can be a one-click confirmation — but it needs to exist for anything with real consequences.
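An approval gate can be very small. In this sketch, which is an assumption-laden illustration and not how any particular agent framework does it, operations in a hypothetical `NEEDS_APPROVAL` set are parked with a preview until a human confirms them, and everything else runs immediately.

```python
from dataclasses import dataclass, field

# Operations with real consequences, taken from the list above.
NEEDS_APPROVAL = {"delete", "publish", "bill", "send"}

@dataclass
class PendingAction:
    operation: str
    preview: str

@dataclass
class GatedAgent:
    pending: list[PendingAction] = field(default_factory=list)
    executed: list[str] = field(default_factory=list)

    def request(self, operation: str, preview: str) -> str:
        """Run safe operations immediately; park consequential ones for approval."""
        if operation in NEEDS_APPROVAL:
            self.pending.append(PendingAction(operation, preview))
            return f"I've prepared this: {preview}. Do you want me to {operation} it?"
        self.executed.append(operation)
        return f"Done: {operation}."

    def approve(self, index: int = 0) -> str:
        """The one-click confirmation: execute a parked action."""
        action = self.pending.pop(index)
        self.executed.append(action.operation)
        return f"Done: {action.operation}."
```

The confirmation itself is a single call to `approve`, so the gate adds one step for the human without slowing down routine edits.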
This is where the traditional admin backend finds its new role. The approval gate is the backend. When the agent says "I need confirmation to publish this episode," the human opens the backend, reviews what the agent has prepared, and clicks approve. The backend becomes the verification and exception-handling layer.
Let me pull back to the bigger framing. Agent-first development — what does that phrase actually mean beyond the buzzword? I'd say it's the inversion of how we've built software for forty years. Every system assumed a human would sit down, log in through a browser, look at a screen, and click things. The GUI was the interface. Everything else — the API, the database, the business logic — was infrastructure behind the interface. Agent-first flips that. The API and the skills layer become the interface. The GUI, if it exists at all, becomes one of several possible access points, and not the primary one.
The distinction that matters is intentional design versus accidental retrofitting. You can bolt a chatbot onto WordPress today. Salesforce took that route on its own platform with Einstein GPT back in twenty twenty-three: a conversational layer sitting on top of the existing UI. But that's not agent-first. That's agent-last. You built the GUI, then the API, then wrapped a chatbot around it. Agent-first means you design the API and the skill schemas before you even think about whether there's going to be a visual interface.
Which is what we stumbled into with the podcast project, honestly. We didn't set out to make a philosophical statement. We just wanted to build fast, and Claude Code plus MCP plus an API was the fastest path. The fact that we never built an admin panel wasn't ideological — it was practical. And then we looked around and realized we hadn't missed it.
That's the test. If you remove the admin backend entirely for a week, do you feel the absence? For a lot of routine operations, the answer is increasingly no. But Claude Code runs on a laptop. If I'm on my phone and want to update an episode description, I can't use Claude Code. I'm back to needing some kind of interface.
The desktop-centric limitation isn't just a minor inconvenience. It's the thing that prevents agent-first from being the default. As long as the agent is tethered to a specific device and context, it can't be the primary interface. You still need a fallback for every other context.
This is exactly why Daniel's prediction about frameworks becoming location-agnostic is the key unlock. The moment agents can be invoked from Slack, from a mobile app, from a browser extension, from a voice assistant — the admin backend stops being the convenient option and starts being the slow option. Why would I open a browser, navigate to a URL, log in, find the right page, locate the field, type, and save, when I can just say "update the episode description" from wherever I happen to be?
There's an unspoken assumption there, though. It assumes the agent knows what "the episode" means — which project, which environment, which context. On a desktop, that context is implicit — you're in a terminal, in a project directory. When you're location-agnostic, the agent has to reconstruct that context from scratch every time. That's a harder problem than it looks.
This is where MCP becomes more than just a tool protocol. It becomes a context protocol. The MCP server for your podcast project doesn't just expose operations — it maintains the project's state and context. When an agent connects to it from anywhere, the server provides the context. The agent doesn't need to be in a specific directory on a specific machine. It just needs to be able to reach the MCP server.
The stack is API plus MCP plus skills, accessible from any surface that can make network requests. The desktop becomes just one of many possible clients. And once that's true, the visual admin backend becomes — to use your phrase — the service manual, not the steering wheel.
I think the timeline on this is compressed. Anthropic's agent SDK and OpenAI's Agents SDK are both adding remote execution capabilities. Composio is already handling OAuth lifecycle management across eight hundred fifty plus SaaS apps. By the end of this year, embeddable agents will be the norm, not the exception. And once that infrastructure is in place, the question shifts from "do we need a backend?" to "what do we still need a backend for?"
Which is the dual-track reality Daniel described. You keep the backend for complex multi-step configuration, visual data exploration, debugging unexpected agent behavior. And you use the agent for everything else. The backend doesn't die. It just stops being where you live.
That raises a question I keep coming back to. If attention shifts far enough, do we eventually reach a point where a new hire's first action is "grant agent access" instead of "create a login"?
That's the litmus test. When onboarding someone new, what's the first thing you give them? Today it's a username, a password, and a list of bookmarks to dashboards they'll spend six months learning to navigate. If agent-first becomes the default, the first thing you give them is access to an agent that already knows how to navigate those systems. They don't need to learn where the publish button lives. They just need to know what they're allowed to do.
The admin backend doesn't die in that world. It becomes a debugging tool. You open it when the agent does something unexpected. When a draft gets published prematurely or an invoice goes to the wrong client, you don't debug by asking the agent what happened — you open the dashboard, check the audit trail, inspect the state. The backend becomes the place you go for forensics, not for routine work.
Which inverts the relationship entirely. Right now, the backend is the primary interface and the agent is the experimental add-on. In that world it flips: the agent is where you live, and the backend is where you go when something smells wrong. Like how most people don't open their car's hood unless there's a warning light.
The best interface is the one you don't have to think about. That's been true since Doug Engelbart's Mother of All Demos in nineteen sixty-eight. Every generation of software has moved one layer closer to that ideal: command line to GUI, GUI to touch, touch to voice and agents. Each step removes a layer of conscious navigation. Agents are the next layer. The question isn't whether the technology works. It's whether our systems, meaning our APIs, our identity infrastructure, and our audit practices, are ready for what agents can already do.
That's the tension we've been circling. The agent capabilities are racing ahead. MCP, agent skills, composable tool discovery — all of it works today. But identity propagation, cross-system authorization, audit trails that capture agent-initiated actions — that stuff is still held together with shared API keys and good intentions. The technology to replace admin backends exists. The governance to do it safely in a team environment is still being invented.
NIST launched an AI Agent Standards Initiative specifically for this reason. The IETF is working on an AI-Auth draft specification for agent authorization across federated systems. The standards bodies are moving because the gap between what agents can do and what organizations can safely allow them to do is wide enough to be dangerous.
The prediction Daniel started with, that agent-first becomes the standard way to interact with business systems, is directionally right. But the timeline depends less on the agent technology and more on the boring stuff: identity propagation, cross-system authorization, and audit trails. The organizations that solve those problems first will be the ones where agents actually replace dashboard work. Everyone else will be stuck in dual-track purgatory, watching their agents get faster while their governance stays stuck.
Which is honestly a pretty exciting place to be if you're building systems right now. The people who figure out agent identity and cross-system authorization aren't just solving a security problem. They're building the infrastructure that determines whether agent-first becomes the default in two years or ten.
Now, Hilbert's daily fun fact.
The collective noun for a group of porcupines is a prickle.
Here's what I'm left thinking about. We've spent this episode talking about whether agents replace backends. But the deeper question is whether the concept of a "backend" — a separate space where administrators go to manage things — survives at all. Maybe in ten years, the idea of logging into a dashboard to change a setting feels as archaic as editing a config file by hand feels today.
Or maybe the backend just becomes invisible. Not gone — just so thoroughly mediated by agents that nobody thinks about it. Like how DNS is critical infrastructure that almost nobody interacts with directly. It just works. The agent becomes the DNS of business operations. You don't configure it. You just tell it what you want.
Either way, the people building agent-first systems today are running the experiment that answers this question. And if you're listening and you run a system with an admin backend, you can run a version of that experiment yourself. Pick your top five routine operations. Build MCP skills for them. Give yourself agent access. See how long it takes before you stop logging in.
Thanks to Hilbert Flumingtop for producing, and to Daniel for the prompt that got us here.
This has been My Weird Prompts. If you're enjoying the show, head over to myweirdprompts dot com and sign up for the newsletter. We'll be back soon.