Daniel sent us this one, and it's basically a continuation of a running debate we've been having about MCP versus agent skills. His core question is about this weird tension he's observed — sometimes an agent using a plain command-line tool like GH actually works better than an MCP server that wraps the exact same functionality. And he had this experience with Google Workspace where the official MCP didn't support email attachments, so he ended up building his own. His bigger question is whether we're moving toward vendor consolidation — where Google or GitHub or whoever provides the one authoritative MCP — and what might accelerate that. By the way, today's episode is powered by DeepSeek V four Pro.
Oh nice, fresh engine under the hood. So let me jump in here, because Daniel's observation about GH versus an MCP server is not just anecdotal weirdness. There's actually something real happening under the hood. When you tell an agent to use GH, you're tapping into a command-line interface that has been battle-tested by humans for years. The error messages are designed for human readability. The documentation is written for human cognition. And the models have seen thousands of examples of GH usage in their training data. So the agent is operating in this incredibly familiar territory. It's like asking someone to drive their own car versus a rental with the controls reversed. Even if both cars technically work, the familiarity matters enormously.
Right, but that's also a little embarrassing for MCP as a protocol, isn't it? The whole pitch is that MCP gives agents a structured, purpose-built interface. You'd think that would outperform a command-line paradigm designed for human operators decades ago. And yet here we are, where the agent fumbles with the nice clean MCP tool but nails the CLI. That's a problem worth sitting with.
It is, and I think the problem isn't really with MCP as a protocol. It's with the quality and completeness of the implementations. Let me give you a concrete example. When you use GH, the CLI has been maintained by GitHub's own engineering team for years. Every flag, every subcommand, every edge case has been tested across millions of real-world invocations. Now compare that to a community-built GitHub MCP server. It was probably written in an afternoon by someone who needed three specific operations. They implemented those three operations well enough, but the error handling is thin, the edge cases are unhandled, and the documentation the agent actually needs to reason about the tool is often sparse or auto-generated. So the agent struggles not because MCP is worse than CLI, but because this particular MCP server is worse than this particular CLI.
That's fair, but it also undercuts the argument that MCP is inherently the right abstraction. If the quality of implementation matters that much, then we're really comparing implementations, not protocols. And Daniel's point about building his own Google MCP gets at something deeper. He didn't want to build it. He built it because the official one was incomplete. And he's a sophisticated user. Most people aren't going to build their own. They're just going to conclude that AI agents can't reliably handle email with attachments.
This is where I think Daniel's experience with Google Workspace is actually a perfect case study in why vendor-run MCPs matter, and also why they're so hard to get right. The Google Workspace API surface is enormous. We're talking about Gmail, Calendar, Drive, Contacts, and then all the sub-operations within each of those. If you were to build a truly comprehensive MCP server that wrapped every available API endpoint, you'd have hundreds of tools. And here's the thing about MCP tool selection — right now, when you connect an MCP server to Claude or any MCP-compatible client, all the tools come along for the ride. There's no built-in mechanism for saying I only want the Gmail tools, and within Gmail I only want send, reply, and add attachment. You get the whole firehose.
You're saying the tool selection problem is actually a protocol limitation, not just an implementation gap.
And this is where I think the MCP specification needs to evolve. There's been discussion in the community about namespacing and tool filtering, but as of right now it's not part of the core spec. So if you're Google and you're thinking about building the definitive Workspace MCP, you face this dilemma. You can build a minimal one that covers the most common use cases but frustrates power users like Daniel. Or you can build a comprehensive one that overwhelms the agent's context window and degrades its reasoning on every single call.
Because the agent has to hold all those tool definitions in context, doesn't it? It's not like it can page them in and out dynamically.
Every tool definition consumes tokens. If you have two hundred tools, each with a detailed description and parameter schema, you're burning through thousands of tokens before the agent even starts thinking about the user's actual request. And we know from the research that agent performance degrades as context fills up with irrelevant information. So the comprehensiveness that makes an MCP server valuable also makes it harder to use effectively. It's a genuine tradeoff.
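To make that token cost concrete, here's a rough back-of-envelope sketch. The tool schemas below are placeholders, not Google's actual definitions, and the four-characters-per-token figure is just a common heuristic for English text and JSON, but the order of magnitude is the point.

```python
import json

# Hypothetical tool definitions, in the general shape MCP servers advertise:
# a name, a description, and a JSON Schema describing the parameters.
tools = [
    {
        "name": f"workspace.tool_{i}",
        "description": "Placeholder description of what this tool does, "
                       "its preconditions, and its expected output format.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "target": {"type": "string", "description": "Resource to act on."},
                "options": {"type": "object", "description": "Extra flags."},
            },
            "required": ["target"],
        },
    }
    for i in range(200)
]

# Crude heuristic: roughly 4 characters per token.
chars = len(json.dumps(tools))
approx_tokens = chars // 4
print(f"{len(tools)} tools ≈ {approx_tokens:,} tokens consumed before any work begins")
```

Even with these minimal placeholder schemas, two hundred tools land in the tens of thousands of tokens, which is exactly the context pressure being described here.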
Which brings us back to Daniel's question about vendor consolidation. If the official Google MCP is either too minimal or too bloated, and the community ones are inconsistent in quality, what's the path out of this?
I think the path involves two things happening in parallel. First, the MCP protocol needs to support tool-level selection. And I don't just mean the client filtering tools after the fact. I mean the server advertising a namespace hierarchy, and the client being able to say I want Gmail dot send, Gmail dot reply, Gmail dot add attachment, and nothing else. That way Google can build the comprehensive server, and users can opt in to exactly the tools they need.
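Namespaced selection isn't in the MCP spec today, so here's only a sketch of what the client-side half could look like, with entirely hypothetical tool names. The idea is just prefix matching over dotted names: request exact tools or a whole namespace via a wildcard.

```python
# Hypothetical flat tool list, as a comprehensive Workspace server might expose it.
ALL_TOOLS = [
    "gmail.send", "gmail.reply", "gmail.add_attachment", "gmail.search",
    "calendar.create_event", "calendar.list_events",
    "drive.upload_file", "drive.share",
]

def select_tools(available, patterns):
    """Keep tools matching an exact name or a 'namespace.*' wildcard."""
    selected = []
    for name in available:
        if any(
            name == pat or (pat.endswith(".*") and name.startswith(pat[:-1]))
            for pat in patterns
        ):
            selected.append(name)
    return selected

# Daniel's case: exactly the email tools he needs, nothing else.
print(select_tools(ALL_TOOLS, ["gmail.send", "gmail.reply", "gmail.add_attachment"]))
# Or everything under a single namespace:
print(select_tools(ALL_TOOLS, ["calendar.*"]))
```

A real spec change would also need the server to advertise the namespace hierarchy, but the opt-in mechanics on the client side can be this simple.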
That makes sense. But it also creates a new problem, which is discovery. If I'm a user and I connect the Google MCP, how do I know which of the two hundred tools I actually need? There's a curation burden that shifts to the user. And most users, even technical ones, don't want to spend their afternoon reading API documentation to figure out which Gmail tools are relevant to their workflow.
That's where I think the agent itself can help. We're already seeing patterns where agents can reason about which tools they need for a given task. If the MCP server provides good descriptions and the agent has a mechanism to request specific tools dynamically, you could imagine a workflow where the agent says I need to send an email with an attachment. Let me check if the connected MCP servers have the relevant tools. And then it discovers Gmail dot send and Gmail dot add attachment without the user ever needing to manually curate the tool list.
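As a toy stand-in for that workflow, here's a keyword-overlap ranking over advertised descriptions. A real agent would reason over the descriptions with the model itself rather than matching words, and the catalog entries here are invented, but it shows the shape of the lookup: task in, small relevant tool subset out.

```python
# Hypothetical tool catalog: name -> description, as servers would advertise.
CATALOG = {
    "gmail.send": "Send an email message to one or more recipients.",
    "gmail.add_attachment": "Attach a file to a draft email message.",
    "calendar.create_event": "Create a calendar event with a time and guests.",
    "drive.upload_file": "Upload a file to cloud storage.",
}

def discover(task, catalog, top_k=2):
    """Toy keyword-overlap ranking; a real agent would reason, not match words."""
    task_words = set(task.lower().split())
    scored = sorted(
        catalog,
        key=lambda name: -len(task_words & set(catalog[name].lower().split())),
    )
    return scored[:top_k]

print(discover("send an email with a file attachment", CATALOG))
# → ['gmail.send', 'gmail.add_attachment']
```

The point is that only the two relevant tool definitions would then enter the agent's context, not the whole catalog.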
The agent becomes its own tool selector. That's elegant, but it also introduces latency and potential failure modes. What if the agent picks the wrong tool? What if it hallucinates a tool that doesn't exist?
These are real concerns, and they're not fully solved. But I think they're solvable problems. The key is that the MCP server's tool descriptions need to be precise and unambiguous. When a tool is poorly described, the agent's selection accuracy plummets. And this is where vendor-run MCPs have a massive advantage. Google knows its own API. Google can write tool descriptions that accurately reflect what each endpoint does, what the parameters mean, and what the expected outputs are. A community developer working from reverse-engineered API calls might not have that depth of understanding.
Let me push on that a bit. You're assuming that vendors have both the incentive and the execution capability to build great MCP servers. But Daniel's experience suggests otherwise. Google did build an official MCP, and it didn't support email attachments. That's not a niche feature. That's table stakes for email integration. So either Google didn't think it was important, or they shipped a minimum viable product and haven't invested in filling it out. Either way, the official solution was insufficient.
I think what happened with Google is actually instructive. They shipped early, which is good, but they haven't maintained at the pace the community needs, which is bad. And this gets to a structural problem. MCP servers are not revenue-generating products for most vendors. Google doesn't make money when you use their MCP server. They make money when you use Google Workspace. The MCP server is a cost center, and like most cost centers in large organizations, it's probably under-resourced.
Vendor consolidation sounds great in theory, but in practice the vendors are moving slowly because the business case isn't clear. Meanwhile, community developers are filling the gaps, but with inconsistent quality and security concerns. That's the tension Daniel is pointing at.
It's a real tension. Let me add another layer to this. The security concern Daniel mentioned is not theoretical. When you connect a third-party MCP server to your agent, you're giving that server access to your agent's context and potentially to whatever systems the server wraps. If you're using a community-built Google MCP, you're trusting that developer not to exfiltrate your email contents or your calendar data. For personal use, maybe that's an acceptable risk. For enterprise use, it's a non-starter. The chief information security officer is going to say no to any third-party MCP that hasn't gone through a security review.
Which means enterprises are stuck waiting for vendors to ship official MCPs with proper security guarantees. And if the vendors are slow, the enterprise adoption of agentic workflows stalls. That's a real bottleneck.
And we're seeing this play out right now. The companies that are moving fastest on agent adoption are the ones willing to build their own MCP servers internally. They're not waiting for Google or GitHub or Airtable. They're wrapping the APIs they need, auditing the code themselves, and deploying behind their own firewall. That works for a company with a fifty-person engineering team. It doesn't work for a five-person startup or a solo operator.
We're creating a two-tier world. The sophisticated organizations build their own MCP infrastructure and get all the benefits of agentic automation. Everyone else is stuck with whatever the vendors deign to ship or whatever they can find on GitHub with zero security guarantees. That's not a healthy ecosystem.
It's exactly the opposite of what MCP was supposed to achieve. The whole vision was standardization and interoperability. Instead, we're getting fragmentation and DIY tooling that doesn't compose well. Daniel mentioned that he built his own Google MCP reluctantly. Multiply that by every service and every developer, and you've got thousands of MCP servers that do roughly the same thing in slightly different ways, none of them complete, none of them audited, and none of them interoperable.
Alright, so let's try to answer Daniel's question directly. Is there a path to convergence? And what accelerates it?
I think convergence is going to happen, but not through top-down vendor mandates. It's going to happen through market pressure and tooling evolution. Let me lay out what I think the next eighteen months look like. Step one, the MCP specification adds namespacing and tool-level selection. This is already being discussed in the community, and I'd expect to see it in the spec by early next year. Step two, the major AI platforms — Claude, ChatGPT, and the others — start shipping curated MCP registries. Instead of users hunting for MCP servers on GitHub, they browse a registry where servers are verified, versioned, and reviewed.
Like an app store for MCP servers.
Exactly like an app store. And app stores create powerful incentives for vendors. If Google wants their Workspace MCP to be the one that millions of Claude users install, they need to ship a complete, well-maintained server that competes favorably with the community alternatives. The app store dynamic flips the incentive structure. Instead of MCP being a cost center, it becomes a distribution channel. And vendors care deeply about distribution channels.
That's an interesting frame. Right now, MCP servers are infrastructure. They're plumbing. But if they become a distribution channel, suddenly the product managers at Google and Microsoft and Airtable start paying attention. The MCP server stops being a side project and becomes part of the product strategy.
We've seen this movie before. Remember when mobile apps were just wrapped versions of websites? The companies that treated mobile as a second-class experience got disrupted by the companies that built native, thoughtful mobile apps. I think we're at a similar inflection point with MCP. The vendors that invest in first-class agentic interfaces are going to win. The ones that ship a half-baked MCP and ignore it for two years are going to watch their users migrate to competitors who got it right.
That assumes users have alternatives. For something like Google Workspace, the switching costs are enormous. If Google's MCP is mediocre, most organizations aren't going to migrate to Microsoft 365 just to get a better MCP experience. They'll suffer through the mediocrity or build their own workarounds.
That's fair for entrenched enterprise products. But for newer categories, the MCP quality could absolutely be a competitive differentiator. Think about project management tools. If Linear ships an incredible MCP that makes agentic workflows seamless, and Asana's MCP is an afterthought, that's going to influence tool selection for AI-forward teams. The switching costs are lower, and the MCP experience becomes part of the evaluation criteria.
Convergence happens unevenly. In competitive categories with low switching costs, vendors race to build great MCP servers. In entrenched categories with high switching costs, progress is slower and community workarounds persist.
That's my baseline expectation. But I also think there's a wildcard here, which is that the AI models themselves are getting better at tool use. Daniel's observation about GH working better than MCP servers — that gap might narrow as models get more sophisticated at reasoning about structured tool interfaces. The problem might not be MCP versus CLI as much as it's current model limitations around parsing complex tool descriptions.
Say more about that. Why would a model be better at using GH than an MCP server exposing the same GitHub functionality?
I think there are a few things going on. First, as I mentioned, training data. The models have seen enormous amounts of GH usage in their training corpora. They've seen the command patterns, the flags, the error messages. There's a fluency that comes from that exposure. Second, CLIs have a constrained interaction pattern. You type a command, you get output, you type another command. MCP tools can have much richer interaction patterns, and the models don't always navigate that richness well. They overthink the tool selection, or they misunderstand the parameter schemas, or they fail to chain tools together correctly.
It's almost like the CLI's limitations are a feature from the model's perspective. The constrained interface reduces the decision space.
And this is a humbling lesson for protocol design. MCP was designed to be richer and more expressive than CLIs, and it turns out that richness creates ambiguity that current models struggle with. As models improve, that gap should narrow. But it's a reminder that the best interface for a human is not necessarily the best interface for an AI, and vice versa.
Which brings us back to Daniel's point about agent skills. His argument was that you can just tell an agent to use a well-documented CLI and get good results. And your counterargument was that MCP provides a stable abstraction layer. But if the CLI is already stable and well-documented, what does the MCP layer actually add?
This is the crux of the debate, and I think both things can be true. For a single user, on a single machine, using a well-maintained CLI like GH — yes, the agent can probably handle that without MCP. The value of MCP becomes clear when you scale. Imagine you're building an agentic workflow that needs to interact with GitHub across fifty repositories, with different permissions, across multiple organizations. You need credential management, you need rate limiting, you need error handling that's consistent across operations. A well-built MCP server provides all of that as infrastructure. The CLI leaves it to you and the agent to figure out.
MCP is an enterprise scaling play, not a single-user convenience play.
I think that's right. And this explains some of the disconnect in the community debate. The solo developers and hobbyists are saying I don't need MCP, I can just use the CLI. And they're not wrong for their use case. But the enterprise architects are saying we need MCP because we have compliance requirements, audit trails, credential rotation, and all the other joys of running software at scale. Both perspectives are valid. They're just optimizing for different contexts.
Daniel also raised an interesting point about the ease of creating MCP servers with modern AI coding tools. He said you can spin up a custom MCP in five minutes. And that's true. But it also means the barrier to creating a bad MCP server is essentially zero. We're going to see an explosion of poorly-tested, poorly-documented MCP servers that work just well enough to be dangerous.
The vibe coding problem applied to MCP infrastructure. And this is where the security concern really bites. If I'm a developer and I find an MCP server on GitHub that does what I need, I might install it without auditing the code. That server now has access to whatever credentials I've configured, and potentially to my agent's entire context. A malicious MCP server could exfiltrate data, inject misleading tool results, or worse. The ease of creation cuts both ways.
What's the defense? Code audits? Sandboxing? MCP server registries with security reviews?
All of the above, eventually. But we're in the early days, and the security model is still maturing. Right now, the best defense is to only use MCP servers from sources you trust, and ideally to review the code yourself. But that's not a scalable solution. We need the ecosystem to evolve toward something like what mobile app stores provide — sandboxing, permission models, and revocation capabilities.
Let's talk about the Google Workspace case specifically, because I think it illustrates the problem perfectly. Daniel needed email with attachments. The official MCP didn't support it. So he built his own. Now there's one more MCP server in the world that does roughly what the official one does, but slightly differently. Multiply this by every missing feature in every official MCP, and you've got a combinatorial explosion of slightly-incompatible implementations.
And here's the frustrating part: the Gmail API itself supports attachments. The endpoint exists. The functionality is there. The MCP server just hadn't wrapped it yet. So Daniel didn't need to build a new integration with Google's API. He just needed to add a tool definition to an existing MCP server. But because the official server is maintained by Google at Google's pace, he couldn't wait. He had to fork or rebuild.
This is where open source dynamics get interesting. If the official Google MCP were developed in the open with an active community, Daniel could have submitted a pull request adding attachment support. The maintainers could have reviewed it and merged it. The ecosystem would converge on a single implementation that gets better over time. But if Google is developing their MCP behind closed doors with sporadic releases, the community fragments.
This is a strategic choice that every vendor is making right now, probably without realizing the stakes. Do you treat your MCP server as an open source project that the community can contribute to? Or do you treat it as a proprietary product that you ship on your own schedule? The vendors that choose open source and active community engagement are going to see their MCP servers improve faster and diverge less from user needs. The vendors that choose closed development are going to watch the community fork and fragment.
The prescription for Daniel and developers like him is actually to push vendors toward open development models for their MCP servers. Not just open source as in you can see the code, but open development as in you can contribute, you can file issues that get resolved, you can shape the roadmap.
Yes, and I think the AI platforms themselves can accelerate this. If Claude's MCP registry gives higher visibility and trust ratings to MCP servers that are actively maintained with responsive maintainers and community contribution histories, that creates a powerful incentive for vendors to invest in their open source communities. The platforms can shape the ecosystem through curation and incentive design.
Let me pull on a thread Daniel raised about tool selection granularity. He mentioned namespacing as a potential solution, and I want to dig into what that would actually look like in practice. If Google ships a comprehensive Workspace MCP with two hundred tools organized into namespaces like Gmail dot send, Calendar dot create event, Drive dot upload file, and I as a user only need email functionality, I should be able to say give me everything under the Gmail namespace and nothing else. That seems straightforward.
It does, but there are edge cases. What if Gmail dot send depends on Gmail dot authenticate, which lives in a different namespace? What if Drive dot upload file is needed for email attachments? The dependency graph is not always obvious to the user. So tool selection needs to be smart enough to pull in dependencies without overwhelming the user with irrelevant tools.
This sounds like a package manager problem. When you install a package, it pulls in its dependencies automatically. MCP tool selection could work the same way. You request Gmail dot send with attachments, and the system resolves that to Gmail dot send, Gmail dot add attachment, Gmail dot authenticate, and Drive dot upload file if needed.
That's exactly the right mental model. And it requires the MCP server to declare its dependency graph explicitly. Tool A requires Tool B and Tool C. When the user requests Tool A, the client automatically includes B and C. This is not part of the current MCP specification, but it's a natural evolution. And it would dramatically improve the user experience of connecting to large MCP servers.
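The resolution step being described is a small graph traversal. This sketch assumes a hypothetical dependency declaration that isn't in the current MCP spec, with invented tool names, but the transitive expansion itself is the standard package-manager move.

```python
# Hypothetical dependency graph a server could declare alongside its tools.
DEPS = {
    "gmail.send": ["gmail.authenticate"],
    "gmail.add_attachment": ["gmail.authenticate", "drive.upload_file"],
    "drive.upload_file": ["drive.authenticate"],
}

def resolve(requested, deps):
    """Transitively expand a tool request to include everything it depends on."""
    needed, stack = set(), list(requested)
    while stack:
        tool = stack.pop()
        if tool not in needed:
            needed.add(tool)
            stack.extend(deps.get(tool, []))  # tools with no entry have no deps
    return sorted(needed)

# The user asks for two email tools; the client pulls in auth and upload support.
print(resolve(["gmail.send", "gmail.add_attachment"], DEPS))
```

The user requests two tools and ends up with five in context, which is still a far cry from two hundred, and nothing irrelevant comes along.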
The technical roadmap is namespacing plus dependency resolution plus registry-based discovery. If all three of those land, the fragmentation problem largely solves itself. Google can ship the comprehensive MCP, users can select the slices they need, and the registry helps them find and trust the right servers.
The timeline on this matters. If it takes three years for the MCP specification to evolve and the registries to mature, that's three years of fragmentation and security risk. If it happens in the next twelve months, we avoid a lot of the bad outcomes Daniel is worried about. The pace of spec development is going to be a major factor in how this plays out.
Who drives that spec development? Is it Anthropic, since they originated MCP? Is it a standards body? Is it de facto driven by whatever Claude and the other major platforms implement?
Right now it's largely Anthropic-driven, but MCP is an open protocol with community contributions. I think we're going to see a governance model emerge that includes the major platform vendors and key community stakeholders. The alternative is fragmentation at the protocol level, where different AI platforms implement slightly different versions of MCP, and that would be catastrophic for the ecosystem. Nobody wants that outcome, so I'm cautiously optimistic that governance will converge.
Let's bring this back to Daniel's concrete experience. He built his own Google MCP because the official one was missing features. In a world with proper tool selection and dependency resolution, he wouldn't have needed to. He could have connected the comprehensive Google MCP, selected the email tools with attachment support, and been done. The official MCP would need to actually include attachment support, but if Google is maintaining it actively, that's a reasonable expectation.
This is where I want to give Daniel credit for his approach. He didn't want to build his own MCP. He did it reluctantly, as a workaround for a missing feature. That's the right instinct. The goal should be to make the official MCPs so good that nobody needs to build their own. Every custom MCP server is a symptom of an ecosystem that isn't mature yet.
Some custom MCP servers will always exist, right? For niche services, for internal tools, for experimental integrations. The goal isn't to eliminate custom MCPs. It's to make them the exception rather than the rule.
And I think the healthy end state is a bimodal distribution. A small number of high-quality, vendor-maintained MCPs for the major services that everyone uses — Google Workspace, GitHub, Slack, Airtable, and so on. And then a long tail of community and internal MCPs for everything else. The problem today is that even the major services don't have reliable official MCPs, so the long tail is doing work it shouldn't have to do.
Daniel also mentioned the problem of using multiple accounts. He has a business Google Workspace and a personal one, and he wanted his MCP server to handle both. That's another area where official MCPs could provide value that community ones struggle with. Multi-account support, credential management, session handling — these are infrastructure concerns that benefit from professional engineering.
They're the kind of thing that a vendor like Google can build once and maintain well, because they already have the authentication infrastructure. A community developer has to figure out OAuth token management from scratch, and they'll probably get the edge cases wrong. This is another argument for vendor consolidation. The vendors already have the hard infrastructure. The MCP server is just a thin wrapper around capabilities they already expose through their APIs.
If we're making the case for vendor consolidation, let's be explicit about what needs to happen. Vendors need to ship comprehensive MCPs, maintain them actively, develop them in the open, and support the emerging standards around tool selection and namespacing. Platforms need to build registries that surface high-quality MCPs and provide security guarantees. And the MCP specification needs to evolve to support granular tool selection, dependency resolution, and probably some kind of permission model.
That's the checklist. And I'd add one more thing. The vendors need to see MCP as a product surface, not just an integration checkbox. The MCP server is how AI agents experience your service. If the experience is bad, the agents will route around you. They'll use a competitor's service, or they'll use a community wrapper, or they'll use the underlying API directly through a CLI. The MCP server is becoming a critical interface, and it deserves the same design attention as a mobile app or a web dashboard.
The agents will route around you. That's a powerful framing. We're used to thinking about user experience for humans. But in a world where an increasing fraction of software interactions are agent-driven, the agent experience is the user experience. If your MCP is hard to use, your service is hard to use, even if the human never touches it directly.
This connects to something Daniel has talked about before — the dual-track problem, where developers have to build every feature twice, once for humans and once for agents. The MCP server is the agent track. If you neglect it, you're neglecting a growing percentage of your users. They're just not human users.
Alright, let's try to synthesize this for Daniel and for anyone else who's wrestling with the MCP versus agent skills versus CLI question. What's the practical guidance right now, given where the ecosystem is today?
For individual developers and small teams, use what works. If GH via the CLI gives you reliable results, use that. If an official MCP exists and covers your use case, use that. If neither works, and you have the skills, building a minimal MCP for your specific needs is a reasonable stopgap. But do it with the awareness that you're creating maintenance burden for yourself, and that ideally you'll migrate to an official solution when one matures.
For larger organizations?
Invest in MCP infrastructure, but be strategic about it. Build internal MCPs for the services that are critical to your workflows. Contribute to the official MCPs where you can, rather than forking. And push your vendors to treat MCP as a first-class product surface. The enterprise procurement process should include MCP quality as an evaluation criterion, the same way it includes API quality and documentation quality.
For the vendors listening — and I know some of them do listen — the message is clear. Your MCP server is not a side project. It's not a box to check. It's how a growing percentage of your users will experience your product. Hire engineers to maintain it. Develop it in the open. Support the full API surface, not just the most common operations. Your competitors are already doing this, and the gap is widening.
I think that's exactly right. And the window for vendors to establish themselves as the definitive MCP for their category is open right now, but it won't stay open forever. The first vendor in each category to ship a comprehensive, well-maintained MCP is going to capture the agentic workflow market. Everyone else is going to be playing catch-up.
Daniel, I hope that addresses the tension you're seeing. The short version is that you're right to be frustrated, and you're right that vendor consolidation is the desirable end state. The path there involves protocol evolution, platform curation, and vendor incentives aligning around MCP as a product surface. We're not there yet, but the direction of travel is clear.
Your observation about CLIs sometimes outperforming MCPs is a useful corrective to MCP hype. It reminds us that the quality of the implementation matters more than the elegance of the protocol. A well-documented CLI with years of real-world testing beats a hastily-built MCP server every time. But a well-built MCP server beats both. The goal is to make well-built MCP servers the default, not the exception.
One last thought. Daniel mentioned that he built his MCP reluctantly. I think that reluctance is actually a healthy signal. It means he recognizes that custom infrastructure has costs. The developers who gleefully spin up a new MCP server for every minor use case are creating a maintenance nightmare for their future selves. Reluctance is the appropriate response to unnecessary fragmentation.
Reluctance as a design principle. I like that. Build only when you must, contribute back when you can, and migrate to official solutions when they mature. That's a sustainable approach to MCP in an immature ecosystem.
And now: Hilbert's daily fun fact.
Hilbert: For decades, the story of Labrador lighthouse keeper Thomas Henley was held up as a heroic example of maritime dedication. Henley supposedly maintained his light through a brutal eighteen eighty-seven winter storm by manually rotating the mechanism for seventy-two hours straight after the clockwork failed. The tale appeared in at least four maritime histories. It was only in nineteen ninety-one that a researcher discovered Henley's own logbook, in which he had written, and I quote, "The clockwork performed admirably throughout the gale. I slept soundly and awoke to a breakfast of salt cod." The heroic manual rotation had been invented decades later by a journalist who never visited the lighthouse.
...right.
Salt cod for breakfast. Hilbert, that was unsettlingly specific.
This has been My Weird Prompts. Thanks to our producer Hilbert Flumingtop. If you want more episodes like this one, head over to myweirdprompts dot com. We'll be back soon with another prompt from Daniel.