So I want to open today with a confession. I have been using Claude Code as a system administrator. Not as a coding assistant. As a system administrator. Across my desktop, my home server, a VPS, a couple of Raspberry Pis scattered around the house. And it has been, honestly, the best tool I have ever used for that kind of work.
Which is already a slightly awkward sentence, because Claude Code is not marketed as a sysadmin tool. It is marketed as a coding assistant for developers working inside a repository.
Exactly the tension I want to get into. Daniel sent over a problem statement this week that tries to name, precisely, why this use case is so good and yet so fragile at the same time. And the thesis is sharp. It is worth reading aloud more or less verbatim. Claude Code is one of the most effective tools currently available for a single authorised operator doing operating system and infrastructure work across their own machines. But the safety model assumes a developer inside a project repo. And that assumption is so deeply encoded that everything else is a second class citizen of the harness.
And I want to flag something up front, Corn, because I do not want us to drift into cheerleading. There is a version of this conversation that is just, quote, Anthropic please weaken the sandbox. And I am going to push back on that whenever it gets close. Sysadmins running unsandboxed agents is how things go wrong. That is not a hypothetical. That is how things go wrong.
Fair. And I want to meet that head on. Because the argument I think Daniel is actually making is not, weaken the sandbox. It is, the sandbox is shaped around one specific mental model, and when you press on that shape, you can see the bruises.
What is the shape?
Current working directory. That is the whole thing. The entire permission model of Claude Code is built around one load bearing assumption. The current working directory is the universe of normal operation. Reads and writes inside the cwd are routine. Everything outside the cwd, system paths, other users home directories, remote hosts, that is the high scrutiny zone. That is elevated. That gets gated.
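To make that premise concrete, here is a minimal Python sketch of a cwd shaped rule. It is an illustration only, not Claude Code's actual implementation; the function name is_routine and the example paths are assumptions made up for this sketch.

```python
from pathlib import Path

def is_routine(target: str, cwd: str) -> bool:
    """Hypothetical rule: a path is routine only if it resolves inside cwd."""
    root = Path(cwd).resolve()
    # Joining an absolute target onto root simply yields the absolute target.
    path = (root / target).resolve()
    return path == root or root in path.parents

cwd = "/home/operator/project"  # assumed launch directory
for target in ["src/main.py", "../other/notes.txt", "/etc/pipewire/pipewire.conf"]:
    zone = "routine" if is_routine(target, cwd) else "high scrutiny"
    print(f"{target}: {zone}")
```

Under a rule like this, only the first path counts as routine. Anything a sysadmin session touches under /etc lands in the high scrutiny zone no matter where the session was launched.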
Which, to be clear, is a reasonable default. For the developer in a repo case it is almost perfect. My code lives in one tree. The tests live in that tree. My git remote lives in that tree, conceptually. If the agent wants to touch /etc or someone else's directory under /home, that is probably a mistake, and probably worth a prompt.
Right. And Anthropic has been explicit about this. There is an engineering blog post from March 2026, introducing auto mode as the safer replacement for the old skip permissions flag. And the post spells it out. Actions outside the working directory are the high scrutiny category by design. Production deploys. Mass deletes. IAM changes. Unrecognised infrastructure. All of those are blocked by default and you have to earn trust for them through what they call trusted infrastructure config.
Which again, for a developer in a repo, is the correct posture.
It is. But now think about what sysadmin work actually looks like. I am debugging a broken audio stack. My workspace is /etc, ~/.config, /usr/lib, journalctl, a handful of systemd unit files. None of that sits under a repo. None of that forms a natural current working directory. There is no root. The workspace is the machine.
Hmm. So when you launch Claude Code, what do you launch it in?
That is the thing. I launch it in my home directory, or I launch it in some scratch folder, and then every legitimate operation is flagged as out of scope, because my legitimate operations span the whole machine.
Okay, that is the local sysadmin case. But you said earlier that non code workspaces have a similar problem.
Yes. Planning workspaces, research workspaces, diaries, media pipelines. Directories of state, not codebases. The harness still treats the directory you launched from as the perimeter. And the second you do something that legitimately spans your home directory, you are in the elevated zone. Then the third one, and this is the one that really dissolves the premise: remote admin. Over SSH, or via an MCP server that runs commands on a remote host, the cwd on your controlling machine is not a meaningful boundary for what the session is actually doing on the other end. The harness sandbox can be strict locally while the session, in practice, is completely unsandboxed on the remote end.
Which is kind of the worst of both worlds. You pay the cost of the sandbox locally, and you do not get any of the benefit remotely.
And the fourth one. Every new workspace starts from zero. Because scope is cwd relative, trust does not accumulate across the set of machines and directories I legitimately control. Allowlists are per workspace, or at best per user. But the mental model is still, this project.
Alright. I will grant you the diagnosis. The trust unit is the cwd, and that is a mismatch for sysadmin work. But let me push back on the prescription. You and Daniel seem to want something like, quote, authorised operator mode. Trust declared per operator, across machines, persistent. And my worry is that is a very fancy way of saying, please build me a sledgehammer that looks respectable.
Go on.
Claude Code already has a sledgehammer. The --dangerously-skip-permissions flag. Exactly the kind of thing a sysadmin would reach for. And what Daniel documented in the research, and I find this genuinely interesting, is that the sledgehammer is being narrowed, not supported. There are, what, four issues worth citing here.
Four that I think together tell a story. Issue #9184. The skip permissions flag was made unusable under root or sudo contexts. You cannot use it as root anymore. Issue #3490. Someone filed a direct feature request for a supported equivalent for privileged, rootful use, an explicit root flag. Closed, not planned. Issue #36168. After version 2.1.77, the bypass silently changed behaviour. Configurations that had depended on it broke without warning. And issue #1135, which is the older sudo askpass thread, which shows, going further back, that privileged operations were not a design target in the first place.
And my read of that is, Anthropic knows what they are doing. They are looking at the skip permissions flag as a blast radius problem, and they are tightening it. That is the correct direction. If sysadmins are using an escape hatch that was never meant to be load bearing, the answer is not to make the escape hatch load bearing. The answer is, do not use it as load bearing infrastructure.
I want to actually concede part of that. You are right that skip permissions should not be the foundation. The trajectory is the tell. It is being tightened quietly, not stabilised. Treating it as the long term answer is building on sand. That is in the problem statement and I think it is correct.
Okay. Good.
But, here is where I think the argument bends. The fact that the sledgehammer is being tightened is not evidence that the use case does not exist. It is evidence that the use case is not being served by the right mechanism. And the question then becomes, what is the right mechanism? Because there are other layers in Claude Code that are supposed to help here. Settings file allowlists. Additional directories. Sandbox filesystem allow write. These are more durable than the skip flag. They are declarative. They are per workspace and user level. You can pre approve the sysadmin shaped operations you actually do, and they will stop prompting you.
And those do work, right? I have used those. I set an allow list for a bunch of read only journalctl and systemctl status calls, and it stopped nagging me.
They work up to a ceiling. And this is the structural gap I want to name. Additional directories and sandbox filesystem allow write extend declarative filesystem scope. They widen what paths the agent considers routine. But arbitrary Bash retains a completely separate permission surface. Declarative filesystem scope and Bash permission scope are governed by different mechanisms. They do not track in lockstep.
Meaning?
Meaning if I add /etc to my additional directories, the agent does not automatically stop asking before running, say, systemctl restart on something in /etc. The Bash permission layer has its own rules about what commands are pre approved, and those rules do not derive from what filesystem scope I declared. So a sysadmin allowlist has to be built in two places and kept in sync. And it will drift, because they are not linked.
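A rough Python sketch of that split, under loud assumptions: extra_dirs and bash_allow are hypothetical stand-ins for the two surfaces, and neither the names nor the matching logic is taken from Claude Code itself. The only point is that the two checks consult different lists.

```python
from pathlib import PurePosixPath

# Hypothetical stand-ins for the two separately maintained surfaces.
extra_dirs = ["/etc"]               # declarative filesystem scope
bash_allow = ["systemctl status"]   # pre-approved command prefixes

def path_in_scope(path: str) -> bool:
    """True when path sits inside one of the declared directories."""
    target = PurePosixPath(path)
    roots = [PurePosixPath(d) for d in extra_dirs]
    return any(target == r or r in target.parents for r in roots)

def bash_preapproved(command: str) -> bool:
    # Deliberately naive prefix match; a real gate would also have to
    # handle pipes, redirects, and command chaining.
    return any(command == rule or command.startswith(rule + " ")
               for rule in bash_allow)

unit = "/etc/systemd/system/audio.service"
print(path_in_scope(unit))                                  # True
print(bash_preapproved("systemctl restart audio.service"))  # False
```

Widening extra_dirs never changes what bash_preapproved returns. That is the drift: the restart still prompts until someone remembers to edit the second list as well.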
Okay. That is a real problem. That is not just, please let me bypass things. That is, the primitives you gave me do not compose cleanly.
Right. And if you keep pulling on that thread, you get to the thing I actually want to land in this conversation. The aha moment of the whole research, honestly. There is a mechanism in Claude Code that does not have this problem. And that mechanism is MCP.
Mmm.
MCP tool calls are classified differently from shell calls. An MCP server gets its tool scopes declared at install time. They are granted by the operator. They are persistent across sessions. They are independent of current working directory. And once you have granted them, the server can execute arbitrary operations on the server side, and the client side sandbox does not re litigate them on every call.
Which sounds, I have to say, uncomfortably close to the bypass you just said we should not build on.
That is the objection I want to take seriously. So let me draw the distinction. Skip permissions is, disable the gate. MCP is, move the gate. The gate is still there. It is just not at the cwd boundary anymore. It is at the tool install boundary. And that is a gate that composes with operator intent, because when you install an MCP server, you are making a considered, persistent decision about what capability you are granting.
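One way to picture "move the gate" is the toy model below. It is a sketch under stated assumptions, not how any real client stores grants: GrantStore, the server name homelab, and the tool names are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class GrantStore:
    """Hypothetical record of tool grants made when a server is installed."""
    granted: set = field(default_factory=set)

    def install(self, server: str, tools: list[str]) -> None:
        # The operator's considered decision happens here, once.
        self.granted.update((server, tool) for tool in tools)

    def allows(self, server: str, tool: str, cwd: str) -> bool:
        # cwd is accepted but deliberately ignored: the gate sits at
        # install time, not at the working directory boundary.
        return (server, tool) in self.granted

store = GrantStore()
store.install("homelab", ["list_containers", "restart_container"])
print(store.allows("homelab", "restart_container", cwd="/tmp/scratch"))
print(store.allows("homelab", "drop_database", cwd="/tmp/scratch"))
```

The gate has not gone away in this model. A tool that was never granted is still refused, and the launch directory plays no part in the decision either way.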
Go on. Where does this actually show up in practice?
Community writeups. XDA Developers has a piece about wiring Claude Code to a homelab through Docker MCP. The whole admin flow, list containers, restart containers, query Postgres, run diagnostics, goes through the MCP server, not raw shell. And the author is very frank that this sidesteps the Bash permission layer. There is a GitHub project called jmagar/claude-homelab that is basically a starter kit for this pattern. Give Claude Code an MCP server that is wired into your homelab, and suddenly it can do the privileged work that the shell sandbox was gating.
And this is not a published design intention.
No. That is the important part. This is not Anthropic saying, MCP is our sysadmin story. This is a side effect of architecture. The MCP tool scope system happens not to be governed by the cwd centric model. Operators install servers. Servers declare tools. Tools get granted. Done.
Okay. I want to actually concede the reframe here, because I think you are right that it changes the question. If I am looking at this and saying, should Anthropic weaken the sandbox, the answer is clearly no, and I will die on that hill. But if I am looking at this and saying, there is already a property inside Claude Code that makes privileged out of cwd operations work well, operator granted, persistent, independent of cwd, what would it look like to generalise that property, that is a different question. That is not a weakening question. That is a design question.
Exactly the reframe. And it is a better question, because it is specific.
So what would generalising that property look like?
You could imagine an authorised operator mode where the unit of trust is not the directory you launched from, but a named operator identity plus a named set of machines and paths. Declared once, persistent across sessions, stable across releases. Declarative filesystem scope and Bash permission scope unified, or at least coherently linked, so you do not have to configure the same thing in two places. And acknowledged upstream, so that every time Anthropic tightens the default sandbox for the developer case, they also evaluate what that tightening does to this case.
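As a thought experiment only, the trust unit described there might look something like the sketch below. Nothing like this exists in Claude Code today as far as this conversation establishes; OperatorScope, its fields, and the host names are invented for illustration.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

@dataclass(frozen=True)
class OperatorScope:
    """Hypothetical trust unit: a named operator plus named machines and paths."""
    operator: str
    machines: dict  # host name -> list of path roots the operator controls

    def covers(self, host: str, path: str) -> bool:
        # One lookup keyed on machine and path. The launch directory never
        # enters into it, so file access and shell commands could both
        # consult this same declaration.
        target = PurePosixPath(path)
        for root in self.machines.get(host, []):
            r = PurePosixPath(root)
            if target == r or r in target.parents:
                return True
        return False

scope = OperatorScope(
    operator="home-admin",
    machines={"desktop": ["/etc", "/usr/lib"], "homeserver": ["/etc", "/srv"]},
)
print(scope.covers("homeserver", "/srv/media/pipeline.conf"))
print(scope.covers("vps", "/etc/ssh/sshd_config"))
```

The design choice worth noticing is that the declaration is made once per operator and carries across machines, which is the property the per workspace allowlists lack.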
And right now that evaluation is not happening.
Which brings me to the last thing I want to get to, because I think it is the quietest but maybe the most telling finding in the whole research. The absence as a finding bit.
Right, I saw that in the problem statement and I was not sure what to do with it.
It is this. Daniel did a sourcing pass looking for first party Anthropic material that frames Claude Code as a sysadmin tool. Any material at all. Documentation. Engineering posts. Changelog entries. And there is none. The framing is uniform. Developer tool. Developer tool. Developer tool. Now, separately, the community writeups that do frame it as a sysadmin tool, the Alderson post, the Larcombe post, the Ricardo Decal sudo guide, they are unofficial, fragmented across personal blogs, and almost all of them include a section on how to disable or broaden the safety features to make the tool work.
Which is a signal.
The signal. Because as long as the use case is not acknowledged as first class, every release cycle will optimise the harness for the developer in a repo premise. Every tightening of the default sandbox will produce a parallel regression for this pattern. Which is exactly the lived experience. Issue #36168 is a perfect example. Version 2.1.77 ships, the bypass behaviour quietly changes, and people relying on it for sysadmin use wake up broken. Nobody did that on purpose. It is just that nobody in the release evaluation loop was asking, what does this do to the sysadmin case, because the sysadmin case is not in the frame.
And you are saying the shape of the gap is itself the diagnostic.
Yes. That is the phrase I want to land on. The shape of the gap is the diagnostic. When a use case is first class, the documentation talks about it. The changelog flags changes that affect it. The engineering blog posts acknowledge the tradeoff. When a use case is not first class, it exists in community blogs. It exists in GitHub issues that get closed not planned. It exists in a growing collection of workarounds that each do a part of the job. You can tell a use case is orphaned by the shape of the writing around it, not by whether the tool technically works.
And Claude Code for sysadmin work, right now, is a lot of community blogs and closed GitHub issues.
Which, and I want to be clear about this, is not a complaint. It is a description. The tool is excellent. The operators using it are getting real value. The point is that the value is riding on seams, and the seams are not being actively maintained for this use case. MCP is the accidental escape hatch. Settings allowlists are a useful but incomplete layer. The skip permissions flag is a bridge that is being dismantled. And underneath it all, the cwd assumption is quietly shaping every release decision.
If I try to summarise what we actually agreed on here. The sandbox is correct for the developer case. The skip permissions flag should not be the foundation, and its tightening is fine. But there is a real second archetype, the authorised single operator working across their own machines, and that archetype is currently served by a set of workarounds that are each individually fragile. MCP is the one that works best, and it works because it has the property Claude Code would need to generalise. Operator granted, persistent, independent of cwd. And the absence of first party framing is itself a signal that the use case is not in the design frame. Did I get that right?
That is it. Almost word for word. The reframe is not, please weaken the sandbox. The reframe is, there is already a mechanism inside the product that solves the out of cwd problem well, and it solves it because it sidesteps the cwd premise entirely. The durable answer is probably to generalise that property to local shell operations, not to keep widening cwd.
I will concede the reframe. I still think the skip permissions tightening is correct. I do not think it is correct in isolation. It is correct as long as something else is being built alongside it. And if nothing is being built alongside it, then you are just closing the door on a use case that was already borderline supported.
And that is, I think, the actual ask. Not, leave the door open. Build the room.
Alright. I think that is the shape of it.
Thanks as always to our producer Hilbert Flumingtop. And big thanks to Modal for providing the compute.
This has been My Weird Prompts. Find us at myweirdprompts.com for RSS and all your podcast apps.
Take care.
See you next time.