sandvault · Mar 31, 2026 → May 3, 2026 · 5 weeks

Good Fences make Good Agents: sandvault + /sv skill (part 1 of n)

A simple solution to working with agents that cannot be trusted to run 'ls'.

commits: 7
lines: +2.3k −21
source: webcoyote/sandvault/commits
with thanks to: Patrick Wyatt, Mike McQuaid, Eran Sandler, Jesse Vincent, and Nat Torkington for code and editorial review.

rm -rf /

I watched Claude test permissions on my machine by running rm -rf / and report back cheerily that the command had failed. I killed off everything and went quickly from WTF to absolutely-fucking-not.

Good fences make good Agents, or so the saying goes.

The first time you see an agent attempt something like this, or you realize it has just broken containment, it activates something deep, and dark, inside. The first instinct is of course the correct one: nuke it from orbit, the only way to be sure. The second is the realization that there is no going back to the way things were before, and no going forward by being a meat-based approval presser. Which leaves building some kind of containment while running --dangerously-skip-permissions, and then obviously placing a demolition charge on the machine and maybe an axe near the network cable, just in case. "Yes, I'm sure."

The fence-building industrial complex

I count twelve of these from people I know, and that is an undercount: fences that turn into gated communities for agents we cannot trust to run ls.

I found sandvault through a post by Mike McQuaid. Other fences below, from people I trust.

TLDR: Enter Sandvault

Sandvault is from Patrick Wyatt, whose claims to fame include games where you have to build a lot of containment. Each agent runs as a dedicated macOS user behind a sandbox-exec profile. The user can write to its own home and a shared workspace under /Users/Shared/sv-$USER/. It cannot read your home, your keychain, your SSH agent, or any other user's files.

install bash

webcoyote/sandvault

    brew install sandvault
    sv build
    sv claude                # also: sv codex, sv opencode, sv gemini, sv shell
    # shortcuts: sv cl, sv co, sv o, sv g, sv s
    cat PROMPT.md | sv gemini   # stdin works too

One brew install. Each agent runs as a dedicated macOS user behind sandbox-exec.

/sv is the protocol I added on top of that floor.

Introducing /sv - seamless sandbox handoff

Type /sv in Claude Code or Codex CLI. A sandboxed worker spins up on the other side of the wall with your briefing, and runs there until you're done. You stay in your conversation. /sv pull brings back whatever it did.

After talking with Jesse Vincent about claude-session-driver, his tmux skill for orchestrating subagents, I realized this flow could be collapsed to a single skill. I built /sv and sent Patrick a PR. It works like this.

The diagram below is part of another research project I'm working on. Feedback appreciated.

HOW /SV HANDS A TASK OFF

Host · Sandbox · boundary crossing

CLAIM

You stay in one conversation. A second agent, fenced into its own user account with a key to exactly one repo, does the dangerous work in parallel. The whole protocol is two artifacts: a briefing going out, a branch coming back.


01 Host

You type /sv

The host (Claude Code or Codex CLI) writes a briefing file into the shared workspace at `/Users/Shared/sv-$USER/`. That file is the only thing the sandbox will ever see from you.

launch

02 Sandbox

A sandbox spins up

`sv-clone` opens a new terminal session inside the `sandvault-$USER` macOS user account. Its own home directory. Its own keychain. Sandboxed Claude (or Codex) starts here.

03 Sandbox

A per-repo deploy key

A rogue worker can push to one repo: the one it was handed. The sandbox mints its own SSH key, registers the public half on GitHub as a per-repo deploy key, and keeps the private half.

04 Sandbox

The worker does the work

Sandvault launches the agent, the shell, or any command you choose inside the sandbox, and it runs there until you're done.

/sv pull

05 Host

/sv pull, and the branch shows up

`/sv pull` brings the branch into your local checkout and reports the SHA. The conversation you were in picks up here, knowing what happened.

/sv from a host Claude Code or Codex session text

    /sv         hand a task off to a sandboxed Claude or Codex worker
    /sv status  read the report channel from the sandboxed worker
    /sv pull    pull the result branch back through the deploy key

The per-repo deploy key is the only credential that crosses the boundary.

The worker is bounded twice. sandbox-exec keeps it out of your home directory and every other repo on disk. The per-repo deploy key keeps it out of every GitHub repo except the one you cloned. Two independent fences, one enforced by the kernel and the other by GitHub, neither relying on the worker behaving.
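As a rough sketch of the second fence, minting a per-repo deploy key by hand could look something like this. It is not sandvault's actual code: the key location, the key comment, the title, and `OWNER/REPO` are placeholders, and the `gh` command is only printed so that nothing gets registered on GitHub by accident.

```shell
#!/bin/sh
# Sketch: mint an SSH key meant to be registered on exactly ONE repo.
# KEY_DIR, the key comment, the title, and OWNER/REPO are placeholders.
set -eu

REPO="OWNER/REPO"
KEY_DIR="$(mktemp -d)"
KEY_FILE="$KEY_DIR/deploy_key"

# Generate an ed25519 keypair with no passphrase; the private half stays here.
ssh-keygen -q -t ed25519 -N "" -C "deploy key for $REPO" -f "$KEY_FILE"

# Registering the public half is one gh call. It is printed, not executed,
# so this sketch has no side effects on GitHub.
echo "gh repo deploy-key add $KEY_FILE.pub --repo $REPO --title sandvault --allow-write"
```

Because a deploy key belongs to a single repository, a leaked private half gives push access to that repository and nothing else on the account.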

How the sandbox boundary actually works

A note on sandbox-exec and everything that depends on it (which it turns out is a lot)

sandbox-exec is the macOS-native command that does the actual confinement. You hand it a profile (a TinyScheme-flavored policy file: which paths can be read, which can be written, which syscalls are allowed, which network operations are permitted) and a command, and the kernel enforces the policy on that process and its children.
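To make the shape of a profile concrete, here is a minimal sketch. It is not sandvault's profile, just a small policy that allows everything and then denies writes under /Users and all network operations. The demo file path is an arbitrary choice, and the script only calls sandbox-exec when the command exists, so on other systems it simply writes the profile out.

```shell
#!/bin/sh
# Sketch: write a tiny sandbox profile and, on macOS, run a command under it.
# This is NOT sandvault's profile; it only illustrates the policy-file shape.
PROFILE="$(mktemp)"

cat > "$PROFILE" <<'EOF'
(version 1)
(allow default)
(deny file-write* (subpath "/Users"))
(deny network*)
EOF

if command -v sandbox-exec >/dev/null 2>&1; then
  # The policy applies to touch and to anything it would spawn.
  sandbox-exec -f "$PROFILE" /usr/bin/touch /Users/Shared/sv-profile-demo \
    || echo "write under /Users was refused"
else
  echo "sandbox-exec not found; profile written to $PROFILE"
fi
```

A stricter profile would start from `(deny default)` and allow paths one by one, at the cost of a much longer policy file.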

Kernel-enforced process confinement has been around for a long time. chroot(2) in 1979, FreeBSD jails in 2000, Solaris Zones in 2005 (RIP Joyent), and Linux namespaces and cgroups that became the foundation for every container runtime. Apple has some of this through sandbox-exec and the underlying Sandbox subsystem. I'd love to see them lean into it.

The catch is that Apple has marked sandbox-exec(1) as deprecated since macOS 10.11. The man page says so, and points developers at App Sandbox instead. App Sandbox is fine for App Store apps; it is useless for confining a CLI tool you just installed. So sandbox-exec is what most of the current agent-sandboxing world is actually built on: sandvault, agent-safehouse, clodpod, hazmat, and a long tail of homemade wrappers. Apple genuinely cares about security, and half the local-agent world seems to be using sandbox-exec right now, so I hope they keep it. It feels like a great place for them to expand.

Run sv claude and the agent runs as sandvault-jesse and sees only what sandvault-jesse can see. If it does something stupid, the blast radius is bounded by one user account. The model is concrete:

what the sandbox can touch text

webcoyote/sandvault#security-model

    writable:  /Users/Shared/sv-$USER     -- shared with you and sandvault-$USER
    writable:  /Users/sandvault-$USER     -- sandvault's own home directory
    readable:  /usr, /bin, /etc, /opt     -- system directories
    no access: /Users/*                   -- every other user, including yours
    no access: /Volumes/*                 -- mounted, remote, and network drives

From the sandvault README. The boundary is enforced by macOS user permissions plus a sandbox-exec profile.
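One way to read that table is that the two specific carve-outs win over the blanket /Users/* rule. The helper below is a hypothetical illustration of that reading, not something sandvault ships: `classify_path` takes a host username and a path and prints which row applies.

```shell
#!/bin/sh
# Sketch: classify a path the way the table reads, for a given host user.
# classify_path is a hypothetical helper, not part of sandvault.
classify_path() {
  user="$1"
  path="$2"
  case "$path" in
    # The two carve-outs are matched first so they win over /Users/*.
    "/Users/Shared/sv-$user" | "/Users/Shared/sv-$user"/*) echo "writable" ;;
    "/Users/sandvault-$user" | "/Users/sandvault-$user"/*) echo "writable" ;;
    /usr | /usr/* | /bin | /bin/* | /etc | /etc/* | /opt | /opt/*) echo "readable" ;;
    /Users/* | /Volumes/*) echo "no access" ;;
    *) echo "not listed" ;;
  esac
}

classify_path alice /Users/Shared/sv-alice/repo/README.md
classify_path alice /Users/alice/.ssh/id_ed25519
classify_path alice /usr/bin/git
```

Paths the table does not mention fall through to "not listed" rather than being guessed at.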

Cross-boundary work goes through the shared workspace at /Users/Shared/sv-$USER/. When you sv-clone a repo, sandvault adds a sandvault remote on your host-side repo pointing at the in-sandbox clone, so git fetch sandvault pulls commits the sandboxed agent made.
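The remote arrangement can be tried without sandvault at all. In the sketch below, two throwaway repos in a temp directory stand in for the host checkout and the in-sandbox clone. The directory names and the `sv-work` branch are placeholders, and the remote is added by hand here, where sv-clone is described as doing it for you.

```shell
#!/bin/sh
# Sketch: a host repo, a second clone standing in for the in-sandbox clone,
# and a "sandvault" remote on the host that fetches the worker's commits.
# Directory and branch names are placeholders; everything lives in a temp dir.
set -eu

WORK="$(mktemp -d)"
HOST="$WORK/host"
CLONE="$WORK/sandbox-clone"
GIT_ID="-c user.name=demo -c user.email=demo@example.invalid"

git init -q "$HOST"
git -C "$HOST" $GIT_ID commit -q --allow-empty -m "initial commit"

# --no-hardlinks copies objects instead of hard-linking them into the clone.
git clone -q --no-hardlinks "$HOST" "$CLONE"

# The "worker" commits on its own branch inside the clone.
git -C "$CLONE" checkout -q -b sv-work
git -C "$CLONE" $GIT_ID commit -q --allow-empty -m "work done in the sandbox"

# Host side: add the remote once, then fetch whenever the worker has news.
git -C "$HOST" remote add sandvault "$CLONE"
git -C "$HOST" fetch -q sandvault
git -C "$HOST" log --oneline -1 sandvault/sv-work
```

Fetching only reads from the clone, so the host checkout's own branches are untouched until you choose to merge.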

First-run reality. The sandbox user is a fresh macOS user with no credentials, so on first run you have to log into things inside the sandbox. Run sv claude and Claude Code asks you to authenticate. Run sv codex and Codex shows you an OAuth URL. Same for gh, npm, anything else the agent will need. It is the equivalent of setting up a new laptop for the first time. After that, the sandbox user keeps its own session state, so subsequent runs come up signed in.

Running several at once. Point claude-session-driver at the sv commands. Each driven session can be an sv claude or sv codex invocation, so the sandboxing comes along for free. A Claude supervisor can dispatch to sv codex (or Codex to sv claude) to keep the workers on a different token budget.

My sandvault pro-tips

`--browser`: headless Chrome from inside the sandbox

sv --browser claude starts Chrome on the host with a dynamic CDP port and exposes the endpoint inside the sandbox as $SV_BROWSER_ENDPOINT. Playwright and Puppeteer connect with one line. The Chrome instance runs in an isolated user-data directory so your real Chrome profile is untouched. The sandboxed agent gets a full browser without ever seeing your cookies, sessions, or open tabs.

--browser bash

    sv --browser claude
    sv --lightpanda claude        # use Lightpanda instead of Chrome
    # inside the sandbox:
    node -e "import('playwright').then(({ chromium }) =>
      chromium.connectOverCDP(process.env.SV_BROWSER_ENDPOINT))"

Headless Chrome (or Lightpanda) on the host, Playwright/Puppeteer inside the sandbox.

`--ios`: drive the iOS Simulator from inside the sandbox

Same shape as --browser. Simulator.app runs on the host (it's a GUI), an HTTP bridge runs on localhost, and the bridge endpoint is exposed inside the sandbox as $SV_IOS_SIMULATOR_ENDPOINT. The bridge translates calls into xcrun simctl plus iosef. Add --ios-gui to watch the simulator window while the agent drives it.

--ios bash

    sv --ios claude              # bridge only, no visible window
    sv --ios --ios-gui claude    # bridge + watch the simulator

The sandbox sees the bridge endpoint; xcrun simctl runs on the host.
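Scripts that depend on either bridge fail in confusing ways when the session was started without the matching flag. A small guard helps. The sketch below uses a hypothetical `require_endpoint` helper; only the two variable names come from sandvault, and nothing is assumed about what the endpoint value looks like.

```shell
#!/bin/sh
# Sketch: fail fast when a script expects a sandvault bridge that is absent.
# require_endpoint is a hypothetical helper; the variable names it is called
# with (SV_BROWSER_ENDPOINT, SV_IOS_SIMULATOR_ENDPOINT) come from sandvault.
require_endpoint() {
  name="$1"
  # Indirect lookup of the variable called $name, portable to plain sh.
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    echo "error: \$$name is not set; start the session with the matching sv flag" >&2
    return 1
  fi
  echo "$value"
}

# Typical use at the top of a script run inside the sandbox:
#   endpoint="$(require_endpoint SV_BROWSER_ENDPOINT)" || exit 1
```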

Nested-sandbox carve-outs

macOS doesn't support recursive sandboxes, so swift and xcodebuild (which already sandbox themselves) break inside sandvault unless you tell them not to. Sandvault sets SV_SESSION_ID in the environment so your build scripts can detect they're already inside a sandbox and stand down their own.

detecting the sandbox in build scripts bash

    if [ -n "$SV_SESSION_ID" ]; then
      export SWIFTPM_DISABLE_SANDBOX=1
      XCODE_FLAGS="--disable-sandbox"
    fi
    xcodebuild $XCODE_FLAGS build

Tell the inner tool to skip its own sandbox when sandvault is already wrapping it.

`--native-install` (`-N`): self-contained sandboxes

By default sandvault uses your host's Homebrew to install Claude / Codex / OpenCode / Gemini. With -N, the AI tools install inside the sandbox using each tool's own installer. Useful when you want the sandbox to be self-contained, or when you don't want sandvault touching host Homebrew at all.

--native-install bash

    sv --native-install build                        # one-off
    sv -N claude                                     # short form
    export SANDVAULT_ARGS="--native-install"         # make it the default

Curl-pipe and npm-install happen inside the sandbox. Host Homebrew stays untouched.

Dotfiles sync (with a different prompt for the sandbox)

Drop your shell config into /Users/Shared/sv-$USER/user/, and sandvault copies it into the sandbox user's home on each build. I keep a different-coloured prompt in there so I always know when I'm looking at a sandbox shell. Mike McQuaid's dotfiles have a script/sync worth borrowing.

dotfiles sync bash

    mkdir -p /Users/Shared/sv-$USER/user
    cp ~/.zshrc        /Users/Shared/sv-$USER/user/.zshrc
    cp ~/.gitconfig    /Users/Shared/sv-$USER/user/.gitconfig
    cp -R ~/.config    /Users/Shared/sv-$USER/user/.config
    sv build                # picks them up on next build

One-time copy. Edit the sandbox-side copies; do not edit your host originals through them.
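For the different-coloured prompt, something along these lines in the sandbox-side .zshrc would do. It is a sketch, not my exact config: it assumes SV_SESSION_ID is visible to interactive shells the same way it is to build scripts, and the `[sv]` tag and the colour are arbitrary.

```shell
# Sketch for a .zshrc: tint the prompt when the shell runs inside sandvault.
# Detection via SV_SESSION_ID follows the nested-sandbox note earlier; the
# "[sv]" tag and the colour are arbitrary choices.
sv_prompt() {
  if [ -n "${SV_SESSION_ID:-}" ]; then
    # %F{red}...%f is zsh prompt syntax for coloured text.
    echo '%F{red}[sv]%f %~ %# '
  else
    echo '%~ %# '
  fi
}

PROMPT="$(sv_prompt)"
```

Keeping the check in a function means the same file can be shared between host and sandbox without the host prompt changing.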

A few other fences my friends with trust issues are building...

Sandvault isn't the only one I'd trust on my machine. Full writeup coming; until then, public links only:

agentsh — Eran Sandler (@erans). Execution-layer security: policy-enforced bash-compatible shell. Point your harness at agentsh instead of /bin/bash and it intercepts syscalls against a deterministic policy. Ships a local DLP proxy for secret detection. Stricter quarantine than sandvault. Eran also writes about why he built it.

navaris — Eran Sandler. A sandbox control plane for managing isolated execution environments across multiple backends (LXC / Firecracker), with copy-on-write fork support. Signature feature is that you can peek into a running sandbox without breaking it.

Stockyard — Jesse Vincent. Firecracker VM farm + ZFS. 1.9-second container start on a NUC. ZFS copy-on-write snapshots from pre-tool-use hooks for state rollback. Tailscale auth, in-browser console. Ephemeral cattle VMs.

agent-safehouse — by Eugene. Single self-contained shell script that wraps any agent invocation with sandbox-exec. Claude Code, Codex, Gemini CLI, Cursor Agent, Cline, Aider.

Full writeup is the next post in the series unless I get distracted by something else.

What I shipped

Seven commits upstream. /sv is the headline; agentsview export is the second new feature; the rest is the plumbing /sv needed and the bugs I tripped on getting it there.

e565225 Add /sv Claude Code skill for handing tasks off to sandvault · jesserobbins · Apr 24, 2026 · +92 −0
2970e9a Add agentsview export from sandbox sessions · jesserobbins · May 3, 2026 · +1760 −0
1d62630 Add per-repo SSH deploy keys for cloned repositories · jesserobbins · Apr 22, 2026 · +113 −2
49ae69f Add --fix-permissions flag, umask detection, and permission hardening · jesserobbins · Mar 31, 2026 · +276 −16
66a1233 Fix SSH mode when Remote Login is set to "All users" · jesserobbins · Mar 31, 2026 · +20 −2
f1959b1 Fix git object permissions with --no-hardlinks · jesserobbins · Apr 21, 2026 · +1 −1
d983d62 Fix sandvault user not added to sandvault group · jesserobbins · Apr 20, 2026 · +5 −0