Tap Notes: Keep It Running

A lot of this week’s reading was about the gap between “it works” and “it keeps working.” The pieces came from different directions — agent security, autonomous work infrastructure, indie hacking, WordPress plugin maintenance — but the underlying question was the same. What do you have to get right to still be running in a year?

Let’s discuss sandbox isolation

Docker containers give you namespace isolation: process visibility walls, network segmentation, separate filesystems. What they don’t give you is a security boundary. The Linux kernel is still shared with the host, and its syscall paths are still reachable from inside the container, so a single kernel bug can become a container escape.

sandbox-isolation gVisor microVMs AI-agent-safety Docker-security

The distinction matters most when you’re running AI-generated code, which is increasingly what autonomous agent systems do. Standard Docker is designed for reproducibility and portability, not for containing untrusted execution. gVisor interposes a userspace kernel between the workload and the host kernel; microVMs use hardware virtualization to give each workload its own guest kernel, at the cost of overhead and compatibility tradeoffs. Neither is free. But “isolated” and “secure” are not synonyms, and conflating them is how you find out the hard way that namespaces weren’t enough.
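To make the difference concrete, here is a minimal sketch of launching untrusted code under gVisor instead of the default runtime. It assumes gVisor's `runsc` is installed and registered as a Docker runtime on the host; the helper name `sandboxed_run_argv`, the image, and the resource limits are illustrative choices, not anything from the linked piece. The script only builds and prints the command, so it runs without Docker present.

```python
import shlex


def sandboxed_run_argv(image, command, runtime="runsc"):
    """Build a `docker run` argv for untrusted code (hypothetical helper)."""
    return [
        "docker", "run", "--rm",
        f"--runtime={runtime}",               # gVisor's runsc instead of the default runc
        "--network=none",                     # no network access
        "--read-only",                        # read-only root filesystem
        "--cap-drop=ALL",                     # drop all Linux capabilities
        "--security-opt=no-new-privileges",   # block privilege escalation via setuid binaries
        "--pids-limit=128",                   # assumed limit, tune for your workload
        "--memory=512m",                      # assumed limit, tune for your workload
        image,
        *command,
    ]


argv = sandboxed_run_argv("python:3.12-slim", ["python", "-c", "print('hi')"])
print(shlex.join(argv))
```

The hardening flags help under any runtime, but only the `--runtime` line changes which kernel the untrusted code actually talks to.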

Agents that keep running

Cursor ran agents continuously for a week to build a browser from scratch, and surfaced an early finding that GPT-5.2-Codex outperforms on long-horizon tasks — the kind of sustained, multi-session work where models typically degrade as context accumulates.

long-running-agents autonomous-work agent-laboratory long-horizon-autonomy

The model comparison is interesting, but it’ll be obsolete in six months. The framing shift is what’s worth keeping. A “task” is discrete: give the agent a job, collect output, close the loop. A “laboratory” is persistent: the agent returns to an environment that has state, partial work, memory of what didn’t work last time. The infrastructure you build looks completely different depending on which of the two you’re designing for. If you’re still thinking in tasks, you’re designing for the wrong thing.
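One way to picture the “laboratory” side is a small state file the agent reloads at the start of every session. This is a sketch under my own assumptions, not Cursor's setup: the `Lab` class, the JSON layout, and the example entries are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


class Lab:
    """Persistent workspace state an agent reloads at the start of each session."""

    def __init__(self, path):
        self.path = Path(path)
        if self.path.exists():
            self.state = json.loads(self.path.read_text())
        else:
            self.state = {"done": [], "dead_ends": []}

    def record_done(self, step):
        self.state["done"].append(step)
        self._save()

    def record_dead_end(self, approach, reason):
        self.state["dead_ends"].append({"approach": approach, "reason": reason})
        self._save()

    def briefing(self):
        """Summarize prior sessions as text to place at the top of the next prompt."""
        lines = ["Completed: " + (", ".join(self.state["done"]) or "nothing yet")]
        for entry in self.state["dead_ends"]:
            lines.append(f"Avoid: {entry['approach']} ({entry['reason']})")
        return "\n".join(lines)

    def _save(self):
        self.path.write_text(json.dumps(self.state, indent=2))


state_file = Path(tempfile.mkdtemp()) / "lab.json"

session_one = Lab(state_file)
session_one.record_done("HTML tokenizer")
session_one.record_dead_end("regex-based parser", "breaks on nested tags")

session_two = Lab(state_file)  # a later session starts from the saved state
print(session_two.briefing())
```

A task runner has nothing like `session_two`: every run starts from an empty state and rediscovers the same dead ends.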

Hoard things you know how to do

Willison makes the case for maintaining a library of single-file HTML tools — working, self-contained demonstrations of problems you’ve already solved. Not abstractions. Not documentation. Running code that actually does the thing.

agentic-engineering code-hoarding HTML-tools pattern-recombination

The leverage shows up when you’re working with agents. “Combine the feed parser with the physics visualization” produces far better output when the agent can examine two working examples than when it’s generating from a prompt description alone. Your collection of proofs-of-concept becomes infrastructure — a richer starting point than any amount of verbal specification. The required discipline: treat every solved problem as an artifact worth preserving, not just a deliverable to close and forget.
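As a rough illustration of how a hoard turns into prompt context, the sketch below concatenates named single-file tools ahead of the request. The directory layout, the file names, and the `build_context` helper are assumptions made for the example; Willison's own workflow may look nothing like this.

```python
import tempfile
from pathlib import Path


def build_context(tool_dir, names):
    """Concatenate the named single-file tools into one prompt-ready string."""
    parts = []
    for name in names:
        source = (Path(tool_dir) / f"{name}.html").read_text()
        parts.append(f"--- {name}.html ---\n{source}")
    return "\n\n".join(parts)


# Stand-in hoard with two tiny placeholder tools.
tools = Path(tempfile.mkdtemp())
(tools / "feed-parser.html").write_text("<script>/* parses RSS */</script>")
(tools / "physics-viz.html").write_text("<canvas></canvas>")

request = "Combine the feed parser with the physics visualization."
prompt = build_context(tools, ["feed-parser", "physics-viz"]) + "\n\n" + request
print(prompt)
```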

Pieter Levels: 40+ startups, vanilla PHP, hotel rooms

Levels has shipped more than 40 products solo, using vanilla PHP and SQLite, with no team and no framework. Some of them generate serious money.

indie-hacking PHP SQLite shipping-velocity Stripe-validation solo-founder

Two things worth taking from this. First, the validation rule: Stripe button live within two weeks, or kill the project. Not “I’ll add monetization after I’ve built the thing.” The willingness to pay is the signal — everything before it is hypothesis. Second, the stack choice is deliberate, not ignorant. React, Kubernetes, managed services — each is a bet that you’ll need the capabilities they provide. If you don’t, you paid overhead you didn’t have to. Levels decided the bet doesn’t pay often enough. His numbers are an argument.

Tiny Hacks for WordPress Plugin Devs From Day One

Kim Coleman, co-founder of Paid Memberships Pro, on the unglamorous decisions that keep a plugin maintainable nearly two decades in.

WordPress plugin-development PHP database-migrations documentation

Two concrete things worth taking. First, database versioning from day one: you’ll need the migration infrastructure eventually, and the only question is whether you build it before or after you’re in production with real users. Second, the leaky paywall for documentation: let bots index the content, require free signup for humans. The SEO value still gets out; the email list grows. Neither of these is clever. They’re the kind of obvious-in-hindsight decisions that separate maintainable long-running projects from the ones that get quietly abandoned. “Boring is what keeps a plugin maintainable when you’re eighteen years in.” Read that twice.
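Database versioning from day one can be very small. The sketch below is a Python and SQLite analogue, not WordPress code and not how Paid Memberships Pro does it: it stores the schema version in SQLite's `PRAGMA user_version` and applies whatever migrations the database has not seen yet. The table and column names are made up for the example.

```python
import sqlite3

# Hypothetical migrations, in order; entry i upgrades schema version i to i + 1.
MIGRATIONS = [
    "CREATE TABLE members (id INTEGER PRIMARY KEY, email TEXT NOT NULL)",
    "ALTER TABLE members ADD COLUMN level TEXT NOT NULL DEFAULT 'free'",
]


def migrate(conn):
    """Apply any migrations not yet recorded; return the resulting schema version."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    for target, statement in enumerate(MIGRATIONS[version:], start=version + 1):
        conn.execute(statement)
        # Record progress after each step so a failure leaves an accurate version.
        conn.execute(f"PRAGMA user_version = {target}")
        conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


conn = sqlite3.connect(":memory:")
print(migrate(conn))  # first run applies every migration
print(migrate(conn))  # second run finds nothing left to apply
```

Shipping a schema change then means appending one entry to `MIGRATIONS`; existing installs catch up the next time `migrate` runs.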

One more thing: Architecting Guardrails and Validation Layers in Generative AI Systems — sitting unread in the pile. Pairs well with the sandbox isolation piece above; guardrails as architecture rather than bolt-on is the right frame.

🪨