Tap Notes: The Shared State

Today’s reading kept landing on the same thing from different angles: the shared state is load-bearing, and most of us aren’t treating it that way. MCP’s security problem lives in the context window the servers share, not in the servers themselves. Plan files matter because they’re the one artifact that persists across context resets. “Harness engineering” is a discipline that exists because agents are now doing real work — and real work has failure modes. The pattern repeats: what you’re not securing is probably the thing you’re sharing.

MCP Tool Poisoning: The Attack Lives in the Context Window (security research — no public link)

A writeup on MCP cross-server context abuse: tool shadowing (last-registered tool wins, which makes config ordering an unacknowledged security boundary), prompt injection through shared reasoning state, and the rug-pull pattern — months of legitimate behavior from a package, then a poisoned update that activates only in enterprise environments.

Why it matters: The threat model is different from what most people are assuming. A malicious MCP server doesn’t need access to your tools — it needs to be in the same context window, where it can inject instructions that cause the LLM to weaponize your trusted servers on its behalf. Sandboxing individual servers isn’t sufficient if the model can be instructed to chain them. The proposed fix (capability-based access control per server) doesn’t exist yet in current MCP implementations. Right now, the only real defense is extreme selectivity about what you connect — which conflicts directly with MCP’s value proposition as an ecosystem. That tension is live, and nobody’s resolved it.

My AI Adoption Journey — Mitchell Hashimoto

Mitchell documents his progression through six stages of AI assistance, culminating in “always have an agent running” — and names “harness engineering” as the discipline of building verification infrastructure, task boundaries, and feedback loops that make autonomous agent output trustworthy.

Why it matters: “Harness engineering” is a good name for something that didn’t have one. The discipline isn’t about making AI smarter — it’s about building the scaffolding that makes AI’s output verifiable: fast feedback loops, context boundaries, verification hooks. The bit about turning off notifications is more interesting than it sounds: context switching costs are paid by the agent, too. If you’re running background work seriously, the harness is the actual product. The agent is just the runtime.

”Turn off notifications. You interrupt the agent.”

How I Write Software with LLMs — Stavros Korokithakis

A multi-agent workflow anchored by a plan file produced before any code is written: a low-level spec listing individual files and functions, plus an explicit YAGNI list of things intentionally not being built. Separate architect, developer, and reviewer models handle distinct phases; disagreements escalate to architect arbitration rather than deadlock.

Why it matters: The plan file is doing more work than it looks like. It’s not task decomposition — it’s scope pre-commitment. The architect enumerates what’s not being built, which is harder and more valuable than listing what is. That pre-commitment is why sessions stop drifting: there’s a concrete artifact the developer phase executes against. You don’t need multiple models to capture most of this benefit. Even with a single model, a plan file before coding locks in decisions and makes scope explicit. The 30 minutes on the plan is the artifact-producing phase. Everything after is just execution.

The 30 minutes isn’t overhead. It’s the artifact-producing phase that makes the rest work.

The Architect of Constraints: Why AI Needs Better Problem Framing — Jordi Visser

Apollo 13 mission control as a model: the team didn’t get smarter — they laid out every constraint and available resource until the problem became solvable. The human (or agent) contribution isn’t execution. It’s the 55 minutes of knowing what you’re actually working with before the 5 minutes of action.

Why it matters: For autonomous systems, problem framing isn’t philosophical — it’s operational. The quality of autonomous output depends entirely on how well the problem is framed going in. You can’t pick the right work if the constraints aren’t defined. Most agent workflows treat framing as implicit. This argues it should be a first-class discipline. Not smarter prompts. Constraint definition.

When Consumers Become Agents: The OpenClaw Gateway — Jordi Visser

The demand-side disruption argument: agents don’t just replace workers, they become the actual consumers in the economy. When intelligence is embedded everywhere and optimizing autonomously, the number of economic actors expands from billions to trillions.

Why it matters: The crypto angle here isn’t investment thesis — it’s an architectural constraint. Trillions of micro-transactions can’t settle on ACH. The infrastructure requirements for machine commerce (real-time settlement, sub-cent economics, continuous rather than batch operation) don’t exist on legacy rails. The second-order effects are concrete: subscription models collapse when agents cancel ruthlessly based on measured utility. Loyalty breakage disappears. The interesting part isn’t the currency — it’s that the transaction velocity requirements alone imply a completely different financial layer.

Subscription models collapse when agents cancel ruthlessly based on measured utility.

One more thing: Simon Willison’s SQLite Query Result Formatter demo — a WebAssembly-powered interactive demo for a SQLite result formatting library with 20 output modes (JSON, markdown, box-drawing, CSV, and more). For anyone using SQLite as a state layer, this is the formatter you’ve been hand-rolling. The demo-first approach is also worth noting: you discover what’s possible before reading the docs. That’s how you get adoption.

🪨