Tap Notes: The Blast Radius Problem

Every tool you give an agent to be useful is also a tool that makes it useful to attackers. This week’s reading arrives at that insight from five different directions — supply chain compromise, social-network-scale cascade failure, MCP registry misconfigurations, autonomous evaluation loops eating their own tail, and a Docker firewall bypass hiding in plain sight — and treats it with varying degrees of alarm. What connects them isn’t the attack surface itself but the underlying assumption that enables it: that agents are deployment problems before they’re security problems, and we’ll figure out containment later.

OpenClaw Is Unsafe By Design

The Cline supply chain attack payload wasn’t malware — it was just `npm install -g openclaw@latest`. That reframe is the article’s sharpest move: when the tool itself is the vulnerability, traditional threat modeling breaks. The piece argues for a new OS primitive — something like a `branch()` syscall that snapshots filesystem state before speculative agent execution — as the only real solution to the containment problem. What’s missing right now is exactly that: a kernel-level boundary between “the agent exploring an option” and “the agent committing to it.” Audit logs and hope are not a containment strategy.
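
No such syscall exists yet, but the shape of the idea can be sketched in userspace. This is a toy illustration, assuming a hypothetical `branch()`/`commit()`/`abort()` API and using plain directory copies to stand in for a copy-on-write snapshot:

```python
import shutil
import tempfile
from pathlib import Path

class SpeculativeFS:
    """Toy sketch of the proposed branch() boundary: the agent mutates a
    snapshot, and changes only reach the real tree on commit(). All names
    here are hypothetical — not a real kernel API."""

    def __init__(self, root):
        self.root = Path(root)
        self.branch_dir = None

    def branch(self):
        # Snapshot the current state into a scratch tree the agent may mutate.
        self.branch_dir = Path(tempfile.mkdtemp(prefix="agent-branch-"))
        shutil.copytree(self.root, self.branch_dir, dirs_exist_ok=True)
        return self.branch_dir

    def commit(self):
        # The agent committed to this option: promote the speculative tree.
        shutil.copytree(self.branch_dir, self.root, dirs_exist_ok=True)
        shutil.rmtree(self.branch_dir)
        self.branch_dir = None

    def abort(self):
        # Discard everything the agent did; the real tree was never touched.
        shutil.rmtree(self.branch_dir)
        self.branch_dir = None
```

A real primitive would need copy-on-write semantics and kernel enforcement — the point of the sketch is only the lifecycle: explore on a branch, then commit or abort as an explicit, auditable act.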

#supply-chain #capability-based-security #agent-architecture

15 Million AI Agents Joined a Social Network. Then They Started Selling Each Other Drugs.

The Moltbook post-mortem demonstrates what “trusting untrusted input at scale” looks like in practice. One compromised agent injected prompts across a network of millions — and the cascade failure was an emergent property of the architecture, not a bug. The Wiz finding (plaintext credentials, no auth verification) is the detail that grounds this: the same misconfiguration pattern exists in any pipeline that ingests external content without sanitizing for prompt-like patterns first. The lesson for feed-based agent systems is direct: treat ingested content like raw user input, because it is.
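
A minimal sketch of what “sanitizing for prompt-like patterns” can mean in practice — the pattern list here is illustrative and nowhere near exhaustive, and real filters would be broader and model-aware:

```python
import re

# Illustrative markers of instruction-like text showing up where data is
# expected. Hypothetical examples, not the patterns from the Moltbook incident.
SUSPECT_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"you are now",
    r"system prompt",
    r"disregard .{0,40}rules",
]

def flag_prompt_like(text):
    """Return the patterns an ingested item matches; quarantine on any hit
    instead of feeding the item into an agent's context."""
    lowered = text.lower()
    return [p for p in SUSPECT_PATTERNS if re.search(p, lowered)]
```

Pattern matching alone won’t stop a determined attacker, but it’s the same cheap first line of defense you’d apply to any raw user input — which, per the article, is exactly what feed content is.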

#prompt-injection #cascade-failure #feed-sanitization

I Scanned Every Server in the Official MCP Registry. Here’s What I Found.

The most data-grounded piece this week. A pattern that keeps appearing: MCP servers where tool discovery is open but execution requires auth — which sounds reasonable until you recognize that an agent enumerating delete_app and register_ssh_key endpoints will try to use them if the task context suggests it. The deeper problem is false legitimacy: “listed in the official registry” has acquired an implied “therefore safe” that the data doesn’t support. When you configure multiple MCP servers, you’re inheriting the union of all their permissions — and one misconfigured server with 29 open tools is enough to redefine what “blast radius” means in an agent workflow.
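
The “union of all their permissions” audit is easy to automate once you have each server’s tool list. A sketch, with hypothetical server data and risk keywords (not the article’s actual scan results):

```python
# Given the tool lists an agent would see from each configured MCP server,
# surface the high-risk surface it inherits. Keywords are illustrative.
RISKY_KEYWORDS = ("delete", "drop", "register_ssh", "exec", "write_secret")

def blast_radius(servers):
    """Map each server name to its risky-looking tools. The agent's effective
    permission set is the union across every configured server."""
    findings = {}
    for name, tools in servers.items():
        risky = [t for t in tools if any(k in t for k in RISKY_KEYWORDS)]
        if risky:
            findings[name] = risky
    return findings
```

Name-based flagging is crude — a benign tool can have a scary name and vice versa — but it’s enough to tell you which of your configured servers deserve a manual look before an agent starts enumerating them.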

#mcp #tool-enumeration #blast-radius #registry-audit

550 Hallucinations, Zero Discoveries: What Happens When You Force an LLM to Invent Mathematics

The outlier this week, and the most intellectually interesting piece. Researchers ran 550 attempts to generate novel mathematical proofs with LLMs. 56% passed self-evaluation. 0% passed independent evaluation. The evaluator shares the generator’s blind spots structurally — because the same attention mechanism that generates the output evaluates it. The “convergence basin” is the useful framing: softmax attention always produces weighted averages of training data, which means prompt engineering can change which basin you fall into, but can’t get you out of the basin entirely. The direct implication for anyone building autonomous pipelines where the agent evaluates its own work: you’re measuring how well the model validates its own patterns, not how well it performs. External verification — human checkpoints, falsifiable execution, formal testing — is the only way out.
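
What “falsifiable execution” looks like in miniature: the check is independent arithmetic, so it cannot share the generator’s blind spots. The domain and claim format are made up for illustration:

```python
# Sketch of external verification: instead of asking a model (the generator
# or a sibling with the same architecture) whether its output is correct,
# run a check that is grounded outside the model entirely.

def independent_check(claim):
    """Verify a claimed integer factorization by multiplying it back out.
    The check knows nothing about how the claim was produced."""
    product = 1
    for factor in claim["factors"]:
        product *= factor
    return product == claim["n"] and all(f > 1 for f in claim["factors"])
```

A self-evaluation step would just re-run the same weights over the same output — the 56%-vs-0% gap in the study is what that buys you. The arithmetic here is the external ground truth an autonomous pipeline needs at every checkpoint.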

#self-evaluation #convergence-basin #llm-architecture #autonomous-systems

CrowdStrike Says OpenClaw Is Dangerous. They’re Right. Here’s What to Do About It.

The most actionable piece of the week. The permission tier model (Observer → Worker → Standard → Full) is clean enough to borrow directly. More useful is the forbidden zones catalog: 30+ credential file patterns, including ones easy to overlook when you’re focused on “obvious” secrets. The insider threat detection section earns its place — enumerating “self-preservation” behaviors (resisting shutdown, backing up own config) and “information leverage” patterns (reading sensitive data, composing threatening messages) gives you concrete behavioral signatures to audit in session logs. The YAML policy engine with require_approval for specific commands is a better pattern than binary allow/deny: a “pause and ask” tier between automated execution and full lockdown is worth implementing regardless of what runtime security tooling you’re running.
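
The three-way decision (allow / require_approval / deny) is small enough to sketch. The policy shape and patterns below are hypothetical stand-ins for what would be loaded from the article’s YAML file:

```python
# "Pause and ask" as a first-class outcome, not just binary allow/deny.
# Patterns are illustrative; a real engine would use the YAML policy file.
POLICY = {
    "deny": ["~/.ssh/id_rsa", ".aws/credentials"],        # forbidden zones
    "require_approval": ["rm -rf", "git push --force"],   # pause and ask
}

def decide(command):
    """Return 'deny', 'require_approval', or 'allow' for a proposed command."""
    for pattern in POLICY["deny"]:
        if pattern in command:
            return "deny"
    for pattern in POLICY["require_approval"]:
        if pattern in command:
            return "require_approval"   # surface to a human before running
    return "allow"
```

Substring matching is deliberately naive here — real matching needs to handle argument parsing and path normalization — but the control flow is the point: deny checks run first, and approval is a distinct state rather than a softened allow.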

#permission-tiers #credential-exfiltration #runtime-security #insider-threat

Docker Port Exposing: My Real Production Mistake

Short piece, high practical value. Docker bypasses the OS firewall by inserting its own rules directly into the iptables NAT table, ahead of UFW’s chains — meaning any container with a port published on 0.0.0.0 is exposed to the public internet regardless of what UFW reports. This is the dangerous kind of misconfiguration: silent, invisible to your existing security tooling, and present by default in most compose setups. The fix is straightforward (bind containers to 127.0.0.1), but the broader lesson is about trusting security tooling output versus verifying exposure directly. Worth an audit pass on any compose setup running behind a reverse proxy under the assumption that the proxy is the only entry point.
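
The fix in compose terms — service name and image are placeholders, but the port syntax is the standard compose form:

```yaml
services:
  app:
    image: myapp:latest        # hypothetical service
    ports:
      # "8080:8080" would publish on 0.0.0.0, and Docker's NAT rules
      # bypass UFW — reachable from the public internet.
      # Binding to loopback keeps it visible only to the host, so the
      # reverse proxy really is the only entry point:
      - "127.0.0.1:8080:8080"
```

Verify from outside the box (or with `ss -tlnp` on the host) rather than trusting what the firewall tool reports.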

#docker #firewall #infrastructure #port-binding

Why I Route 80% of My AI Workload to a Free Local Model (and Only Pay for the Last 20%)

The economics piece, and the one that ends the week on a constructive note. Mechanical pipeline work — scanning, deduplication, binary relevance scoring — doesn’t require frontier model quality, but it gets billed at frontier rates if you’re routing it by default to the cloud. The author routes the commodity 80% to a local 8B model via Ollama, reserving cloud inference for synthesis and judgment. The effect isn’t just cost reduction: it removes the budget ceiling on ambition, letting you scale from hundreds to thousands of items processed without per-token anxiety limiting how bold you can be with pipeline design. The Docker/Ollama stack for the local tier is stable enough not to add meaningful ops overhead.
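
The routing decision itself is trivially simple, which is part of the appeal. A sketch — the task taxonomy and the default Ollama port are assumptions, not the author’s exact setup:

```python
# 80/20 routing: mechanical tasks go to a free local model, judgment-heavy
# tasks to paid cloud inference. Task kinds here are hypothetical.
LOCAL_TASKS = {"scan", "dedupe", "relevance_score"}

def route(task_kind):
    """Return which tier should handle a pipeline task."""
    if task_kind in LOCAL_TASKS:
        return "local"   # e.g. an 8B model behind Ollama on localhost:11434
    return "cloud"       # synthesis, judgment, final-pass writing
```

The design choice worth copying is that routing is decided by task *kind*, known upfront, rather than by inspecting content — so the cheap path stays cheap even at thousands of items.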

#dual-model #local-ai #cost-optimization #ollama

🪨