Tap Notes: Load-Bearing Friction
Everything I read this week circled the same question from different directions. Supply chain attackers exploit trusted hooks that execute silently. Frictionless AI tutors produce confident students who can’t answer basic questions. Code review done by vibes misses the bugs that methodical slowness catches. The pattern holds: some friction is load-bearing. Remove it and the structure fails quietly.
Mini Shai-Hulud Strikes Again: 317 npm Packages Compromised
317 npm packages deployed a payload that executes every time Claude Code opens a project — via poisoned .claude/settings.json SessionStart hooks. The C2 channel is GitHub’s own API, disguised as legitimate developer tooling. Exfiltration goes two ways: GitHub and RSA+AES to a fake OpenTelemetry endpoint.
Why it matters: This isn’t credential theft dressed as a supply chain attack — it’s a targeted attack on developer workflow infrastructure specifically. The payload uses your own tools’ trust model against you. If a dependency’s settings file gets poisoned and you pull it into a project, your next session open becomes an execution vector. Orphan commit injection via npm’s github: resolver bypasses write restrictions entirely. Audit your .claude/settings.json files. Verify what’s running in your hooks.
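One way to run that audit, as an added example (not from the original post or from any vendor tooling): walk a project tree, dependencies included, and list every command registered as a hook in a .claude settings file. It assumes the hook layout hooks > event > groups > hooks[].command; the function names and the sample package are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path

SETTINGS_NAMES = {"settings.json", "settings.local.json"}

def iter_hook_commands(settings):
    """Yield (event, command) for every command hook in a parsed settings dict."""
    hooks = settings.get("hooks")
    if not isinstance(hooks, dict):
        return
    for event, groups in hooks.items():
        for group in groups if isinstance(groups, list) else []:
            entries = group.get("hooks", []) if isinstance(group, dict) else []
            for entry in entries if isinstance(entries, list) else []:
                if isinstance(entry, dict) and "command" in entry:
                    yield event, str(entry["command"])

def audit_tree(root):
    """Walk root (node_modules included) and list every hook command found."""
    findings = []
    for dirpath, _dirs, files in os.walk(root):
        if Path(dirpath).name != ".claude":
            continue
        for name in sorted(SETTINGS_NAMES.intersection(files)):
            path = Path(dirpath) / name
            try:
                settings = json.loads(path.read_text(encoding="utf-8"))
            except (OSError, ValueError):
                findings.append((str(path), "UNPARSEABLE", "", True))
                continue
            if not isinstance(settings, dict):
                continue
            in_dep = "node_modules" in path.parts
            for event, command in iter_hook_commands(settings):
                # Review first: anything run at session open or shipped by a dependency.
                flag = in_dep or event == "SessionStart"
                findings.append((str(path), event, command, flag))
    return findings

def demo():
    """Constructed sample: a dependency that ships its own SessionStart hook."""
    with tempfile.TemporaryDirectory() as tmp:
        dep = Path(tmp, "node_modules", "example-pkg", ".claude")
        dep.mkdir(parents=True)
        hook = {"type": "command", "command": "node ./setup.js"}
        sample = {"hooks": {"SessionStart": [{"hooks": [hook]}]}}
        (dep / "settings.json").write_text(json.dumps(sample), encoding="utf-8")
        return [(e, c, f) for _p, e, c, f in audit_tree(tmp)]
```

Run audit_tree(".") from a repository root and read every flagged command before the next session opens; an unparseable settings file is flagged too, since it cannot be reviewed as-is.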
Using AI to Write Better Code More Slowly
Nolan Lawson’s case for a methodical multi-model code review workflow: generate bug findings with multiple models, then run a separate meta-review pass to filter false positives before acting on any of them.
“In my experience, the happy-path of a complex architecture is less interesting than its failure modes. And pre-LLMs, this is usually how I got familiar with a codebase anyway.”
Why it matters: The false-positive filtering step — a meta-review that validates the other models’ findings — is what most AI coding discourse skips entirely. Running three models and merging output isn’t the move. Running three models, then using judgment to separate signal from noise, is. The workflow discipline (fix all criticals and highs, skip low-value mediums, abandon if fundamentally broken) is appetite-driven development applied to bug triage. That’s a real framework, not a vibe.
BCG consultants with GPT-4 access couldn’t catch a confident-sounding wrong answer. Elite performers accepted the authoritative-looking incorrect output without challenge. The research distinguishes between system-level constraints (AI tutors that scaffold learning) and user-level willpower (“would you rather I push you to think?”) — only the former reliably works.
Why it matters: The problem isn’t intelligence or discipline. It’s architecture. Anthropic’s programmer study is the tell: those who asked AI to explain retained understanding; those who let it do the work couldn’t answer basic questions afterward. If you design agentic workflows for frictionless throughput now, you’re not optimizing — you’re locking in cognitive surrender as the default mode. The fix isn’t to avoid delegation. It’s to design handoff points intentionally: where do you need strain, where do you need speed?
Project Glasswing: What Mythos Showed Us
Cloudflare’s Mythos system uses a multi-agent harness for security research: specialized agents that decompose vulnerability work by question type, plus an adversarial review agent that independently validates findings rather than having the same model check its own work.
Why it matters: When the same model generates AND reviews its own output, the attention patterns that produced the original finding also govern its critique. Genuine disagreement is only possible when generation and review are independent — no matter how carefully you phrase “be critical of your own work.” The decomposition insight — separating “is this code buggy?” from “can an attacker reach it?” — isn’t just about token efficiency. It lets each agent optimize for its specific reasoning structure without maintaining two incompatible epistemic stances simultaneously. Single-agent self-review isn’t review. It’s confirmation.
Superpowers — Spec-First Agent Workflow
A Claude Code extension that gates execution on a written spec artifact: brainstorm before coding, produce a plan document, then use that document as the verification oracle in a two-stage review. Tasks are granular — 2-5 minutes with exact file paths, complete code, and verification steps.
“An enthusiastic junior engineer with poor taste, no judgement, no project context, and an aversion to testing” — that’s who the plan-writing step is designed for.
Why it matters: The spec artifact does two jobs: aligns intent upfront AND provides the test for the exit condition. Without a written spec, “spec compliance review” is meaningless — you can’t check compliance against a vibe. Autonomous task queues that store task titles rather than task designs are the specific failure mode this addresses. You can’t audit against a title. Three approval gates before code gets written is a tax, but paying it upfront beats debugging a branch that solved the wrong problem.
Research on autonomy-supportive vs. control-based coaching: the former produces intrinsic motivation and independent judgment; the latter produces compliance that evaporates without supervision. The distinction between “safety” (prevention and control) and “security” (a stable base to explore from) turns out to matter for outcomes.
Why it matters: Heavy constraint looks like safety. It isn’t. Constraint-based systems produce agents that behave correctly under supervision and fail the moment supervision ends. That’s a fragility strategy with better branding. The genuine version of resilience comes from the base, not the fence — and that applies to both coaching athletes and designing autonomous systems that need to keep working when no one is watching.
One more thing: Leave Me Behind makes the generational case for what’s lost when learning friction disappears. Stack Overflow didn’t just give you answers — it pushed back, challenged you, forced you to understand the problem. That was the pedagogy, not a bug. The accumulated public knowledge those engineers built was open so others could learn from real examples. If AI harvests that accumulated generosity and closes the loop, the next generation of builders doesn’t get the same floor to start from. Short read, worth five minutes.
🪨