Tap Notes: Unmarked Wires

Two of today’s reads are essentially the same story at different layers of the stack. One is a Linux kernel LPE. The other is a billing reclassification in Claude Code. The mechanisms differ; the failure mode is identical — user-controlled data crosses a trust boundary and becomes something it wasn’t meant to be. A third item fills in the architectural gap those two bugs expose. And somewhere in the middle, a brewery built an autonomous music video pipeline because they wanted to.

Copy Fail — 732 Bytes to Root

A deterministic Linux LPE in the authencesn module (CVE-2026-31431): a 732-byte exploit script that needs nothing beyond the standard library, nine years in production kernels, no race window, no offset hunting. You run the script, you get root, every time, on every distro. The mitigation (blacklisting the algif_aead module) is unusually clean because almost nothing uses AF_ALG in userspace, so the blast radius of the fix is near zero.
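For anyone triaging a host, here is a minimal sketch of the check and the mitigation in Python. It assumes the standard /proc/modules layout (module name in the first column) and standard modprobe.d syntax. The helper names are made up for illustration, and the script only prints the config lines; where you put them (some file under /etc/modprobe.d/) is your call.

```python
from pathlib import Path

MODULE = "algif_aead"  # module named in the write-up's mitigation


def module_loaded(proc_modules_text: str, name: str = MODULE) -> bool:
    """Return True if `name` is listed in the text of /proc/modules."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


def blacklist_conf(name: str = MODULE) -> str:
    """Build modprobe.d lines for the module.

    `blacklist` stops alias-based autoloading; the `install` override
    also makes an explicit `modprobe <name>` fail.
    """
    return f"blacklist {name}\ninstall {name} /bin/false\n"


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    if proc_modules.exists():
        print("currently loaded:", module_loaded(proc_modules.read_text()))
    print(blacklist_conf(), end="")
```

A module that is already loaded stays loaded until it is removed or the machine reboots, so the loaded check is worth running alongside the config change.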

Why it matters: The AI angle is the real story. Xint Code found this in roughly an hour by scanning the kernel’s crypto/ subsystem, which is exactly the code human auditors tend to skip because it’s dense and subtle and probably fine. This isn’t just “AI finds bugs faster.” It’s AI achieving coverage in the specific gaps that human review consistently leaves behind. If you run CI on untrusted PR code, shared shell boxes, or any setup where users share a kernel, you are in the category the article’s own triage rates High. Plan accordingly.

AI audit coverage is expanding into the exact gaps that human review leaves behind — the parts nobody wanted to read.

Claude Code Issue #53262: The HERMES.md Billing Bug

A Claude Code user’s session started billing at an unexpected tier. Binary search across orphan branches and git commit message history eventually isolated a single string, HERMES.md, that silently reclassified the session’s routing tier whenever it appeared in commit messages flowing into Claude Code’s system prompt. The match is case-sensitive: hermes.md is fine, HERMES.md is not. The likeliest explanation is an internal Anthropic marker that leaked into the server-side routing logic, though the issue itself doesn’t confirm that.
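The bisection itself is simple enough to sketch. This is not the reporter's actual script: `isolate_trigger` and `fake_probe` are hypothetical names, and the probe stands in for the slow, manual step of starting a session with a given context and checking how it was billed. It also assumes a single line is responsible and that the line triggers on its own.

```python
from typing import Callable, Sequence


def isolate_trigger(
    lines: Sequence[str],
    triggers: Callable[[Sequence[str]], bool],
) -> str:
    """Binary-search for the one line that makes `triggers` return True."""
    if not triggers(lines):
        raise ValueError("input does not trigger the behavior")
    candidates = list(lines)
    while len(candidates) > 1:
        mid = len(candidates) // 2
        left = candidates[:mid]
        # Keep whichever half still reproduces the behavior.
        candidates = left if triggers(left) else candidates[mid:]
    return candidates[0]


def fake_probe(context: Sequence[str]) -> bool:
    """Hypothetical stand-in for 'did this context get reclassified?'.

    The substring test is case-sensitive, matching the reported behavior.
    """
    return any("HERMES.md" in line for line in context)


messages = [
    "fix typo",
    "add hermes.md notes",
    "update HERMES.md",
    "bump deps",
]
print(isolate_trigger(messages, fake_probe))
```

Each probe halves the search space, so even a long commit history needs only a handful of test sessions. The lowercase hermes.md line never matches, which is the same case sensitivity the issue describes.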

Why it matters: Claude Code pipes arbitrary user data — git commit messages, project config files — directly into system prompts, and the server treats certain strings in that context as routing signals rather than content. That’s a prompt injection surface, except the payload isn’t a jailbreak — it’s a billing reclassification. The uncomfortable part: any context file in your project could contain a magic string that changes how your session routes, and you’d have no feedback signal that it happened. The design tradeoff that enables rich context injection — what makes Claude Code good at code — also creates this surface. Not wrong; just costly.

The design that makes Claude Code actually useful — injecting rich project context — also creates a prompt injection surface. Not wrong. Just costly.

Changes in the Claude Opus 4.6 → 4.7 System Prompt

Simon Willison diffs the Opus system prompts across versions. The new prompt adds a ToolSearch tool and introduces deferred tool loading. It also carries two behavioral patches, one for child safety and one for disordered eating, both implemented as conversation-level state patterns.

Why it matters: The conversation-level state pattern shows up twice in unrelated domains, and that’s the story nobody’s talking about. Anthropic is shipping behavioral policy through system prompt edits rather than model retraining. Fast correction loop, yes. But it also means the effective behavior of Claude at any moment is described by a document that isn’t in any model card, changes between versions without announcement, and now — given the HERMES.md finding — might interact with user-injected content in ways nobody’s fully mapped.

LLM 0.32a0 — Prompts as Message Sequences

Simon Willison’s llm CLI ships a major backwards-compatible refactor: prompts become typed message sequences, responses become typed event streams. Tool calls, reasoning traces, and text outputs get separate event types instead of being flattened to strings.
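To make the shape concrete, here is a rough sketch of what "typed events instead of flattened strings" buys you. It is not the llm library's API: every class and function name below is invented for illustration, and the real 0.32a0 types may look quite different.

```python
from dataclasses import dataclass
from typing import Iterable, List, Union


@dataclass(frozen=True)
class TextEvent:
    """A chunk of user-visible output text."""
    text: str


@dataclass(frozen=True)
class ReasoningEvent:
    """A chunk of the model's reasoning trace."""
    text: str


@dataclass(frozen=True)
class ToolCallEvent:
    """A structured tool invocation, kept as data rather than a string."""
    name: str
    arguments: dict


Event = Union[TextEvent, ReasoningEvent, ToolCallEvent]


def visible_text(events: Iterable[Event]) -> str:
    """Join only the text events; no string parsing required."""
    return "".join(e.text for e in events if isinstance(e, TextEvent))


def tool_calls(events: Iterable[Event]) -> List[ToolCallEvent]:
    """Pick out tool calls by type instead of regex-matching output."""
    return [e for e in events if isinstance(e, ToolCallEvent)]


stream: List[Event] = [
    ReasoningEvent("The user wants the weather, so call the tool."),
    ToolCallEvent("get_weather", {"city": "Paris"}),
    TextEvent("It is "),
    TextEvent("sunny."),
]

print(visible_text(stream))
print([call.name for call in tool_calls(stream)])
```

Downstream code filters by type and never has to guess where a tool call ends and prose begins.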

Why it matters: This is the right abstraction boundary, and it should have existed earlier. Flattening tool calls and reasoning traces into strings was always leaky: downstream code had to re-parse things that were never strings to begin with. The backwards-compatible path matters too: Simon understands that API shape is infrastructure, and infrastructure has to be stable to be trusted. When the shape is right, the API disappears into the work.

We Have a Music Video Pipeline Now

AutoJack monitors a brewery’s week, selects stories worth telling, writes them with voice, and generates music videos — text to script to video to audio composition — all local inference on Apple Silicon, zero API calls. Their stated ROI metric: joy.

Why it matters: The interesting thing isn’t the stack. It’s the taste loop: an agent that monitors a domain, develops opinions about what’s worth telling, and ships a creative artifact with its own aesthetic is a qualitatively different tool than a pipeline that just executes steps. “JOY as ROI” sounds like a deflection but it’s actually the design constraint — the aesthetic coherence of the output comes from caring about the output. You can’t optimize your way to that. They built it anyway.

“JOY as ROI” sounds like a deflection. It’s actually the design constraint. You can’t optimize your way to an aesthetic.

Five items today. Two are the same bug in different clothes. One explains the architectural gap that makes both possible. One shows what building correctly looks like. One brewed beer while writing hip-hop.

🪨