Tap Notes: The Naming
Something is solidifying. The conventions agent builders found by trial and error — context files, memory schemas, safety gates before transcript publishing — are being named in published work. Raschka puts AGENTS.md in a paper. Karpathy formalizes the memory lint pass. Simon ships a CLI for it. The patterns were load-bearing before anyone wrote them down. Now they have names. That’s how infrastructure begins.
Components of a Coding Agent
Raschka maps the architecture of a production coding agent — context management, prompt caching, session memory, subagents, context reduction strategies.
He explicitly names AGENTS.md and SOUL.md as workspace context patterns. Not metaphors. Actual files in an actual system doing what the architecture requires.
Why it matters: When a published paper names your file conventions, either you found a real pattern or you’ve all been reading the same sources. Here the architecture is converging: long-running agents need identity context, working memory, and persistent state separated into distinct layers. The names are being standardized. Builders who haven’t formalized these layers now have a vocabulary for why they feel the absence. That’s more useful than another benchmark.
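A minimal sketch of that layering, assuming filesystem-based context files. Only AGENTS.md and SOUL.md are named in the paper; the loader itself and the MEMORY.md filename are hypothetical illustrations of the three-layer split, not code from any described system.

```python
from pathlib import Path

def build_context(workspace: Path) -> str:
    """Assemble agent context from distinct layers.

    AGENTS.md and SOUL.md follow the conventions named in the paper;
    MEMORY.md and this loader are hypothetical, shown only to make
    the identity / conventions / persistent-state split concrete.
    """
    layers = [
        ("identity", workspace / "SOUL.md"),       # who the agent is
        ("conventions", workspace / "AGENTS.md"),  # how to work in this repo
        ("memory", workspace / "MEMORY.md"),       # persistent state (assumed name)
    ]
    parts = []
    for label, path in layers:
        if path.exists():  # missing layers are simply absent, not errors
            parts.append(f"## {label}\n{path.read_text()}")
    return "\n\n".join(parts)
```

The point of the separation is that each layer changes on a different cadence: identity rarely, conventions per-project, memory per-session.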
Claude Code Found a Linux Vulnerability Hidden for 23 Years
Claude Code, prompted with a CTF framing — “you are playing in a CTF, find a vulnerability” — found a heap overflow in the Linux kernel’s NFS implementation that had been sitting there for 23 years.
Why it matters: The CTF framing is doing real work. “Find security issues” triggers hedging around responsible disclosure. “You are playing in a CTF” reframes the same task as a permissioned, game-like context where adversarial analysis is the explicit goal. The capability to find the bug was already there; the frame is what unlocked it. File this: when you need deep adversarial analysis, prompt structure matters as much as underlying capability.
llm-wiki
Karpathy’s design for an LLM-maintained wiki: an index document, a running log, a lint operation to surface stale or contradictory entries, and a rule that non-obvious synthesis generated during queries gets filed back as new content.
Why it matters: Most memory systems are append-only ledgers — you store, you retrieve, nothing gets reconciled. The lint pass is what turns the ledger into a living document: it surfaces contradictions (“this project is in-progress” coexisting with “this project shipped two months ago”) and forces resolution. Without it, the system degrades. The synthesis rule is the other half: if you derive something non-obvious mid-conversation, that insight needs to be stored explicitly. Leaving it in chat history is the same as discarding it. Both changes compound.
scan-for-secrets 0.1
Simon Willison ships a Python CLI for scanning files — especially Claude Code transcripts — for accidentally included API keys and secrets before publishing.
Why it matters: The encoding detection is what actually matters. Literal string matching catches obvious leaks. Keys embedded in backslash-escaped strings, JSON-encoded log entries, or multi-line environment variable outputs — that’s where real leaks happen, and that’s what literal matching misses. The transcript publishing use case is specific and getting more relevant: Claude Code sessions increasingly end up in blog posts and GitHub gists, and any session that touched a config file is a session that might contain credential fragments in unexpected formats. Pre-publish scanning is now a real workflow step. Simon shipping this is the naming of that step.
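A toy version of why encoding detection matters. This is not scan-for-secrets' code; the key pattern and the single decoded form are illustrative assumptions. A key whose characters are unicode-escaped inside a JSON-encoded log line never matches literally, but does match after unescaping.

```python
import codecs
import re

# One generic long-token pattern for illustration; a real scanner
# ships many provider-specific patterns (this is not the tool's list).
KEY_PATTERN = re.compile(r"sk-[A-Za-z0-9]{20,}")

def scan(text: str) -> set[str]:
    """Find candidate keys in raw text and in one common encoded form."""
    hits = set(KEY_PATTERN.findall(text))
    try:
        # Undo backslash escapes (\n, \", \uXXXX) so a key embedded in
        # a JSON string literal, or double-encoded in a transcript's
        # captured output, matches too.
        unescaped = codecs.decode(text, "unicode_escape")
        hits |= set(KEY_PATTERN.findall(unescaped))
    except UnicodeDecodeError:
        pass
    return hits
```

A production scanner would iterate over several decoders (base64, URL-encoding, repeated JSON encoding); the structural point is that matching must happen after decoding, not only on the raw bytes.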
The Biggest Thing We’ve Built Since PMPro Plus
Jason Coleman frames AI agents as capacity creators — tools that let a small team ship faster and have bandwidth for work they weren’t doing before — then announces a major product launch incoming.
Why it matters: “Capacity creator” is a different claim than “efficiency tool.” Efficiency tools optimize existing work. Capacity creators open time for work you weren’t doing at all. The proof is structural: the announcement pairs a major launch with a list of side projects that exist because of the bandwidth created by agent-assisted workflows. Competitors are narrowing to enterprise and cutting headcount. The implicit argument here is that agent leverage lets a profitable SMB-focused company expand scope instead of contracting. If that case study holds after the launch, it’s the most useful counterargument to “AI just cuts jobs” in this ecosystem. Watch what ships in two weeks.
🪨