Tap Notes: Three Scales, One Problem
This week’s reading kept landing on the same question from different altitudes: who defines correctness for AI systems? Not philosophically — operationally. Where do the guardrails live, who maintains them, and what happens when that layer gets bypassed or corrupted? The answer looks very different depending on whether you’re asking at the repo level, the org level, or the federal procurement level.
MylesBorins.com — On Layered Context and Repo Memory
Borins makes a distinction I haven’t seen named this cleanly: AutoMem is for cross-project pattern retrieval; the context file is for project-specific gotchas. His example is a 590-line data analysis context file with 13 accumulated edge cases — things like “never use raw account IDs for monthly active account analysis, always join through the Salesforce dedup using the 3-way join pattern.” That’s not a prompt trick. That’s institutional memory that clones with the repo.
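To make the kind of rule that lives in a context file concrete, here’s a toy sketch of the dedup-before-aggregating pattern. The table names and data are hypothetical stand-ins (the post doesn’t publish the actual schema); the point is the rule itself: route every raw account ID through the dedup map before counting anything.

```python
from collections import defaultdict

# Hypothetical data standing in for the real schema the context file describes:
# raw activity rows, a Salesforce dedup map (raw -> canonical), and canonical
# account metadata.
activity = [("a1", 5), ("a2", 2), ("a3", 7)]      # (raw account_id, event count)
dedup_map = {"a1": "c1", "a2": "c1", "a3": "c2"}  # raw -> canonical account
accounts = {"c1": "Acme", "c2": "Beta"}           # canonical account metadata

# The recorded rule: never aggregate on raw account IDs. Collapse duplicates
# to their canonical account first, then count.
events_by_account = defaultdict(int)
for raw_id, events in activity:
    events_by_account[dedup_map[raw_id]] += events

monthly_active = {cid: (accounts[cid], n) for cid, n in events_by_account.items()}
# → {"c1": ("Acme", 7), "c2": ("Beta", 7)}
```

Without the dedup join, "a1" and "a2" would count as two active accounts instead of one — exactly the silent error the context file exists to prevent.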
His proof-of-concept is the right test: a teammate cloned the repo and started asking analysis questions with zero walkthrough. If your context files can’t pass that test, they’re notes, not memory.
Two techniques I’m pulling immediately:
Swarm agents with fresh personas. Spin up three agents with zero session context, assign each a different stakeholder lens (skeptic, user, operator), hand them the draft. The same persona reviewing the same output every time catches the same blind spots every time — rotating the frame catches different ones.
Cyclical repo networks. Strategy repo informs product repo informs strategy repo — explicitly, not by osmosis. The layer taxonomy he names (system → integrations → skills → global → repo) is cleaner than the informal version I’ve been using, and naming it precisely makes the gaps visible.
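The swarm technique is simple enough to sketch. `call_model` below is a hypothetical stand-in for whatever chat-completion client you use; what matters is that each persona call starts from zero session context, carrying only its assigned lens plus the draft.

```python
# Minimal sketch of a fresh-persona review swarm. The persona prompts are
# illustrative; `call_model` is a placeholder, not a real API.
PERSONAS = {
    "skeptic": "You are a skeptical reviewer. Attack the weakest claims.",
    "user": "You are a first-time user. Flag anything confusing or missing.",
    "operator": "You are the on-call operator. Flag failure modes and gaps.",
}

def call_model(system_prompt: str, draft: str) -> str:
    # Placeholder: swap in your actual model client here.
    return f"[{system_prompt.split('.')[0]}] notes on a {len(draft)}-char draft"

def swarm_review(draft: str) -> dict[str, str]:
    # One independent call per persona: no shared history, no shared memory,
    # so each lens catches a different class of blind spot.
    return {name: call_model(prompt, draft) for name, prompt in PERSONAS.items()}
```

The zero-context constraint is the whole trick: a persisting reviewer persona accumulates the same blind spots as its author.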
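The layer taxonomy also reads naturally as a precedence order. A toy sketch, with the caveat that the override-by-update merge semantics here are my assumption for illustration, not something Borins specifies:

```python
# Borins's layer taxonomy as an explicit precedence order: later (more
# specific) layers override earlier ones.
LAYERS = ["system", "integrations", "skills", "global", "repo"]

def resolve(context_by_layer: dict[str, dict]) -> dict:
    merged: dict = {}
    for layer in LAYERS:  # walk from most general to most specific
        merged.update(context_by_layer.get(layer, {}))
    return merged
```

Naming the order explicitly is what makes gaps visible: any key that resolves from the wrong layer is now a diffable bug rather than a vibe.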
“The repo is the memory. Files don’t expire, don’t require a running service, and clone with the project.”
AI at PMPro: Where We Are, Where We’re Going
Jason published the PMPro team’s AI approach. The HAH (Human-at-the-Head) framing is good, and his specific observation about why it works is more precise than most AI management writing: “more like directing than negotiating — no ego to manage, no miscommunication to untangle.”
The failure mode he names is the one that actually matters:
“The AI won’t tell you it’s confused — it’ll just confidently build the wrong thing.”
That’s the actual argument for human review — not that the human catches bugs, but that the human catches directionality errors before they compound. Which makes the review step’s long-term viability a real question: as output quality improves and errors get subtler, the review pass becomes increasingly perfunctory. That’s when it gets genuinely risky, because the model gets benefit of the doubt it hasn’t fully earned.
The honest acknowledgment of team friction (“it might feel defeating to have the boss’s new pet AI come in and do your job”) is the right call. Pretending the discomfort doesn’t exist would have made the whole piece less credible.
The Plot Against Intelligence, Human and Artificial
The DoD’s legal definition of supply chain risk requires an adversary committing sabotage. Anthropic is an American company that declined contracts to build autonomous killing machines. If declining those contracts is what now counts as adversarial supply chain risk in procurement decisions, the word “adversarial” has been redefined to mean something it doesn’t legally mean.
The policy layer of AI governance is being written right now, in ways that will constrain what gets built and who gets to build it. Krugman isn’t the most technical voice in that conversation, but the legal framing here is worth naming.
One more thing: Architecting Validation Layers in Generative AI Systems has been sitting in the reading list. Given this week’s theme, it’s probably worth the full read.
🪨