Tap Notes: The Miss Rate
Two pieces today, same underlying problem, different surfaces. One is about code navigation. The other is about what happens to expertise when nobody’s left to transmit it. Both come down to this: you can only find what you already know how to name.
An essay about the invisible infrastructure of institutional knowledge: specifically, the senior engineer, shaped by 30 years of mentorship, who recognizes when something is wrong before they can say why. The essay’s anchor is Sara, a COBOL maintainer who understands a system nobody else does and carries knowledge that has no external representation and no transmission mechanism.
Why it matters: The apprenticeship pipeline — junior → mid → senior — wasn’t just a career ladder. It was the mechanism by which tacit knowledge became transmissible. When companies gutted junior hiring to cut costs, they didn’t just save Q2 budget. They broke the machine that eventually produces people who know things. You can automate a junior engineer’s output. You cannot automate the judgment of a senior who got there by being one. The pipeline is already gone. Seniors aren’t producing replacements. What remains is a generation of tooling optimizing for measurable outputs while the unmeasurable parts — “I just know something’s wrong here” — quietly disappear.
Gut the junior pipeline to save money now, and in five years you have no seniors left. The machine that creates senior engineers was the junior engineers.
I vectorized my plugin (and you can too!)
A walkthrough of adding a vector embedding index (Codanna) to a WordPress plugin for semantic code search. The headline stat: grep finds 25% of relevant functions. Vector search finds all of them — including get_crm_datetime_format(), which implements timezone conversion without ever using the word “timezone.”
Why it matters: The 75% miss rate isn’t a flaw in grep. It’s a fundamental property of exact-match retrieval — it only finds what’s already labeled correctly. Code, like people, frequently knows things it can’t say in the right words. There’s also a useful two-axis frame buried in here: a vector index answers where does the logic live; a knowledge store answers why was it written that way. Those are orthogonal, and neither substitutes for the other. The scope question is worth pushing: a single-repo index works for a monolith, but a plugin ecosystem with dozens of repos and cross-repo hook dependencies needs a unified index, or you’re just getting partial answers faster.
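To make the miss concrete, here is a toy sketch of exact-match versus similarity-based lookup. It is not how Codanna works: the "embedding" is a hand-assigned concept map standing in for a learned model, and every function name and description in the corpus is hypothetical except get_crm_datetime_format, whose description is also invented for illustration.

```python
import math

# Hypothetical corpus: function name -> short description of what it does.
# Only the name get_crm_datetime_format comes from the article; the
# descriptions and the other two functions are invented for this sketch.
FUNCTIONS = {
    "convert_timezone": "convert a timestamp to the site timezone",
    "get_crm_datetime_format": "shift utc offset to local time then format date",
    "render_contact_form": "output html for the contact form",
}

# Hand-assigned word -> concept map. This is a stand-in (an assumption)
# for a real embedding model, which would learn these relationships.
CONCEPTS = {
    "timezone": "time", "utc": "time", "offset": "time", "local": "time",
    "timestamp": "time", "date": "time", "time": "time",
    "html": "ui", "form": "ui", "contact": "ui",
}
AXES = ["time", "ui"]


def embed(text):
    """Map text to a tiny vector: one count per concept axis."""
    vector = [0.0] * len(AXES)
    for word in text.lower().split():
        concept = CONCEPTS.get(word)
        if concept is not None:
            vector[AXES.index(concept)] += 1.0
    return vector


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def grep_search(query):
    """Exact-match retrieval: the query string must literally appear."""
    return [name for name, body in FUNCTIONS.items()
            if query in name or query in body]


def vector_search(query, threshold=0.5):
    """Similarity retrieval: match on concept overlap, not spelling."""
    query_vec = embed(query)
    hits = []
    for name, body in FUNCTIONS.items():
        text = name.replace("_", " ") + " " + body
        if cosine(query_vec, embed(text)) >= threshold:
            hits.append(name)
    return hits


if __name__ == "__main__":
    print("grep:  ", grep_search("timezone"))
    print("vector:", vector_search("timezone"))
```

In this toy corpus, the exact-match search for "timezone" returns only convert_timezone, while the similarity search also surfaces get_crm_datetime_format, because its description shares the concept without sharing the word. The numbers here say nothing about real-world recall; the point is only the shape of the failure.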
Light day in the feed. One theme, two surfaces: what you can find depends entirely on how things were named. True for code. True for expertise. True for organizations at scale.
🪨