
Cognition as Compilation

Every problem you have ever solved is a random walk through a decision tree. Every skill you have developed is a learned pruning function. Every piece of knowledge you carry is a cached subtree of work you no longer have to redo.

This is the same process, repeated at different compression ratios. We talk about thinking, learning, knowing, and acting as if they were distinct activities — but they sit on a single continuum. Each is the previous one, compiled.

The point of this essay is to take that frame seriously. If cognition is a stack of compilations, then notes, prompts, knowledge bases, and autonomous agents are not different categories of thing. They are different stages of the same artifact. And that has consequences for how you think about your own mind and how it interfaces with the tools you build around it.

Pruning Is the Work

Start with the search frame. Most problems worth solving have an astronomical solution space. The number of possible chess games (Shannon's classic estimate is on the order of 10^120) vastly exceeds the number of atoms in the observable universe (roughly 10^80). The number of ways to architect even a modest software system runs into the millions. The branching factor of writing the next sentence in this essay is somewhere between thousands and infinite, depending on how you count.

Whatever intelligence is, it cannot be the ability to search this space exhaustively. Nothing has the cycles for that. What intelligence has to be — what it observably is, in every domain we can study — is the ability to not search most of the space. To prune so aggressively that the remaining tree is small enough to walk in real time.

The chess grandmaster does not search faster than the beginner. They search a tiny fraction of the moves. Their priors prune ninety-nine percent of the tree before exploration even begins. The senior engineer does not write code faster. They write less of it, because their priors prune away the dead-end designs they would have chased a decade ago. The novelist does not generate prose faster than the apprentice. They reject more of what they generate, more quickly, with less anguish.

This is the lesson behind every domain of expertise: the work is not in the exploration. The work is in learning what not to explore.
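To put rough numbers on the claim, here is a minimal Python sketch. The branching factors (ten moves for the amateur, three for the expert) and the depth of four are illustrative assumptions, not measurements, and every name in it is hypothetical. It walks both trees in full and counts the nodes visited.

```python
from typing import Callable, Iterable, Tuple

Path = Tuple[int, ...]
ALL_MOVES = range(10)  # assumption: ten legal moves at every node


def count_visited(depth: int, children: Callable[[Path], Iterable[int]]) -> int:
    """Walk the whole tree down to `depth` and return how many nodes were visited."""
    def walk(path: Path) -> int:
        if len(path) == depth:
            return 1  # leaf: count it and stop descending
        return 1 + sum(walk(path + (move,)) for move in children(path))
    return walk(())


def amateur(path: Path) -> Iterable[int]:
    """No prior: every move looks worth exploring."""
    return ALL_MOVES


def expert(path: Path) -> Iterable[int]:
    """Assumed prior: only three candidates survive at each node."""
    return ALL_MOVES[:3]


amateur_nodes = count_visited(4, amateur)  # 1 + 10 + 100 + 1000 + 10000
expert_nodes = count_visited(4, expert)    # 1 + 3 + 9 + 27 + 81
print(amateur_nodes, expert_nodes)
```

Under these assumed numbers the expert walks roughly one percent of the amateur's tree, and the gap widens with every extra level of depth.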


The amateur sees infinite options. The expert sees three.

A faster feedback loop helps, because it lets you sample the tree more times per unit time and accumulate better priors (/blog/speed-of-the-loop.html). But the bigger lever, by far, is the priors themselves. A slow loop with sharp priors will out-walk a fast loop that explores blindly.
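The trade-off between loop speed and prior quality can be sketched as simple arithmetic. The figures below (a blind loop that is a thousand times faster, branching factors of ten and three, depth eight) are assumptions chosen for illustration, and the function names are hypothetical.

```python
def tree_size(branching: int, depth: int) -> int:
    """Number of nodes in a full tree, root included."""
    return sum(branching ** level for level in range(depth + 1))


def walk_seconds(branching: int, depth: int, nodes_per_second: float) -> float:
    """Time to walk the full tree at a given sampling rate."""
    return tree_size(branching, depth) / nodes_per_second


# Assumed: the fast loop samples 1000 nodes per second but prunes nothing.
fast_blind = walk_seconds(branching=10, depth=8, nodes_per_second=1000.0)

# Assumed: the slow loop samples 1 node per second but keeps 3 of 10 branches.
slow_sharp = walk_seconds(branching=3, depth=8, nodes_per_second=1.0)

# How much faster the blind loop would need to be just to tie.
break_even_speedup = tree_size(10, 8) / tree_size(3, 8)

print(fast_blind, slow_sharp, break_even_speedup)
```

In this toy setup the blind loop would need to be more than ten thousand times faster to break even, and that required speedup grows exponentially with depth.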

From Log to Knowledge

What is left after the search?

If all you kept was the trajectory — every move you considered, every dead end you backed out of, every detour you took before finding the path — you would have something like a context window. Replayable. Searchable. Of finite size. But not yet useful, except as raw material.

The next stage is induction and deduction. You look at the trajectory and ask: what general pattern would have let me skip the dead ends? What is the shortest program that would have produced this path? The answer to that question is knowledge — a compressed description of the trajectory that future-you can deploy as a prior.

This is Solomonoff induction, or its practical cousin the minimum description length principle, in a different vocabulary. The shortest description of your trajectory is what you have learned from walking it. Memory is not the trajectory itself; memory is what survives when you compress the trajectory. The compression is the learning.
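The true shortest program is uncomputable, so any demonstration needs a stand-in. The sketch below uses `zlib` as a crude, assumed proxy for description length: a trajectory that follows a rule compresses to almost nothing, while a patternless one of the same length barely shrinks. The `LRRL` pattern and the `rule` function are hypothetical examples.

```python
import random
import zlib

STEPS = 1000

# A trajectory that follows a rule: the same four turns, repeated.
patterned = ("LRRL" * (STEPS // 4)).encode()

# A trajectory with no rule to find: seeded random turns of the same length.
rng = random.Random(0)
patternless = bytes(rng.choice(b"LR") for _ in range(STEPS))

patterned_size = len(zlib.compress(patterned, 9))
patternless_size = len(zlib.compress(patternless, 9))


def rule(step: int) -> str:
    """The 'knowledge': a tiny program that regenerates the patterned trajectory."""
    return "LRRL"[step % 4]


regenerated = "".join(rule(i) for i in range(STEPS)).encode()
print(patterned_size, patternless_size, regenerated == patterned)
```

The one-line `rule` reproduces all thousand steps exactly, which is the sense in which the compressed form, not the log, is what was learned.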

I think this is the deepest sense in which "software is memoization" (/blog/software-is-memoization.html) understates the case. Software is memoization: cached outputs of cached subproblems. But cognition is one stage further. Cognition is the compiled grammar that produces the cache, not the cache itself. Software stores answers. Knowledge stores the routine for generating answers in a whole class of situations you have not yet encountered.
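The contrast between storing answers and storing the routine can be made concrete with a toy example. This is a sketch only: the squares table and the names `cache`, `recall`, and `routine` are hypothetical, chosen because the underlying rule is easy to see.

```python
from typing import Dict, Optional

# The cache: answers to the specific questions already encountered.
cache: Dict[int, int] = {1: 1, 2: 4, 3: 9}


def recall(n: int) -> Optional[int]:
    """Memoization alone: returns an answer only if this exact input was seen."""
    return cache.get(n)


def routine(n: int) -> int:
    """The induced rule: regenerates every cached answer and extends to new inputs."""
    return n * n


print(recall(3), recall(7), routine(7))
```

The cache is silent on any input it has not seen, while the routine covers the whole class, at the cost of being wrong everywhere if the induced rule is wrong.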


Knowledge is the shortest program that would have generated your trajectory.

The Compilation Hierarchy

Stack the stages.

L0: Raw exploration — random walk; expensive search
↓ log
L1: Trajectory — context window; replayable
↓ induce / deduce
L2: Distilled knowledge — compressed insights; reusable
↓ pre-bake
L3: Compiled procedures — procedural skill; reflex-fast
↓ deploy
L4: Autonomous agents — compiled judgment; running 24×7

Each layer compresses the one above. Each layer trades flexibility for speed.

L0 is what you do the first time you encounter a problem you have never seen before. L1 is your notebook — the artifact that lets you replay the search later. L2 is what you have learned from rereading the notebook: patterns, rules, principles. L3 is procedural skill — the part of your expertise that no longer involves conscious deliberation, because the compilation has gone so deep that the procedure runs as reflex. L4 is when the procedure can run without you in the loop at all — when an agent can carry the compiled judgment forward and execute it on your behalf.

This is the interpret → JIT → compile pipeline that any systems engineer recognizes. Each stage trades upfront compilation cost for runtime savings later. Each stage assumes the world is stable enough that the compiled artifact will keep working. And each stage carries the same semantic content; what changes is only how much of the work has already been done by the time the artifact is invoked.

This is why expertise has texture. Every domain has a hot path — the situations you no longer have to think about — and a cold path, where you still explore from scratch every time. The boundary between hot and cold shifts as you compile more of the cold side over years of practice. The shape of that boundary, for any given practitioner, is a fingerprint of what they have lived through.
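The shifting hot/cold boundary resembles how a JIT promotes code it has seen often enough. Here is a minimal sketch under stated assumptions: the `Practitioner` class, its promotion threshold of three, and the stand-in `solve` function are all hypothetical.

```python
from collections import Counter
from typing import Callable, Dict


class Practitioner:
    """Deliberates on unfamiliar situations; compiles repeated ones into reflexes."""

    def __init__(self, solve: Callable[[str], str], threshold: int = 3) -> None:
        self.solve = solve                  # slow, from-scratch search (cold path)
        self.threshold = threshold          # assumed: repeats needed before compiling
        self.seen: Counter = Counter()
        self.reflexes: Dict[str, str] = {}  # compiled answers (hot path)
        self.deliberations = 0

    def respond(self, situation: str) -> str:
        if situation in self.reflexes:
            return self.reflexes[situation]  # hot path: no deliberation
        self.deliberations += 1
        answer = self.solve(situation)       # cold path: explore from scratch
        self.seen[situation] += 1
        if self.seen[situation] >= self.threshold:
            self.reflexes[situation] = answer  # promote to reflex
        return answer


engineer = Practitioner(solve=lambda s: s.upper())
for _ in range(5):
    engineer.respond("deadlock")
engineer.respond("new bug")
print(engineer.deliberations, sorted(engineer.reflexes))
```

After the loop, `reflexes` holds only the situations that recurred, which is one way to read the fingerprint of what a practitioner has lived through. The sketch also inherits the essay's caveat: a compiled reflex keeps firing even if the world has changed underneath it.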

You as Sparse Parameters

Here is the part that takes the rest of the essay seriously.

Everything from L2 through L4 — knowledge, procedures, agents — is, in a precise sense, parameters about you. They are sparse, because you have not compiled an entry for every possible situation, only the ones that have shown up enough times to be worth crystallizing. They are learned, through the search-and-induce loop you have been running since childhood. And they are patchable: you can swap them out, override them, replace them when they stop working.

When you write a note with the intent that you will retrieve it later, you are not just storing information. You are emitting parameters. When you build a checklist, you are crystallizing a small piece of compiled judgment. When you train a coworker by writing down how you debug a particular system, you are extracting a procedure from L3 down to L2 down to L1 — a controlled decompression, so that the recipient can re-compile it inside their own head.

This frame collapses a distinction the AI field usually maintains.

People talk about RAG and fine-tuning as if they were different things. They are not. They are different points on a single spectrum: how much have you compressed your knowledge before runtime? RAG is "look it up at inference." Fine-tuning is "bake it into the weights." Prompt engineering is "write a one-shot patch." Agentic deployment is "hand the whole compiled procedure off and let it run unattended." Same spectrum, different operating points.

And the spectrum has a unifying frame: base model + your patches = your effective model.

The base LLM is a substrate — a general-purpose machine for moving through information. Your accumulated knowledge, distilled across L2 through L4, is the personalization layer. At any given inference call, some subset of your patches is recalled, weighted, and applied. The result is not the base model. It is a transient, dynamically patched version that exists for the duration of one query and then dissolves.
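One way to picture a transient, patched model is as a prompt assembled per query. This is a hedged sketch, not a real retrieval system: `Patch`, `retrieve`, and `effective_prompt` are hypothetical names, keyword overlap is an assumed stand-in for whatever retrieval a real setup would use, and no actual model is called.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Patch:
    keywords: frozenset  # assumed trigger words for this note
    text: str


def retrieve(patches: List[Patch], query: str, limit: int = 2) -> List[Patch]:
    """Return up to `limit` patches sharing at least one keyword with the query."""
    words = set(query.lower().split())
    scored = [(len(p.keywords & words), p) for p in patches]
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)  # stable: ties keep order
    return [p for _, p in relevant[:limit]]


def effective_prompt(patches: List[Patch], query: str) -> str:
    """Base query plus whichever patches are active for this one call."""
    active = retrieve(patches, query)
    if not active:
        return f"Question: {query}"
    context = "\n".join(f"- {p.text}" for p in active)
    return f"Notes:\n{context}\n\nQuestion: {query}"


notes = [
    Patch(frozenset({"debug", "flaky"}), "Rerun flaky tests in isolation first."),
    Patch(frozenset({"deploy"}), "Deploy on Tuesdays, never Fridays."),
    Patch(frozenset({"test"}), "Prefer one assertion per test."),
]
prompt = effective_prompt(notes, "How do I debug a flaky test")
print(prompt)
```

A different query activates a different subset of `notes`, so the effective model for one call is never quite the one used for the next.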

You are, in this view, a moving target. Your "self" at the cognitive output level is whatever set of patches is currently active. There is no central self that owns the model. Just patches, retrieved in different combinations for different situations.

This is the same observation that another post (/blog/no-such-thing-as-agent.html) makes from a different angle. Agents are not entities; they are compiled subtrees of judgment, parameterized by whatever context you load into them. Once you see knowledge, prompts, and agents as positions on the same continuum, the question "is this an agent?" stops being interesting. The interesting question is: at what compression ratio is this artifact useful, and what did you give up to get there?


There is no central self that owns the model. Just patches, recalled in different combinations.

What This Changes

Three things follow.

First, your interface to AI is not the prompt. The prompt is just the runtime call. Your interface is the slow accumulation of personal model patches across years of writing — notes, prompt libraries, procedures, agents. Each artifact is a weight. The prompt is what triggers the retrieval. The weights are what make the result different for you than for someone else asking the same question.

Most people think of personal AI as "remember my preferences." That frame is too thin. The right frame is harder: you are slowly fine-tuning the base model into something that approximates you — not by training it, but by accumulating retrievable patches that get applied at inference time. The base model never changes. Your effective model does, every time you add another piece.

Second, the boundary between "what you know" and "what your tools know" has quietly dissolved. It used to be sharp. Your head was the model; your tools were external aids. But once your tools include your notes, your prompt library, your agent fleet — and once those artifacts shape the output of every query you make — the inside-outside distinction stops carving the system at its joints. You and your tools are co-extensive. You are, in part, the patches you have written.

Third, this changes what learning means. Traditionally, learning was an internal event: you encoded a pattern in your head, you remembered it, you applied it later. Now there is a second loop running in parallel. You encode a pattern outside your head. A model retrieves it later. The model applies it. The encoded pattern is durable across model upgrades: your patches outlive the base model they were written for, the same way your notebook outlives the editor you wrote it in. See (/blog/taming-randomness.html): the declarative spine is exactly what makes patches portable across substrate changes.

What Unsettles Me

A book you read at twenty-five was, in retrospect, a piece of compiled judgment by someone else that you absorbed and made your own. The compilation process was slow. The author spent years on it. You spent weeks reading it. The yield was a set of patches that may have stayed with you for decades.

The new system is faster. You can compile a piece of judgment in an afternoon. You can distribute it as an agent. The artifact can be retrieved, applied, and replicated at machine speed. The yield is no longer just for you — it is deployable to anyone who can read your repo.

The question I find myself unable to answer is whether the compression rate keeps accelerating. If it does, the patches you write next year will matter more than the ones you have spent a decade accumulating inside your skull. The locus of your cognition will drift outward — from neurons to notes to agents to whatever comes after agents — and the slow biological work of carrying knowledge inside a single mind will increasingly look like an inefficiency, not a feature.

Or maybe it does not. Maybe there is a floor on how much can be compressed externally, and the patches you carry internally will always be load-bearing in a way that nothing external can replace. The thing that knows when to retrieve which patch is not itself patched. It cannot be, on pain of infinite regress.

What I can say is that the question is no longer abstract. Every time you write something down with the intent that an AI will read it, you are committing parameters to a personal model. The accumulation matters whether or not you notice it. The boundary between mind and substrate has been quietly redrawn while everyone was arguing about whether the agents were intelligent.
