One Brain, Many Doors

Designing a context manager that lets multiple AI agents — different models, different CLIs, different days — work on the same project without losing the thread.

The store is the truth. Open terminals are just doors into it.

That line came late in the design session — but once it landed, it collapsed three weeks of potential over-engineering into four markdown files and a decision that turned out to be obvious. Here’s how it got there, and why the first instinct would have built something complicated, fragile, and broken.

The ambition: write a PRD with Claude, switch to Codex to review it, switch to Gemini to stress-test it, come back to Claude to incorporate the findings. Each model brings something different. The work shouldn’t have to start over each time the window changes.

The first instinct is to ask: how do we make the models talk to each other? That’s the wrong question, and chasing it would have tied the whole design to Claude-specific plumbing.

Step 1: The wrong frame — agents talking to agents

If you start from “agents need to share context,” you end up thinking about message buses, real-time streams, MCP plumbing, orchestrator daemons that hold conversation state and replay it into each new model. All of that is real engineering. None of it would have worked.

It wouldn’t have worked because the agents are in different processes, different runtimes, sometimes different sessions hours apart. They have no shared memory. Forcing them to act like they do means building a synchronization layer none of the tools want to support, then watching it break the first time a model is updated, a CLI changes its interface, or a session crashes mid-handoff.

The premise itself was wrong. The agents don’t actually need to talk to each other. They need to leave each other notes.

Concept #1: When state lives outside the actor, the system stops caring who acts. The hardest problem in multi-agent coordination isn’t communication — it’s persistence. Solve persistence with a shared, durable store, and the communication problem dissolves. Whichever agent shows up next reads the store, picks up where the last one stopped, and writes back what it did. The store becomes the protocol.

Step 2: The store

The architecture that fell out was almost embarrassingly simple. Four markdown files in a directory called .context/ at the project root, plus a tiny lock file:

| File | Purpose |
| --- | --- |
| `CHARTER.md` | The static protocol: how this team works (read once per agent) |
| `STATE.md` | The live board: active objective, task queue, who’s on what |
| `LOG.md` | Append-only handoff journal: what just happened |
| `DECISIONS.md` | Append-only ADR-lite: why it’s like this |
| `lock` | Short-held mutex during STATE/DECISIONS mutations |

That’s the whole brain. No database, no service, no MCP server, no custom protocol. Markdown is agent-native — every model reads and writes it without translation. Git is already there for versioning, attribution, and “what changed.”
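As a rough illustration of how little there is to set up, here is a minimal Python sketch that scaffolds the store. The file names come from the table above; the placeholder text inside each file and the function name `scaffold_store` are invented for illustration and are not the plugin's actual installer.

```python
from pathlib import Path

# File names are from the table above; the placeholder bodies are hypothetical.
STORE_FILES = {
    "CHARTER.md": "# Charter\n\nHow this team works. Read once per agent.\n",
    "STATE.md": "# State\n\nActive objective, task queue, who is on what.\n",
    "LOG.md": "# Log\n\nAppend-only handoff journal.\n",
    "DECISIONS.md": "# Decisions\n\nAppend-only record of why things are the way they are.\n",
}


def scaffold_store(project_root: Path) -> list[Path]:
    """Create .context/ and any missing protocol files; never overwrite existing ones."""
    store = project_root / ".context"
    store.mkdir(parents=True, exist_ok=True)
    created = []
    for name, placeholder in STORE_FILES.items():
        path = store / name
        if not path.exists():  # an existing board is the truth, so leave it alone
            path.write_text(placeholder, encoding="utf-8")
            created.append(path)
    return created
```

Calling `scaffold_store(Path("."))` a second time returns an empty list, which is the property that matters: setup is safe to repeat and never touches an existing board.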

The session arrived at this through a deliberate set of forks rather than guessing. Eight architectural questions over two rounds — where does the store live, what’s the canonical task source, how are tasks routed, how deep is mid-task resume, how is the ritual enforced, do we trust prior “done” marks, what’s the scope, can a fresh agent overturn a prior decision? Each fork closed off a wrong path before it could grow.

The forks that mattered most:

- Markdown, not a database. A database would sit beside the files the agents actually read, and two sources of truth always drift. One canonical board, in the format every model already speaks.
- Append-only journals. LOG and DECISIONS never get rewritten. New entries only ever land at the end, so no agent can clobber an earlier entry, and two agents can append at the same time without stepping on each other.
- Lock semantics measured in seconds, not sessions. The lock is held only while mutating STATE, then released. A “session lock” would block parallel work; a lock held for a second or less has near-zero contention.
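The append-only journals and the short-held lock can be sketched in a few lines of Python. This is one possible reading, not the session's implementation: the helper names (`append_entry`, `store_lock`, `update_state`), the timeout values, and the choice of an exclusive-create lock file are all assumptions. It covers appends on a local filesystem only; merging journals across git branches is a separate question.

```python
import os
import time
from contextlib import contextmanager
from pathlib import Path


def append_entry(journal: Path, entry: str) -> None:
    """Append one entry to LOG.md or DECISIONS.md without rewriting earlier content."""
    # O_APPEND asks the OS to position every write at the current end of file.
    fd = os.open(journal, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        os.write(fd, (entry.rstrip("\n") + "\n\n").encode("utf-8"))
    finally:
        os.close(fd)


@contextmanager
def store_lock(store: Path, timeout: float = 5.0, poll: float = 0.05):
    """Hold the store's lock file only for the duration of the with-block."""
    lock_path = store / "lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one caller wins.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        os.close(fd)
        yield
    finally:
        lock_path.unlink(missing_ok=True)  # release as soon as the mutation is done


def update_state(store: Path, mutate) -> None:
    """Read-modify-write STATE.md under the lock; mutate is a str -> str function."""
    with store_lock(store):
        state = store / "STATE.md"
        state.write_text(mutate(state.read_text(encoding="utf-8")), encoding="utf-8")
```

The asymmetry is the design point: journal appends go straight to the file, while only the read-modify-write on the board takes the lock, and only for as long as the write itself.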

Step 3: What that line did to every later decision

Figure: Four agents reading and writing the same .context/ store. No two agents talk to each other; they all talk to the store.

The store is the truth. Open terminals are just doors into it. That wasn’t metaphor — it was load-bearing. It meant:

- A model has no memory of what it did in another window, and that’s fine: the store remembers.
- A second window opened mid-session doesn’t need to be “synchronized” with the first. It reads the store.
- A fourth window opened tomorrow morning to ask “what’s happening with this project?” reads the same store and gets the same answer.
- An orchestrator agent that drives the work isn’t a different architecture from a human doing the same. Both are just doors. Same protocol.

This last point became important later, when the question came up: if I build an orchestrator agent that auto-spawns the CLIs, is that a different system? No. The protocol’s contract was always with the board, not with the human. An orchestrator agent is Lou, the human operator, as a script. The board doesn’t know the difference.

Step 4: The stress test that fixed the architecture before it was built

After the spine was drafted, the session ran a deliberate stress test: eleven concrete scenarios in which multiple models worked the same project at the same time. Most passed cleanly. Six exposed real holes:

| Scenario | Hole | Fix |
| --- | --- | --- |
| Reviewer reads the WIP working tree of another agent | “Is this final or mid-edit?” ambiguity | Review against commit SHAs, never working trees. Every Ready-for-Review entry records its commit. |
| Doer marks task ready; reviewer has no entry point | Manual queueing every time | Auto-pair review tasks. Check-out spawns a paired `T-NNN-R Review @ <SHA> — #review` task. |
| A session crashes leaving a task stuck in Doing | Phantom ownership | `started:` timestamp + 4h staleness check. Surfaces to “Needs Lou” on the next check-in. |
| Reviewer reads the doer’s LOG before the diff | Anchoring on the doer’s framing | Blind-first-read discipline. Read the task definition, then the diff blind, then (only then) the LOG. |
| Multi-model reviews are stylistically inconsistent | Findings noise | A finding template with severity / location / evidence / suggested fix. Codified in CHARTER. |
| Multiple worktrees fork the brain | Silent divergence | `.context/` lives in the main repo, symlinked into each worktree. One brain, always. |

The point isn’t that these fixes were clever. The point is that running the architecture as a thought experiment against real concurrent scenarios — before any code — surfaced six real failure modes, each of which got a small, localized, additive fix. None required reshaping the protocol.
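Two of those fixes are small enough to sketch directly: the 4h staleness check and the paired review task. The Python below is illustrative only. It assumes the `started:` value is an ISO 8601 timestamp with a UTC offset, which the write-up does not specify, and the function names are hypothetical. The review line format is copied from the table.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(hours=4)  # the 4h threshold from the table


def is_stale(started: str, now: datetime) -> bool:
    """True if a Doing task's started: timestamp is older than the threshold."""
    # Assumption: started is ISO 8601 with an offset, e.g. 2025-01-01T09:00:00+00:00.
    started_at = datetime.fromisoformat(started)
    return now - started_at > STALE_AFTER


def paired_review_task(task_id: str, commit_sha: str) -> str:
    """Build the review task line that check-out adds for a finished task."""
    # The SHA pins the review to a commit, so the reviewer never reads a working tree.
    return f"{task_id}-R Review @ {commit_sha} — #review"
```

For example, `paired_review_task("T-012", "abc1234")` yields `T-012-R Review @ abc1234 — #review`, and a task started five hours before `now` is reported stale so the next check-in can surface it under “Needs Lou”.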

Concept #2: Stress-test the architecture before you stress-test the code. Sit with the design and walk it through five or six worst-case scenarios out loud. Watch where it bends. The fixes you make at that stage are tiny — a column added to a table, a sentence in the charter. The same fixes after the system is built are a refactor.

Step 5: The runtime — making the ritual unskippable

A protocol that depends on agents remembering to follow it is a protocol that fails the moment a model is forgetful, rushed, or distracted. The session split the protocol into two layers:

- The store: .context/ in the project. Markdown, git, model-agnostic.
- The runtime: a session skill that automates check-in, optional mid-session checkpoint, and check-out.

For Claude Code, the runtime is wired with hooks at the harness level — the SessionStart hook injects STATE and LOG into the model’s context before the model sees the user’s first message. The Stop hook nags at exit. The harness runs the ritual, not the model. Even a forgetful agent can’t skip the read.
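What a session-start hook hands to the model can be as plain as the board plus the tail of the journal. The Python sketch below shows one way to assemble that text. The function name, the section headings, and the 40-line default are assumptions, and how the output reaches the model depends on the harness; the sketch only builds the string.

```python
from pathlib import Path


def build_checkin_context(store: Path, log_tail_lines: int = 40) -> str:
    """Return all of STATE.md plus the most recent lines of LOG.md as one text block."""
    state = (store / "STATE.md").read_text(encoding="utf-8")
    log_lines = (store / "LOG.md").read_text(encoding="utf-8").splitlines()
    recent_log = "\n".join(log_lines[-log_tail_lines:])  # newest entries sit at the end
    return (
        "## .context/STATE.md\n"
        f"{state.rstrip()}\n\n"
        f"## .context/LOG.md (last {log_tail_lines} lines)\n"
        f"{recent_log}\n"
    )
```

A hook script would typically print the result of `build_checkin_context(Path(".context"))` and let the harness deliver it, which is what keeps the read out of the model's hands.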

For Codex, Gemini, and other CLIs without a hook system, the protocol still works: they read their always-loaded instruction files at startup (AGENTS.md and GEMINI.md, respectively) and follow the same check-in / check-out steps. This second tier is less guaranteed than a hook, but it still gets the job done in roughly 90% of sessions. For the third tier (open-source LLMs with no instruction file), a shell script (context-run.sh) wraps the CLI invocation with explicit check-in injection and an opt-in check-out prompt at exit.

Concept #3: Enforcement is a tiered design choice, not a single mechanism. The same protocol can be enforced at three different levels: harness-enforced (hooks — near-certain), instruction-loaded (read-on-startup files — good), or wrapper-enforced (shell script around the CLI — variable). Match the tier to the tool. Don’t try to make every tool work the same way.
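The tiering can be written down as a small lookup. This Python sketch is only a way to make the idea concrete: the tool keys are hypothetical labels, and the assignments follow the description above (hooks for Claude Code, instruction files for Codex and Gemini, the wrapper for everything else).

```python
from enum import Enum


class Tier(Enum):
    HARNESS = "harness-enforced"        # hooks run the ritual
    INSTRUCTION = "instruction-loaded"  # always-loaded instruction file
    WRAPPER = "wrapper-enforced"        # shell script around the CLI


# Tool keys are hypothetical labels; the tier assignments follow the text above.
TOOL_TIERS = {
    "claude-code": Tier.HARNESS,
    "codex": Tier.INSTRUCTION,
    "gemini": Tier.INSTRUCTION,
}


def enforcement_tier(tool: str) -> Tier:
    """Return the tier recorded for a tool; unknown tools fall back to the wrapper."""
    return TOOL_TIERS.get(tool, Tier.WRAPPER)
```

Defaulting unknown tools to the wrapper tier reflects the stance in the text: the protocol stays the same, and only the enforcement mechanism changes per tool.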

Step 6: The walk-through that proved it works

To validate the whole thing, the session walked a concrete cross-CLI pipeline frame by frame — the PRD example. Claude drafts. Codex reviews. Gemini stress-tests. Claude addresses findings. At the end, the LOG read like a complete narrative — four entries from three different models, one coherent story.

The friction points, named honestly:

- Adversarial tasks don’t auto-spawn (only reviews do). That became agents.yaml, a routing table that maps focus tags to CLIs and serves both human and script orchestration.
- Codex/Gemini check-out is two steps via the wrapper (exit, then re-invoke with the check-out prompt), which is acceptable given the tier.
- Cross-window chat memory is private: durable insights must be promoted to DECISIONS or they die with the session. The check-out ritual now prompts for this explicitly.
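The routing table is easy to picture as a plain mapping from focus tags to CLIs. The sketch below mirrors that idea in a Python dict rather than parsing agents.yaml, whose actual schema is not shown here. Only `#review` appears in the write-up; the other tags, the CLI labels, and the default are assumptions chosen to match the PRD walk-through (Claude drafts, Codex reviews, Gemini stress-tests).

```python
import re

# Hypothetical routing table; only the #review tag is taken from the write-up.
ROUTES = {
    "#draft": "claude",
    "#review": "codex",
    "#adversarial": "gemini",
}
DEFAULT_CLI = "claude"  # assumption: untagged tasks go to the drafting model

TAG_PATTERN = re.compile(r"#[\w-]+")


def route(task_line: str) -> str:
    """Return the CLI for the first recognized focus tag on a task line."""
    for tag in TAG_PATTERN.findall(task_line):
        if tag in ROUTES:
            return ROUTES[tag]
    return DEFAULT_CLI
```

Because the lookup is a pure function of the task line, a human reading the board and a script driving the CLIs reach the same answer, which is what lets one table serve both.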

The walk-through also revealed the architecture’s bigger insight: human orchestration and agent orchestration are the same architecture. The board is the shared rail. Lou drives manually, or orchestrate.sh drives automatically — same reads, same writes, same lock, same routing rules. They can even hand off mid-pipeline, because the board is the handoff.

Step 7: Distribution

The final move was packaging. The original design split things into two plugins — a project-scoped installer and a user-global session skill — but a follow-up clarification collapsed that. The session skill only makes sense in a project that has .context/ set up. So its scope is project-local too, and the right package is one plugin with two skills:

- context-mgr: the installer skill, install.sh, and all protocol file templates
- session: check-in, optional mid-session checkpoint, and check-out

The install story that fell out of that consolidation:

# one-time: get the plugin (do this once per machine)
/plugin install loudalo/context-mgr

# per-project: scaffold the protocol files
/install-context-mgr                   ← installs into current project
/install-context-mgr /path/to/project  ← installs into a specified path
/install-context-mgr --check           ← dry run first

/install-context-mgr is a user-global slash command (~/.claude/commands/install-context-mgr.md) that calls the plugin’s install.sh and passes any flags through. One command, any project, no plugin scope confusion.

The narrower the install footprint, the cleaner the context. Skills that only activate in projects that opted in are skills that can’t pollute unrelated sessions.

The lesson, generalized

The session’s core move wasn’t technical. It was an inversion of framing.

The default frame for “many agents, one project” is agent-to-agent coordination — message passing, shared memory, real-time sync. That frame is exhausting to implement and exhausting to maintain. It assumes the agents are the load-bearing actors.

The frame that worked is agent-to-store coordination. The agents don’t need to know about each other. They need a shared truth that persists between them. The store is load-bearing; the agents are interchangeable doors.

Once you make that flip:

- Multi-model is the same problem as single-model-across-sessions. Both are persistence problems disguised as identity problems.
- An orchestrator agent is the same as a human orchestrator. Both write to the same board.
- Different CLIs are the same as different days. The store doesn’t care who’s reading.

That generalization is portable beyond context managers. It applies to any system where multiple processes need to coordinate without coupling: build the durable truth first, then watch the coordination problem dissolve into reads and writes against it.

The store is the truth. Open terminals are just doors into it.