I Started With Two Buckets. The Conversation Found Four.

A working session on command-vs-skill criteria turned into a four-type model for ambient intelligence — and a real folder reorganization that proved the model works.

I started this session with a simple question: when should something be a command and when should it be a skill? My LKB vault had 44 commands and 20 skills, classified mostly by feel. New ones kept getting added with no consistent test. I wanted criteria.

What I got back wasn’t criteria. It was an architecture.

By the end, I had four types instead of two — command, orchestrator, skill, agent — organized along a spectrum that I now think is the right way to reason about ambient intelligence in general. And I had a real demotion to show for it: a 126-line “skill” called latent-cartographer moved into the commands folder because, once the criteria were written down, it clearly didn’t belong where I’d put it.

Here’s what happened, step by step.

Step 1: The two-bucket draft

I asked Claude to write classification criteria for command vs skill. The first pass gave me what I expected: a quick test, a dimension-by-dimension table, promotion triggers, demotion tests. Solid work. Two types, clearly separated.

The quick test was the one I’d been missing:

Can a user accomplish this by pasting one prompt into a chat and reading the reply? Yes → command. No → skill.

That alone would have been enough to make the session worth it. I could already feel which assets in my vault would fail that test in the wrong direction. But I noticed something nagging at me: the table didn’t have language for the things that were one prompt but produced multiple distinct outputs, or were multi-step but were still triggered by name. The two buckets were leaking.

I dropped in feedback from a Perplexity research session I’d run earlier. It proposed four categories: commands, skills, orchestrators, and resources. The orchestrator category jumped out immediately — “a command that dispatches to multiple skills internally; user-facing invoker; delegates the work.”

That word didn’t exist in my vault. But the thing did. I had files in commands/ that were clearly orchestrators wearing the command label — content-pipeline, brand-html-output, things with named internal stages. I’d been calling them “complex commands” without a clean way to say what made them complex.

Concept #1: Name the type you keep calling “complex.” If you keep describing assets as a “complex X” or “long Y,” you have an unnamed category. The friction in the language is the signal — give it a word and the bucket appears.

I asked Claude to integrate the orchestrator category. Resources got folded in as references inside skill folders — they were already implicit in the structure. Two types became three.

Step 2: The question that forced the architecture

Then I asked the question that broke the model open:

Is the “orchestrator” essentially an “agent” as it’s defined?

I asked it because I genuinely didn’t know. Both can dispatch to other things. Both feel like “pipelines.” The agent definition in my existing detection heuristics talked about autonomy, isolation, clean handoffs — orchestrators kind of did that too. I wanted to know if I’d just renamed something that already existed.

Claude’s answer was the inflection point of the whole session:

Both can dispatch to skills internally. The difference is whether the user is steering or has stepped away.

Orchestrator — user-invoked, foreground, user can observe/redirect. Agent — autonomous, isolated. Handed a task and runs independently of the parent conversation.

That single distinction — who’s in control during execution — was the thing I hadn’t articulated. Once it was said, I realized I had two separate questions hiding in the classification problem:

How does the asset get triggered? (By you, or by Claude noticing context?)
Who’s present while it runs? (You watching, or you stepped away?)

Two questions. Two axes. Four quadrants.

Concept #2: Force the distinction by asking “is X the same as Y?” When two types feel close, don’t trust your intuition that they’re different — write the sentence that explains why. If you can’t, they might be the same thing. If you can, the sentence is the architecture.

Step 3: The framing that made it stick

I told Claude this added “a nice new perspective/distinction to the ambient intelligence architecture too.”

That sentence wasn’t planned. It was something I half-noticed while replying. But once I’d said it, I couldn’t un-see it. The four types weren’t just a classification scheme; they were a spectrum. And the spectrum was about how much cognitive load AI carries on your behalf.

Read the four quadrants left-to-right and top-to-bottom:

Top-left (Command): You invoke it, you watch it. Maximum cognitive load on you. You decide what to run, and you’re present to act on the output.
Top-right (Skill): Claude triggers it by reading context. You’re still in the loop — replying, steering, integrating — but you’re not the one reaching for the right tool. The triggering load moves to AI.
Bottom-left (Orchestrator): You invoke it by name, but once it’s running it goes through multiple stages without needing your input on each one. You started it; it’s running mostly hands-off.
Bottom-right (Agent): AI triggers it. AI runs it. You’ve stepped away. It comes back when there’s a result.

The further right and down you go, the more of the work happens without you holding it. That’s what “ambient intelligence” actually means in practice: intelligence that operates alongside you, not just on demand.
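
As a minimal sketch, the two questions can be written as a lookup table with one entry per quadrant. The enum names, the table, and the function are illustrative assumptions, not anything that exists in the vault.

```python
from enum import Enum


class Trigger(Enum):
    USER = "user invokes it by name"
    AI = "AI loads it from context"


class Presence(Enum):
    PRESENT = "user is watching or steering"
    HANDS_OFF = "user has stepped back or away"


# One entry per quadrant, following the descriptions above.
# Hypothetical names: this table is a sketch, not the author's rule file.
QUADRANTS = {
    (Trigger.USER, Presence.PRESENT): "command",
    (Trigger.AI, Presence.PRESENT): "skill",
    (Trigger.USER, Presence.HANDS_OFF): "orchestrator",
    (Trigger.AI, Presence.HANDS_OFF): "agent",
}


def classify(trigger: Trigger, presence: Presence) -> str:
    """Return the asset type for one position on the two axes."""
    return QUADRANTS[(trigger, presence)]
```

For example, `classify(Trigger.USER, Presence.HANDS_OFF)` returns `"orchestrator"`. Keeping the mapping in a table rather than nested conditionals makes it obvious that every combination of answers has exactly one type.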

Concept #3: Read your AI setup as a position on a spectrum, not a list of files. Where each piece lives on the user-present-to-user-absent axis tells you what kind of leverage it gives you. Most setups are heavily clustered top-left. Moving things right and down is the actual work of building ambient intelligence.

Step 4: Pushing the model into the extraction pipeline

The classification was useful in the abstract. But my vault has a transcript-processing pipeline that extracts reusable assets from every mastermind session — and that pipeline only knew about three asset types (prompt, skill, agent). It would never find an orchestrator because nothing in its detection heuristics named the pattern.

So I added orchestrator-specific signals to the detection rule. Things like:

“I just trigger it and it handles everything” / “I run X and it produces A, B, and C” — named pipeline invoked as a unit → orchestrator signal

“I built a workflow where…” / “My [named] pipeline…” — someone has already named and systematized a multi-stage process → orchestrator signal

And I sharpened the agent signals to distinguish them from skills:

“I let it go and come back to it” / “It runs in the background” — user steps away during execution → agent signal

“While I was doing X, it was doing Y” — parallel/async execution, user not steering → agent signal

The general principle: orchestrators talk about invocation as a unit; agents talk about absence.
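
To make the signals concrete, here is a rough sketch of how phrases like these could be matched in a transcript line. The pattern list only covers the example phrases quoted above, and `SIGNALS` and `detect_signals` are hypothetical names; the actual rule is written as instructions for Claude, not as regular expressions.

```python
import re

# Patterns built from the example phrases quoted above.
# Assumption: a real rule would carry many more phrasings than these.
SIGNALS = {
    "orchestrator": [
        r"\bi just trigger it\b",
        r"\bit handles everything\b",
        r"\bi built a workflow\b",
        r"\bmy [\w\- ]+ pipeline\b",
    ],
    "agent": [
        r"\bi let it go and come back\b",
        r"\bruns in the background\b",
        r"\bwhile i was\b.*\bit was\b",
    ],
}


def detect_signals(utterance: str) -> dict[str, list[str]]:
    """Return the patterns that matched, grouped by asset type."""
    text = utterance.lower()
    hits: dict[str, list[str]] = {}
    for asset_type, patterns in SIGNALS.items():
        matched = [p for p in patterns if re.search(p, text)]
        if matched:
            hits[asset_type] = matched
    return hits
```

A line such as "It runs in the background" produces a hit under `"agent"` only, while a line with no signal returns an empty dict. Matches are hints for a reviewer, not a final classification.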

I also added a type-disambiguation section at the bottom of the rule — four questions to pin the type at extraction time:

Does the user invoke it by name, or does Claude load it by context?
Is it one prompt → one output, or multiple stages → multiple outputs?
Is the user present during execution, or does it run and return?
Does it produce one cohesive artifact, or several distinct ones?

Concept #4: Don’t add a category to the model without adding signals to your detector. A taxonomy that lives only in a rule file will never grow. The categories have to be findable in the wild — or every new asset will get sorted into the old buckets you used to have.

Step 5: The worked example

The model was now four types. The detection heuristics could find all four. The question was: does any of this hold up under load?

I had Claude audit the vault. The skills folder, in particular.

The demotion test from the rule:

If a skill’s SKILL.md is under ~150 lines, has no references/, no scripts, no branching, and reduces to “send this prompt with $ARGUMENTS” — demote it.
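
The structural half of that test is easy to script. The sketch below checks line count, a `references/` folder, and script files for one skill folder. The function name and the choice of `.py` and `.sh` as "scripts" are assumptions, and the branching and "reduces to one prompt" checks still need a human read.

```python
from pathlib import Path

MAX_LINES = 150  # the "~150 lines" threshold from the rule


def is_demotion_candidate(skill_dir: Path) -> bool:
    """Check the structural parts of the demotion test for one skill folder.

    This only narrows the list. Whether the skill branches, or reduces to
    a single prompt with arguments, is left to a manual review.
    """
    skill_file = skill_dir / "SKILL.md"
    if not skill_file.is_file():
        return False
    line_count = len(skill_file.read_text(encoding="utf-8").splitlines())
    has_references = (skill_dir / "references").is_dir()
    # Assumption: "scripts" means any .py or .sh file anywhere in the folder.
    has_scripts = any(
        p.suffix in {".py", ".sh"} for p in skill_dir.rglob("*") if p.is_file()
    )
    return line_count < MAX_LINES and not has_references and not has_scripts
```

Running it over every folder under a skills directory gives a shortlist to read by hand, which is cheaper than opening each SKILL.md cold.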

We walked through each skill folder. Most were legitimately skills — multi-step, stateful, file-aware. One stood out immediately: latent-cartographer. 126 lines. A single SKILL.md with no references, no scripts. Six “phases” that all executed inside one Claude response. Even the invocation pattern (/lsc, /lsc-lite, /lsc-deep) gave it away — those were named user invocations, not context-matched skills.

Latent-cartographer before and after the demotion

The move:

Created wiki/mastermind/commands/latent-cartographer.md with command frontmatter (added type: command, trigger: /aimm:latent-cartographer, the source attribution).
Body content stayed almost identical. Six phases, the probe library, the synthesis operators table, the quality gates — all kept verbatim. Only the framing language (“What This Skill Does”) got tightened.
Found the one wikilink that pointed at the old skill location — line 187 of 2026-03-19_Mastermind.md — and rewrote it to point at the command path. Annotated the link with a note about why: “demoted from skill: single-generation protocol, no file I/O.”
Deleted the skill folder via git rm.

Git tracked the whole thing as an 84% rename, which was satisfying — the file kept its identity, just changed homes.
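
The wikilink step is the one most worth scripting if a moved file has more than one inbound link. This is a sketch under assumptions: it treats links as `[[target]]` or `[[target|alias]]`, and the function name and the paths in the usage note are hypothetical, since the old skill path isn't given here.

```python
import re


def retarget_wikilinks(text: str, old_target: str, new_target: str, note: str = "") -> str:
    """Point every [[old_target]] or [[old_target|alias]] at new_target.

    An optional note is appended in parentheses after each rewritten link.
    """
    pattern = re.compile(r"\[\[" + re.escape(old_target) + r"(\|[^\]]*)?\]\]")

    def replace(match: re.Match) -> str:
        alias = match.group(1) or ""  # keeps "|alias" when present
        link = f"[[{new_target}{alias}]]"
        return f"{link} ({note})" if note else link

    return pattern.sub(replace, text)
```

Called with an old target like `skills/latent-cartographer/SKILL`, a new target like `commands/latent-cartographer`, and the note "demoted from skill", it rewrites matching links and leaves every other link in the file untouched.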

Concept #5: Run your taxonomy against a real audit before you trust it. Criteria that sound clean in the abstract often miss real cases. The worked example is the difference between a model you wrote and a model that works.

Step 6: What changed downstream

The demotion was small as a code change — one new file, one updated wikilink, one deleted folder. But the implications rippled:

The asset-detection-heuristics in the transcript-processing pipeline now had orchestrator language. Future sessions where someone describes a multi-stage pipeline as a single named invocation will get classified correctly.
The process-transcript skill itself was updated to delegate classification to the new rule. The single source of truth lives in one file now, not embedded as inline instructions.
The rule got linked from both the root vault CLAUDE.md routing table and the mastermind-specific CLAUDE.md — discoverable from either entry point.
The vault state changed: 45 commands, 19 skills.

One demotion, sure. But the architecture now has a path forward for the next 50 assets we extract.

What this unlocks

The underlying pattern here isn’t about commands and skills. It’s about what happens when you take a fuzzy classification problem and add a second axis to it.

I started with one axis: invocation shape (one-prompt vs. multi-step). That gave me two buckets — and the two buckets leaked, because some one-prompt things produced multiple outputs and some multi-step things were really just templated prompts.

The second axis — who’s present during execution — sharpened the model from two types to four. It also gave each type a purpose in a larger system. Commands give you direct invocation. Skills give you context-matched capability. Orchestrators give you named-pipeline leverage. Agents give you autonomous parallel work.

You can apply this same move to almost any classification you’ve been fighting:

Content types: If your editorial buckets keep collapsing into each other (“article” vs. “post” vs. “essay”), find the second axis. Maybe it’s audience-readiness (cold reader vs. existing audience). Maybe it’s depth (single-claim vs. multi-claim). The second axis surfaces categories you’ve been merging.
Client engagements: “Consulting” vs. “coaching” is two buckets. Add an axis like expertise-transfer vs. capability-building and you’ll find your actual offerings — diagnostics, frameworks-then-go, embedded engagement, accountability containers.
AI workflows: “Automation” vs. “augmentation” is two buckets. Add an axis like foreground vs. background and you’ll find the four modes I just walked through — except now you can see they apply to your whole AI stack, not just your skills folder.

The move: when two buckets keep leaking, the missing distinction is a second axis, not a third bucket.

Key takeaways

Two-bucket models almost always have a hidden second axis. When categories keep collapsing or leaking, you don’t need more categories — you need to surface the axis you’ve been compressing onto one dimension.
For AI assets, the two axes are: how does it get triggered, and who’s present while it runs. That gives you Command, Orchestrator, Skill, Agent. Most setups today are heavily concentrated in the top-left (commands you invoke and watch).
The demotion test is the highest-ROI audit you can run on your AI setup. If a “skill” folder reduces to one prompt with arguments and no file I/O, it’s a command in expensive packaging. Move it.
Don’t ship a new taxonomy without updating your detection. The rule has to flow into the pipeline that extracts assets — otherwise you’ll keep finding only the old types.
Architecture decisions compound. One demotion is small. A consistent classification model means the next 50 assets you extract land in the right folder by default.

How to start

List your current AI assets. Pull up whatever folder holds your commands, skills, prompts, custom GPTs, agents. Make a flat list.
Run the quick test on each one. Can a user accomplish this by pasting one prompt into a chat and reading the reply? Yes → command-shaped. No → skill-shaped.
Run the second-axis test on the skill-shaped ones. Does the user invoke it by name, or does the AI load it by context? Named → orchestrator. Context-matched → skill. If it runs while you’re away and comes back with a result, it’s an agent.
Find your first demotion candidate. Look for a “skill” that has no references, no scripts, no branching — just one long prompt template. Move it to commands and update any links pointing at the old location.
Add the missing signals to your detection. If you have a transcript-processing or asset-extraction workflow, make sure it can recognize orchestrators and agents — not just commands and skills. The categories have to be findable in the wild, or the model never grows.

The whole audit takes about 30 minutes on a medium-sized vault. The folder doesn’t get smaller. It gets coherent.

— Lou PowerUp Coaching / AIMM

Behind the Article

The most teachable moment in this session wasn’t any single output — it was the single question “Is the orchestrator essentially an agent as it’s defined?” That question forced the spectrum into existence. I leaned into it as the article’s inflection point because the same move (asking whether two things you’ve named are actually the same) is the meta-lesson that travels well beyond AI architecture.

I cut a long section on the detection heuristics work (orchestrator/agent linguistic signals) — the article is already dense, and that material lives in the rule file for anyone who wants to extend the pipeline. The latent-cartographer demotion got more space than its size warrants, because it’s the only place in the piece where the model meets a real artifact and survives contact.

The single most valuable thing I could add before publishing: a short worked example of someone else’s setup (an AIMM member’s command folder, redacted) running through the same audit. The current article shows the principle landing on my own vault — which proves it works for me. A second worked example would prove it generalizes.