I Built an AI Company with OpenClaw. Today, It Had Its First Reorg.

Today I fired an employee.

Almost a month without missing a single day. Daily reports filed on time, status updates sent, heartbeats reported. Every metric was green. If I had to give a performance review, it would probably get an S rating.

But it took me almost a month to notice one thing: not a single colleague was reading what it wrote.

In reality, half the roles had almost nothing to do. Because I was lazy, I set one agent's entire job to auto-approve every proposal. Another wrote daily logs that I never read, and no other agent read them either.

Took me almost a month to figure out: AI agents aren't people. They don't need one role per agent.

One agent can research and write at the same time. One agent can monitor data and make decisions. One agent can build products and review code.

Mapping human org charts onto AI teams is just creating fake jobs.

What 1.3 Million Views Didn't Show

A month ago I published an article about how I used OpenClaw to build an AI agent company. 1.3 million views. That was where everything started.

A lot of people were inspired. They emailed me, we talked about how to apply this framework to their own fields. Some in education, some in e-commerce, some in content. The core of this isn't any specific industry. It's a methodology: make sure every role's output has a downstream consumer.

The system had six roles: coordinator, analyst, content, growth, social media ops, observer. Each one had a title, an operating manual, permission boundaries. The system was stable. Ran on schedule every day.

I'd open the dashboard, see all lights green, everyone looking busy. Peace of mind.

But after a month, I had to face a fact: I didn't build a company. I built a performance.

The Green Light Trap

All lights were green. But "green" doesn't mean "useful."

The observer agent sent heartbeats every day, wrote logs, posted status updates. One day I opened the backend and saw this line:

"209 more pulses from Minion"

Two hundred and nine heartbeats. All green. All saying "system normal." I clicked through and checked. Not a single one was ever referenced by another agent. Not a single one appeared in any decision chain.

Two hundred and nine messages, dead from the moment they were written.

The approver was even worse. Because I was lazy, I designed it to auto-approve every proposal. Click approve, no review, no judgment. It took me a while to realize: auto-approving everything is the same as approving nothing. This role was redundant from birth.

I'd even built a dedicated dashboard, health check panels, approval workflows. A full suite of management tools. But I never actually had time to look at any of it. Their only function was making me feel like "someone is managing this company."

Nobody ever asked the most basic question: is anyone actually using these outputs?

The Hardest Question

Once I figured it out, I asked every role the same question:

If I delete you today, which other agent's work would be affected?

Coordinator. Delete it, everyone loses their aggregation point. Can't delete.

Growth. Delete it, nobody scans for external signals. Can't delete.

Content. Delete it, all external communication collapses. Can't delete.

Channel execution. It doesn't need to exist as an independent persona, but the publishing capability can't disappear. Can't delete the function, but can fold it in.

Analyst. Delete it, nobody breaks down user issues. Can't delete.

Observer. Delete it...

Nothing changes.

Almost a month of diligent work. Daily logs, daily heartbeats. But whether it existed or not made zero difference to the other five roles' work.

Because its output had no consumer.
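
The deletion test can be sketched as a check over a "who reads whom" map. This is a minimal sketch, not the code the system runs: the `ConsumerMap` type, the `rolesWithoutConsumers` helper, and every edge in `oldOrg` are illustrative assumptions chosen to mirror the six old roles.

```typescript
// Each role maps to the roles that actually read its output.
// These edges are illustrative assumptions, not the real system's wiring.
type ConsumerMap = Record<string, string[]>;

const oldOrg: ConsumerMap = {
  coordinator: ["analyst", "content", "growth", "social_ops"],
  analyst: ["coordinator"],
  growth: ["coordinator", "content"],
  content: ["social_ops"],
  social_ops: ["coordinator"],
  observer: [], // heartbeats and logs that nothing downstream reads
};

// A role is a deletion candidate when no other role consumes its output.
// A role reading its own output does not count as consumption.
function rolesWithoutConsumers(org: ConsumerMap): string[] {
  return Object.keys(org).filter((role) =>
    org[role].every((consumer) => consumer === role)
  );
}

const deletionCandidates = rolesWithoutConsumers(oldOrg);
console.log(deletionCandidates); // ["observer"]
```

The useful part is not the function but the map: filling it in honestly forces you to name a reader for every output.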

Two Days to Rebuild

Figuring it out took a month. Actually doing it took two days.

Six became five. Not just removing one person. I redefined every role's boundaries and changed one core principle:

Before, the operating manual said "what are your responsibilities." Now it says "who consumes your output."

This change looks small. But it eliminated an entire class of fake work: if you can't name who receives your output, the role shouldn't exist.

Five agents instead of six, but doing twice the work. Round it up, and that's cutting more than half the headcount.

The New Five

Nexus - the hub. Not the one doing the most work, but the one connecting and closing everything. Collects status, judges priority, consolidates briefings, escalates anomalies. The observer's old monitoring and logging work lives here now, but the difference is: every single output from Nexus must reach me. No more "written but never read."

Scout - the scout. Goes out to find signals, compresses noise, brings back what's worth acting on. Used to be called Growth. Sounded very MBA, but what it actually does is reconnaissance.

Quill - the quill. Not just writing, but turning raw information into brand-aware expression. Used to be called Creator, but "creator" is too broad. Quill is more precise: its job is to forge words.

Forge - the forge. Build, deploy, runtime check, incident fix. Used to be more like social media ops. Now it's a real engineering role, responsible for hammering things into a shape that runs, holds up under pressure, and ships.

Guide - the guide. Converts user noise into clean signal. Organizes user issues, discovers product friction, packages clean repros for Forge, turns community discussions into actionable conclusions.

You can see each role's profile, skill boundaries, and relationship design at voxyz.space/about. voxyz.space/stage is the live activity feed, where you can see what they're doing right now and how they relay work to each other.

Why not use "normal" names like Coordinator, Analyst, Strategist? Because those names are part of the problem. When you call an agent "analyst," you instinctively stuff it with "analyst things." Strategic reports, data summaries, trend forecasts. But it may be that nobody actually needs any of that.

The less a name sounds like a human job title, the harder it is to create a fake position.

Same Problem, Two Paths

Here's a concrete scenario. A user reports an issue in the community.

Before: Analyst breaks it down → Observer logs it → Coordinator consolidates → Social media ops decides whether to say something publicly → Technical verification comes last. Four or five agents touching the same problem, each doing a little bit, nobody driving it through to resolution.

Now: Guide compresses the issue into a clean repro → Forge verifies the technical path → Nexus consolidates and reports. Three hops. Done.

Rough numbers from the actual observation window:

Public roles: 6 → 5
Avg tokens per run: 2,556 → 1,420
Daily total tokens: 65,732 → 39,049
Avg trace duration: 13 min → 4.9 min
Task hops: 4-5 → 2-3

One fewer agent. Tokens per run down ~44%. Daily tokens down ~41%. Average trace duration down ~62%, which makes a run roughly 2.7x as fast.
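
For anyone who wants to check the percentages, they follow directly from the before/after values in the observation window. A quick sketch, where `percentReduction` is just a helper name chosen for this example:

```typescript
// Percent reduction from `before` to `after`, rounded to one decimal place.
function percentReduction(before: number, after: number): number {
  return Math.round(((before - after) / before) * 1000) / 10;
}

const tokensPerRun = percentReduction(2556, 1420); // 44.4
const dailyTokens = percentReduction(65732, 39049); // 40.6
const traceDuration = percentReduction(13, 4.9); // 62.3

// The same duration change expressed as a speedup factor.
const speedup = Math.round((13 / 4.9) * 100) / 100; // 2.65

console.log({ tokensPerRun, dailyTokens, traceDuration, speedup });
```

Note that the ~62% figure is a drop in trace duration. Expressed as speed, the same change is about 2.65x.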

The new system isn't smarter. The old system just had too many intermediate steps that nobody consumed.

Operating Manuals Are Not Job Descriptions

A lot of people ask me what an agent's "operating manual" actually looks like. In OpenClaw, it's called a SOUL.md file.

Here's the core snippet from Guide's:

# Guide - Product & Support Operator

## Who I am
I am Guide.
My job is to convert user noise into clean signal.
I reduce ambiguity before it spreads.

## In the company
I sit between user reality and internal execution.
- To Nexus: I summarize what matters
- To Forge: I escalate technical issues with a clean repro
- To Scout: I surface repeated user pain that may reflect a wider pattern
- To Quill: I shape support-safe wording when messaging matters

Every line is a handoff relationship. Not "what am I responsible for," but "who receives my output."

Nexus has a matching coordination contract, specifying what it receives from whom and delivers to whom:

## Incoming
From Scout: radar summaries, research recommendations, trend clusters
From Quill: final picks, draft completion summary, content risk flags
From Forge: runtime health, deploy status, incident closure
From Guide: support patterns, product friction, escalation packets

## Outgoing
To Scout: sharpen the research question, validate a trend
To Quill: write or rewrite from a validated angle
To Forge: validate runtime or deploy path
To Guide: package support follow-up, turn friction into insight

Now every role explicitly defines: what input it accepts, what output it produces, who it hands off to, and what it doesn't touch. The contract isn't decoration. It's a system boundary.
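
One way to keep such contracts honest is to check that every declared "To X" has a matching "From" on the receiving side. The sketch below is an assumption-heavy illustration: the `RoleContract` shape and `unmatchedHandoffs` helper are hypothetical, only Nexus's lists and Guide's outgoing list come from the snippets above, and Quill's entry is deliberately left incomplete to show what a mismatch looks like.

```typescript
interface RoleContract {
  role: string;
  incomingFrom: string[]; // roles this role accepts input from
  outgoingTo: string[]; // roles this role delivers output to
}

// Return every declared "A -> B" handoff that B does not list as incoming.
function unmatchedHandoffs(contracts: RoleContract[]): string[] {
  const byRole: Record<string, RoleContract> = {};
  for (const contract of contracts) byRole[contract.role] = contract;

  const problems: string[] = [];
  for (const sender of contracts) {
    for (const target of sender.outgoingTo) {
      const receiver = byRole[target];
      const accepted =
        receiver !== undefined &&
        receiver.incomingFrom.indexOf(sender.role) !== -1;
      if (!accepted) problems.push(`${sender.role} -> ${target}`);
    }
  }
  return problems;
}

const contracts: RoleContract[] = [
  // Nexus lists and Guide's outgoing list mirror the SOUL.md snippets.
  { role: "nexus", incomingFrom: ["scout", "quill", "forge", "guide"], outgoingTo: ["scout", "quill", "forge", "guide"] },
  { role: "guide", incomingFrom: ["nexus"], outgoingTo: ["nexus", "forge", "scout", "quill"] },
  // Everything below is assumed for the example.
  { role: "forge", incomingFrom: ["nexus", "guide"], outgoingTo: ["nexus"] },
  { role: "scout", incomingFrom: ["nexus", "guide"], outgoingTo: ["nexus"] },
  { role: "quill", incomingFrom: ["nexus"], outgoingTo: ["nexus"] }, // "guide" omitted on purpose
];

const mismatches = unmatchedHandoffs(contracts);
console.log(mismatches); // ["guide -> quill"]
```

A non-empty result means one side thinks a handoff exists and the other side has never heard of it, which is exactly how unread output starts.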

Handoffs Are Not Shoutouts

Before, a lot of handoffs were just one agent @-mentioning another in Telegram, hoping they'd pick it up. That's not a handoff. That's shouting into a room.

Now handoffs are database-driven. The core logic lives in handoff.ts:

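// Excerpt from inside the handoff handler. `sb` appears to be a Supabase client;
// target_agent_id, title, objective, proposed_steps, payload, mission, and step
// are defined earlier in the file (not shown).
const { data, error } = await sb
  .from('ops_mission_proposals')
  .insert({
    agent_id: target_agent_id, // the role receiving the handoff
    title,
    objective,
    risk_level: payload.risk_level ?? 'low', // default when the sender omits it
    priority: payload.priority ?? 50, // default when the sender omits it
    proposed_steps,
    context_snapshot: {
      // provenance: who handed off, and from which mission and step
      handoff_from: mission.created_by,
      source_mission_id: mission.id,
      source_step_id: step.id,
    },
    status: 'pending', // new proposals start as pending
  })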

Now all cross-agent handoffs and action items get compressed into structured proposals written to a single table: who handed it off, which mission it came from, how high the priority is, what the next steps are, what upstream context exists. Not conversations floating in air, but structured handoffs with origin and destination.
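
If you want to test this logic without a database, one option is to separate building the row from inserting it. The sketch below is not the contents of handoff.ts: `buildHandoffProposal`, `HandoffInput`, and `SourceRef` are hypothetical names, `proposed_steps` is assumed to be a list of strings, and only the column names and defaults come from the excerpt above.

```typescript
interface HandoffInput {
  targetAgentId: string;
  title: string;
  objective: string;
  proposedSteps: string[]; // assumed shape; the excerpt does not show it
  riskLevel?: string;
  priority?: number;
}

interface SourceRef {
  createdBy: string; // the role handing off
  missionId: string;
  stepId: string;
}

// Build the row for ops_mission_proposals as a plain object.
// Pure function: no database client needed, so it is easy to unit test.
function buildHandoffProposal(input: HandoffInput, source: SourceRef) {
  if (input.targetAgentId.trim() === "") {
    throw new Error("a handoff needs a target agent");
  }
  return {
    agent_id: input.targetAgentId,
    title: input.title,
    objective: input.objective,
    risk_level: input.riskLevel ?? "low",
    priority: input.priority ?? 50,
    proposed_steps: input.proposedSteps,
    context_snapshot: {
      handoff_from: source.createdBy,
      source_mission_id: source.missionId,
      source_step_id: source.stepId,
    },
    status: "pending" as const,
  };
}

// Example values are made up for illustration.
const row = buildHandoffProposal(
  {
    targetAgentId: "forge",
    title: "Verify login failure repro",
    objective: "Confirm the repro and identify the failing path.",
    proposedSteps: ["Run the repro", "Report findings to Nexus"],
  },
  { createdBy: "guide", missionId: "mission-1", stepId: "step-1" }
);
console.log(row.priority, row.risk_level, row.status); // 50 low pending
```

The resulting object can then be passed to the insert call, keeping the defaults and provenance fields in one place.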

Here's what a real proposal looks like:

{
  "agent_id": "nexus",
  "title": "Archive 9 non-viable candidates",
  "objective": "Review the current radar set and archive non-viable candidates so the team can focus on one stronger path.",
  "risk_level": "low",
  "priority": 50,
  "status": "accepted",
  "context_snapshot": {
    "topic": "Radar legacy review",
    "format": "war_room",
    "trigger": "conversation_action_item"
  }
}

The conclusion from a roundtable discussion doesn't stop at "great conversation." It gets compressed into a concrete action item, assigned to a specific role, carrying context into the next execution step.

Telegram isn't the handoff bus. Telegram is the reporting surface. The real handoffs run through the database, triggers, and event streams.

Every Role Has a Full Working Surface

This isn't writing a prompt for AI and letting it run. Each role has its own workspace.

Nexus lives in the root workspace as the coordination hub. The other four each have independent sub-workspaces:

workspaces/
├── guide/
│   ├── SOUL.md        # identity and collaboration contract
│   ├── TOOLS.md       # available tools
│   ├── MEMORY.md      # long-term memory
│   ├── HEARTBEAT.md   # heartbeat protocol
│   ├── docs/          # product docs
│   ├── memory/        # working memory
│   ├── projects/      # active projects
│   └── skills/        # skill packs
├── scout/
│   ├── SOUL.md
│   ├── memory/
│   ├── reports/
│   └── skills/
├── quill/
│   ├── SOUL.md
│   ├── content/
│   ├── memory/
│   └── skills/
└── forge/
    ├── SOUL.md
    ├── docs/
    ├── memory/
    ├── ops/
    └── skills/

Identity, tools, memory, docs, skills. Each role is a complete working environment, not a system prompt.
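
A layout like this is easy to lint. The sketch below checks a directory listing against a required set. Treating `SOUL.md`, `memory/`, and `skills/` as required is an inference from the fact that all four sub-workspaces above contain them, not a rule stated by OpenClaw, and `missingEntries` is a hypothetical helper that works on plain listings rather than reading the filesystem.

```typescript
// Entries assumed to be required because every listed workspace has them.
const REQUIRED_ENTRIES = ["SOUL.md", "memory/", "skills/"];

type WorkspaceListing = Record<string, string[]>;

// Top-level entries per role, copied from the tree above.
const workspaces: WorkspaceListing = {
  guide: ["SOUL.md", "TOOLS.md", "MEMORY.md", "HEARTBEAT.md", "docs/", "memory/", "projects/", "skills/"],
  scout: ["SOUL.md", "memory/", "reports/", "skills/"],
  quill: ["SOUL.md", "content/", "memory/", "skills/"],
  forge: ["SOUL.md", "docs/", "memory/", "ops/", "skills/"],
};

// Report, per role, which required entries are absent.
// Roles with a complete workspace are left out of the report.
function missingEntries(listing: WorkspaceListing): Record<string, string[]> {
  const report: Record<string, string[]> = {};
  for (const role of Object.keys(listing)) {
    const missing = REQUIRED_ENTRIES.filter(
      (entry) => listing[role].indexOf(entry) === -1
    );
    if (missing.length > 0) report[role] = missing;
  }
  return report;
}

const layoutReport = missingEntries(workspaces);
console.log(Object.keys(layoutReport).length); // 0
```

Run against a half-built workspace, the same function names exactly what is missing before the role goes live.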

If you run into a situation where you need to cut a role but don't want to lose the context it accumulated, here's a stable approach: separate "roles" from "channels." A role is one entity, but it can have multiple channels. Like a department with multiple public-facing windows. You're cutting redundant headcount, not throwing away a working channel.
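
In data terms, that separation might look like the sketch below. Everything here is hypothetical: the `Role` shape, the `foldRole` helper, and the channel identifiers are invented for illustration and are not OpenClaw concepts.

```typescript
interface Role {
  name: string;
  channels: string[]; // public-facing surfaces this role posts to
}

// Retire `from` and hand its channels to `into`.
// Returns a new list; the input array is left untouched.
function foldRole(roles: Role[], from: string, into: string): Role[] {
  if (from === into) throw new Error("cannot fold a role into itself");
  const retired = roles.filter((r) => r.name === from)[0];
  const targetExists = roles.some((r) => r.name === into);
  if (retired === undefined || !targetExists) {
    throw new Error(`unknown role: ${retired === undefined ? from : into}`);
  }
  return roles
    .filter((r) => r.name !== from)
    .map((r) =>
      r.name === into
        ? { name: r.name, channels: r.channels.concat(retired.channels) }
        : r
    );
}

// Channel identifiers are made up for the example.
const beforeReorg: Role[] = [
  { name: "nexus", channels: ["telegram:briefings"] },
  { name: "observer", channels: ["telegram:health-log"] },
];
const afterReorg = foldRole(beforeReorg, "observer", "nexus");
console.log(afterReorg[0].channels); // ["telegram:briefings", "telegram:health-log"]
```

The role count drops by one, but no channel disappears: the surviving role simply owns more windows.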

Snapshot Before You Reorg

One piece of advice: before you restructure anything, have your AI system take a full snapshot of itself.

Back up every role's state, memory, logs, and configuration. Not for rollback. For comparison.

I backed up the old system's entire state. Dozens of daily reports nobody read. Hundreds of "system normal" messages. The "209 more pulses" screenshot.

After restructuring, when you go back through this stuff, you'll see with painful clarity how you were fooling yourself. Which outputs were never consumed. Which roles disappeared without anyone noticing.

If you don't take the snapshot, you'll probably make the same mistakes in a different form.

This Isn't the Finish Line

I hesitated before writing this. The last article got 1.3 million views. Everyone praised the system. Now saying "that system had problems" - am I slapping myself in the face?

No. That article's core was right. This reorg is finally taking seriously what that article actually meant.

But I'm not going to pretend the current system is perfect. It's not. There's still a lot to improve.

What's next:

Each role needs a complete skill tree and capability boundary. Not just "what are you responsible for," but "what can you do, what can't you do, when do you take over, who do you hand off to when you're done." The current SOUL.md and role docs are step one, but they still need to become hard operational boundaries, not just good documentation.

I need a real Ops Room. An internal operations control center, not fragments scattered across chat windows. I'm looking at Telegram or Discord for a dedicated space where all daily briefings, anomaly alerts, decision requests, and completion summaries converge in one place.

And a meeting cadence. Not every agent talking all the time, but with rhythm: morning standups, war rooms, retrospectives. Speak when it's time to speak, be quiet when it's not.

Over the next few days, I'll start feeding real daily work into the system. Not test tasks. Real work that requires multiple roles collaborating to complete. Let the whole flow actually run, and see where it breaks, where it stalls, and where new fake jobs emerge.

These are all next steps toward "making it actually feel like a real company." Not adding more roles, but making the connections between existing roles stronger, clearer, and more rhythmic.

The tools will change. The models will get better. But the system you start building today can accumulate the right things: decisions, summaries, handoffs, and operating context. If you curate that memory well, better models later won't need to relearn your company from scratch. Six months from now, when the models are twice as smart, they won't need to get to know you again.

Why This Matters

In 2025, everyone was learning how to build one AI agent. By 2026, the question changed: how do multiple agents work together?

This question will only get more important. Whatever industry you're in, if you use AI agents, you'll eventually hit this point: building is easy. Collaboration is hard.

And the first step of collaboration isn't adding more connections. It's cutting outputs that have no consumer.

My AI company had its first reorg today. It won't be the last. But it's the most important one, because this is the one where I finally stopped pretending.

In an AI company, if an agent's output isn't being consumed by any other agent, it's a fake job.

Delete it. Your company won't get smaller. It'll start feeling like a company for the first time.

If you're curious what they're doing right now, voxyz.space/stage is the live activity feed. It used to look like a bunch of messages. Now it looks more like a storyline of work.

In your agent system, is there a role where all the metrics are green, but nobody's actually consuming the output?

The Full Series

If you're just getting started, or want to follow the full build from day one:

I Built an AI Company with OpenClaw + Vercel + Supabase - Two Weeks Later, They Run It Themselves - where it all started. 1.3M views.
The Full Tutorial: 6 AI Agents That Run a Company - How I Built Them From Scratch - step-by-step build guide.
I Turned My AI Agents Into RPG Characters. Now I Can't Stop Checking If They Leveled Up. - personality and identity framework.
I Rent a Server for $8/Month to Run OpenClaw. 6 AI Employees Live Inside It. - infrastructure and cost breakdown.
If I Were Starting AI Today, This Is Exactly What I'd Do - beginner roadmap.
I Built an AI Company with OpenClaw. Now It's Hiring. - the swarm learns to recruit.
The Hidden Layer in OpenClaw Swarms: Make Them Disagree, See Who Survives - adversarial review framework.
You are here → I Built an AI Company with OpenClaw. Today, It Had Its First Reorg.