Agent OS

WorkMan

Lightweight operating system for multi-agent AI work. Structured collaboration, governed prompts, and a runtime that measures and improves itself.

Runtime roles

Authority
  • Principal — human or autonomous agent; owns policy and bounds
  • Director — governs work; manual, assisted, or fully autonomous
  • Manager — observes, aggregates, plans
  • Operator — executes commands, mutates state
Execution
  • Seats — configurable runtime slots; any engine via adapter
  • Workers — do the work inside a bench
  • Review stance — a worker checking a peer's pass (not a separate role)
  • Loop — automated prompt delivery and state monitoring

Overview

Core ideas

Orchestration

Multi-agent collaboration

Isolated workbenches with built-in peer review. Parallel execution across tasks. Pluggable agent seats assigned by capability.

State

Stages & runtime truth

Every task moves through defined stages with explicit transitions. One canonical state file per task tracks stage, round, active agent, and pending approvals.
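
As an illustration only, the state file and its transitions can be sketched as below. The stage names come from the staged work cycle listed under "How work gets done"; the field names, the transition table, and the rule that pending approvals block a move are assumptions, not WorkMan's actual schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional

# Hypothetical transition table: forward one stage at a time, with a step
# back to implementation allowed from the two checking stages.
ALLOWED = {
    "intake": {"planning"},
    "planning": {"implementation"},
    "implementation": {"verification"},
    "verification": {"adversarial_review", "implementation"},
    "adversarial_review": {"closeout", "implementation"},
    "closeout": set(),
}


@dataclass
class TaskState:
    """One record per task (assumed fields, mirroring the text above)."""
    task_id: str
    stage: str = "intake"
    round: int = 1
    active_agent: Optional[str] = None
    pending_approvals: List[str] = field(default_factory=list)


def advance(state: TaskState, target: str) -> TaskState:
    """Return a new state at `target`, or raise if the move is not allowed."""
    if target not in ALLOWED[state.stage]:
        raise ValueError(f"illegal transition: {state.stage} -> {target}")
    if state.pending_approvals:
        raise ValueError(f"blocked by pending approvals: {state.pending_approvals}")
    # A step back to implementation (a rejected check) starts a new round.
    rework = target == "implementation" and state.stage != "planning"
    return TaskState(
        task_id=state.task_id,
        stage=target,
        round=state.round + 1 if rework else state.round,
        active_agent=state.active_agent,
        pending_approvals=[],
    )


def to_json(state: TaskState) -> str:
    """Serialize the state as it might be written to a plain file."""
    return json.dumps(asdict(state), indent=2, sort_keys=True)
```

Keeping every legal move in one table means a skipped stage fails loudly instead of leaving the file in an ambiguous state.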

Intelligence

Context engineering

Prompts assembled from live runtime state and policy constraints rather than static templates. Optional Cortex retrieval augments intake when configured.
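
A minimal sketch of what "assembled from live state" could look like, assuming a prompt is plain text built from three inputs plus an optional retrieval callback. The function name `assemble_prompt`, the section labels, and the `retrieve` signature are hypothetical; the real Cortex interface is not described here.

```python
from typing import Callable, Mapping, Optional, Sequence


def assemble_prompt(
    state: Mapping[str, object],
    policy: Sequence[str],
    task: str,
    retrieve: Optional[Callable[[str], Sequence[str]]] = None,
) -> str:
    """Build a prompt from current state, policy constraints, and the task.

    `retrieve` stands in for an optional retrieval service; when it is
    None the prompt is built from state and policy alone.
    """
    lines = ["[Runtime state]"]
    for key in sorted(state):  # sorted so the output is deterministic
        lines.append(f"{key}: {state[key]}")

    lines.append("[Policy constraints]")
    lines.extend(f"- {rule}" for rule in policy)

    if retrieve is not None:
        notes = list(retrieve(task))
        if notes:  # omit the section entirely when nothing was retrieved
            lines.append("[Retrieved guidance]")
            lines.extend(f"- {note}" for note in notes)

    lines.append("[Task]")
    lines.append(task)
    return "\n".join(lines)
```

Because the prompt is rebuilt from state on every call, a change in stage or policy shows up in the next instruction without editing any template.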

Fitness

Qualification registry

Measured fitness per agent per task type, recorded in an evidence-backed registry with explain surfaces. The registry emits advisory warnings today; automatic assignment changes are planned, not shipped.
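
One way to picture such a registry, as a sketch under stated assumptions: fitness is taken to be a simple pass rate, evidence is a pass/fail record per task, and the thresholds are invented for illustration. `advise` only returns a warning string; it never blocks or reassigns, matching the advisory-only behavior described above.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, List, Optional, Tuple


@dataclass(frozen=True)
class Evidence:
    task_id: str  # which task produced this observation
    passed: bool  # whether the agent's work passed its checks


class QualificationRegistry:
    """Hypothetical registry keyed by (agent, task type)."""

    def __init__(self, min_samples: int = 3, warn_below: float = 0.6) -> None:
        self.min_samples = min_samples
        self.warn_below = warn_below
        self._records: DefaultDict[Tuple[str, str], List[Evidence]] = defaultdict(list)

    def record(self, agent: str, task_type: str, evidence: Evidence) -> None:
        self._records[(agent, task_type)].append(evidence)

    def fitness(self, agent: str, task_type: str) -> Optional[float]:
        """Pass rate in [0, 1], or None when there is no evidence yet."""
        records = self._records.get((agent, task_type), [])
        if not records:
            return None
        return sum(e.passed for e in records) / len(records)

    def explain(self, agent: str, task_type: str) -> str:
        """List the evidence behind the score."""
        records = self._records.get((agent, task_type), [])
        passed = [e.task_id for e in records if e.passed]
        failed = [e.task_id for e in records if not e.passed]
        return f"{agent}/{task_type}: passed={passed} failed={failed}"

    def advise(self, agent: str, task_type: str) -> Optional[str]:
        """Return an advisory warning, or None. Never changes assignment."""
        records = self._records.get((agent, task_type), [])
        if len(records) < self.min_samples:
            return f"advisory: only {len(records)} sample(s) for {agent} on {task_type}"
        score = self.fitness(agent, task_type)
        if score is not None and score < self.warn_below:
            return f"advisory: {agent} pass rate on {task_type} is {score:.0%}"
        return None
```

Storing the raw evidence rather than only the score is what makes an explain surface possible: any warning can be traced back to the tasks that produced it.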

Integration

Dev infrastructure

Git workflows, PR management, workspace orchestration, and bench coordination. The runtime connects to the tools and processes that real multi-agent development depends on.

Adaptivity

Adaptivity

Pluggable adapters for different AI engines. Configurable seats, policies, and automation levels. The runtime adjusts to what's available and what works.
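
The adapter idea can be sketched as a small interface that each engine implements, with seats holding an adapter and a set of capabilities. Everything named here (`EngineAdapter`, `EchoAdapter`, `Seat`, `pick_seat`) is hypothetical and only illustrates the shape of a pluggable design, not WorkMan's real adapter contract.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Protocol


class EngineAdapter(Protocol):
    """Anything that can take a prompt and return the engine's reply."""
    name: str

    def send(self, prompt: str) -> str: ...


class EchoAdapter:
    """Stand-in engine for local testing; a real adapter would call a CLI."""
    name = "echo"

    def send(self, prompt: str) -> str:
        return f"[echo] {prompt}"


@dataclass(frozen=True)
class Seat:
    seat_id: str
    adapter: EngineAdapter
    capabilities: FrozenSet[str]


def pick_seat(seats: Iterable[Seat], required: Iterable[str]) -> Optional[Seat]:
    """Return the first seat that covers every required capability."""
    needed = frozenset(required)
    for seat in seats:
        if needed <= seat.capabilities:
            return seat
    return None
```

Swapping engines then means writing one new adapter class; seat selection and the rest of the runtime stay unchanged.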

Design

How work gets done

  • Design before code — write down what should be true and what would prove it wrong before implementation starts
  • Staged work cycles — intake, planning, implementation, verification, adversarial review, closeout
  • Multi-perspective analysis — parallel agents explore from different angles, cross-check, and converge on a result
  • Built-in peer review — one agent works, another reviews with reject/fix cycles
  • Automated consistency checks — cross-references, timestamps, and artifacts validated at each stage boundary
  • Structured prompts — every instruction built from current state, policy constraints, and the specific task at hand
  • Fitness measurement — track which agents do well at which tasks and use that evidence to inform future assignment (advisory today; automatic reassignment is planned)
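
The peer review item in the list above describes a reject/fix cycle. A minimal sketch of that loop follows, assuming the worker and reviewer are plain callables and that an empty list of findings means acceptance; the names `review_cycle` and `ReviewResult` and the round cap are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ReviewResult:
    accepted: bool  # True when the reviewer raised no findings
    rounds: int     # how many work/review rounds were run
    output: str     # the last draft produced by the worker


def review_cycle(
    work: Callable[[str, List[str]], str],
    review: Callable[[str], List[str]],
    task: str,
    max_rounds: int = 3,
) -> ReviewResult:
    """Alternate work and review until accepted or the round cap is hit."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    feedback: List[str] = []
    draft = ""
    for round_no in range(1, max_rounds + 1):
        draft = work(task, feedback)  # worker sees the previous findings
        feedback = review(draft)      # reviewer returns findings, if any
        if not feedback:
            return ReviewResult(True, round_no, draft)
    return ReviewResult(False, max_rounds, draft)
```

The round cap keeps a persistent disagreement from looping forever; an unaccepted result can then be escalated rather than retried.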

Requirements

Dependencies

  • bash 5.0+ — runtime shell
  • git 2.30+ — worktree isolation and branch management
  • tmux 3.2+ — agent session hosting and seat panes
  • Node.js — launcher and operator view surfaces
  • Agent CLI — Claude Code, Codex, or any engine with a CLI prompt interface
  • Cortex (optional) — prompt intelligence service for retrieval-augmented guidance
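
The version floors above can be checked with a short preflight script. This is a sketch, not a tool shipped with WorkMan: the helper names are hypothetical, and the minimums are copied from the list (Node.js is left out because no minimum is stated). It relies on `bash --version`, `git --version`, and `tmux -V` printing a `major.minor` number.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple

# Minimum versions as listed in the requirements above.
MINIMUMS = {"bash": (5, 0), "git": (2, 30), "tmux": (3, 2)}
VERSION_FLAG = {"bash": "--version", "git": "--version", "tmux": "-V"}


def parse_version(text: str) -> Optional[Tuple[int, int]]:
    """Extract the first major.minor pair from a tool's version output."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_minimum(tool: str, version_output: str) -> bool:
    """Compare parsed output against the floor for `tool`."""
    found = parse_version(version_output)
    return found is not None and found >= MINIMUMS[tool]


def check(tool: str) -> bool:
    """Run the tool's version command; False if missing or too old."""
    if shutil.which(tool) is None:
        return False
    result = subprocess.run(
        [tool, VERSION_FLAG[tool]], capture_output=True, text=True, check=False
    )
    return meets_minimum(tool, result.stdout)
```

Calling `check(tool)` for each key in `MINIMUMS` gives a quick pass/fail view before starting a session; parsing is kept separate from the subprocess call so it can be tested without the tools installed.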

Architecture

Local-first, hostable, measurable

Runs on your machine. Manages agent sessions through tmux. Keeps state in plain files. Assembles prompts from live context, with optional Cortex retrieval. Measures fitness and feeds it back as advisory signals. Projects to hosted views when ready.