Queue system

Status: Implemented (single YAML) · Designed (split YAML + SQLite)

Source: doc/governance/Agent_Work_Queue.yaml (16,178 lines, 406 tasks) · doc/governance/Agent_Queue_Structured_Store_v1.md · doc/governance/Task_Authoring_Standard.md · scripts/ci/agent_queue_validate.sh · scripts/ci/agent_queue_git_consistency.sh

The platform's work queue is a single YAML file today, with a designed evolution to a split YAML-plus-SQLite store. This page covers the file format, the task lifecycle, validation, and the evolution path.

Current state: single YAML

doc/governance/Agent_Work_Queue.yaml:

16,178 lines
406 tasks
4 states: todo, in_progress, blocked, done
3 roles: A-backend, B-ui, C-ops (plus D-arch / E-governance review lanes)
Schema-validated by scripts/ci/agent_queue_validate.sh
Git-consistency-checked by scripts/ci/agent_queue_git_consistency.sh

File shape

erDiagram
    queue ||--o{ task : "contains"
    task ||--o{ acceptance_check : "must pass"
    task ||--o{ depends_on : "blocked by"

    queue {
        int version
        date updated_at
        list states "todo, in_progress, blocked, done"
        list roles "A-backend, B-ui, C-ops"
    }
    task {
        text id PK "X-AREA-FEATURE-NNN"
        text title
        text role
        int priority "1 highest"
        text status
        text owner
        text branch "agent/<lane>"
        date completed_at
        text commit
        list depends_on
        list acceptance_checks "machine-runnable commands"
        text notes "block scalar with mandatory sections"
    }
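To make the shape above concrete, here is a minimal in-memory sketch of one task record. The field names come from the diagram; the defaults, the string-typed date, and the `problems()` helper are assumptions for illustration, not the schema enforced by `agent_queue_validate.sh`.

```python
from dataclasses import dataclass, field
from typing import List, Optional

STATES = ("todo", "in_progress", "blocked", "done")
ROLES = ("A-backend", "B-ui", "C-ops")


@dataclass
class Task:
    """One queue entry; field names follow the diagram above."""
    id: str                               # e.g. "A-PROV-SCHED-001"
    title: str
    role: str                             # one of ROLES
    priority: int                         # 1 is highest
    status: str = "todo"                  # one of STATES
    owner: Optional[str] = None
    branch: Optional[str] = None          # "agent/<lane>"
    completed_at: Optional[str] = None    # ISO date string (assumption)
    commit: Optional[str] = None
    depends_on: List[str] = field(default_factory=list)
    acceptance_checks: List[str] = field(default_factory=list)
    notes: str = ""

    def problems(self) -> List[str]:
        """Return field-level problems; an empty list means none were found."""
        found = []
        if self.status not in STATES:
            found.append(f"unknown status: {self.status}")
        if self.role not in ROLES:
            found.append(f"unknown role: {self.role}")
        if self.priority < 1:
            found.append("priority must be >= 1")
        return found
```

A record built with only `id`, `title`, `role`, and `priority` starts in `todo` with no owner, which matches the entry point of the lifecycle described later on this page.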

Required notes sections (per Task Authoring Standard)

Every task's notes block must include these sections, in order:

mindmap
  root((notes block))
    Context
      why the task exists
      what problem it solves
      links to source docs
    Pre-read required
      AGENTS.md
      Coding_Standards.md sections
      affected source files
      relevant ADRs
    Invariants must not break
      contracts that cannot change
      tests that must stay green
      security properties
      scoped boundaries
    Scope
      In: explicit list
      Out: explicit exclusions
    Acceptance evidence
      what proves done
      commands that verify

→ Source: Task_Authoring_Standard.md
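The "in order" requirement can be checked mechanically. The sketch below assumes each section starts on its own line written as `<Name>:` and uses the mindmap labels as the heading text; the exact heading syntax is defined by Task_Authoring_Standard.md and may differ, and `missing_or_misordered` is a hypothetical helper.

```python
from typing import List

REQUIRED_SECTIONS = [
    "Context",
    "Pre-read required",
    "Invariants must not break",
    "Scope",
    "Acceptance evidence",
]


def missing_or_misordered(notes: str) -> List[str]:
    """Return required section names that are absent or out of order.

    Assumption: a section starts on a line of the form "<Name>:".
    """
    headings = [line.strip().rstrip(":") for line in notes.splitlines()
                if line.strip().endswith(":")]
    problems = []
    position = 0  # where the search for the next required section starts
    for name in REQUIRED_SECTIONS:
        try:
            position = headings.index(name, position) + 1
        except ValueError:
            # Either not present at all, or present only before `position`.
            problems.append(name)
    return problems
```

The check is greedy: a section that appears too early is reported as a problem rather than silently accepted, which is usually what a reviewer wants to see.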

Task ID conventions

flowchart LR
    ID[Task ID: X-AREA-FEATURE-NNN] --> X[X = lane]
    X --> XA[A — backend]
    X --> XB[B — UI]
    X --> XC[C — ops]

    ID --> AREA[AREA tag<br/>PROV, V3, IAM, BILL,<br/>APPS, SLICE, TERM, STORAGE,<br/>OPS, CLEAN, RUNBOOK, etc.]
    ID --> FEAT[Feature/component slug]
    ID --> NNN[3-digit serial]

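Real examples already in the queue. Note that the feature slug is optional (`C-OPS-002`) and may itself contain hyphens (`B-V3-ACK-SUPPRESS-CONTROL-001`):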

| ID | Meaning |
|---|---|
| `A-PROV-SCHED-001` | A-backend, Provisioning scheduler, #1 |
| `B-V3-ACK-SUPPRESS-CONTROL-001` | B-ui, V3 redesign Ack/Suppress, #1 |
| `C-OPS-002` | C-ops, second item (PKI step-ca staging) |
| `A-CLEAN-001` | A-backend, cleanup sprint, #1 |
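The convention can be expressed as a regular expression. This is a sketch inferred from the diagram and the four examples above, not the pattern used by the validator: it assumes lanes A to C only, uppercase alphanumeric segments, and an optional feature slug. `TaskId` and `parse_task_id` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

# lane letter, AREA tag, optional (possibly hyphenated) feature slug, 3-digit serial
TASK_ID = re.compile(
    r"^(?P<lane>[A-C])"
    r"-(?P<area>[A-Z0-9]+)"
    r"(?:-(?P<feature>[A-Z0-9]+(?:-[A-Z0-9]+)*))?"
    r"-(?P<serial>\d{3})$"
)


class TaskId(NamedTuple):
    lane: str
    area: str
    feature: Optional[str]  # None when the ID has no feature slug
    serial: int


def parse_task_id(raw: str) -> TaskId:
    """Split a task ID into its parts, or raise ValueError if it does not match."""
    m = TASK_ID.match(raw)
    if m is None:
        raise ValueError(f"not a valid task ID: {raw!r}")
    return TaskId(m["lane"], m["area"], m["feature"], int(m["serial"]))
```

For instance, `parse_task_id("C-OPS-002")` yields lane `C`, area `OPS`, no feature slug, and serial 2.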

Task lifecycle

stateDiagram-v2
    [*] --> todo: queued by author
    todo --> in_progress: queue-claim by role
    in_progress --> blocked: depends_on unresolved<br/>or waiting on external input
    blocked --> in_progress: dependency clears
    in_progress --> done: acceptance checks pass<br/>+ commit recorded
    done --> [*]

    note right of in_progress
      Only one owner at a time
      Branch is agent/<lane>
      Status mutations are atomic
      via queue-claim / queue-set-status
    end note

    note right of done
      Cannot be done without:
      - acceptance_checks all green
      - completed_at + commit set
      - PR merged
    end note

    note right of blocked
      Coordinator can re-rank blocked tasks
      and surface unblocking dependencies
    end note
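The state diagram reduces to a small transition table. The sketch below encodes only the edges drawn above plus the two `done` preconditions visible on the task record; it is an illustration, not the logic inside `queue-set-status`, and `can_transition` is a hypothetical helper.

```python
# Allowed edges, taken from the state diagram.
ALLOWED = {
    "todo": {"in_progress"},
    "in_progress": {"blocked", "done"},
    "blocked": {"in_progress"},
    "done": set(),
}


def can_transition(task: dict, new_status: str) -> bool:
    """Return True if `task` may move to `new_status`.

    Only record-level rules are checked here. Whether the acceptance
    checks are green and the PR is merged cannot be read off the record.
    """
    current = task["status"]
    if new_status not in ALLOWED.get(current, set()):
        return False
    if new_status == "done":
        # done requires both completed_at and commit to be set
        return bool(task.get("completed_at")) and bool(task.get("commit"))
    return True
```

Note that `todo` cannot jump straight to `done`, and `done` is terminal: reopening work would need a new task.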

Validation gates

flowchart TB
    PR[PR modifying queue YAML]
    PR --> G1[agent_queue_validate.sh]
    G1 --> C1{Schema valid?}
    C1 -- no --> X1[Block PR]
    C1 -- yes --> C2{Task IDs unique?}
    C2 -- no --> X1
    C2 -- yes --> C3{Status in enum?}
    C3 -- no --> X1
    C3 -- yes --> C4{depends_on non-circular?}
    C4 -- no --> X1
    C4 -- yes --> G2[agent_queue_git_consistency.sh]
    G2 --> C5{Queue ↔ git agree?<br/>branch exists,<br/>status matches commit state}
    C5 -- no --> X1
    C5 -- yes --> OK[Pass]

    classDef block fill:#f8d7da,stroke:#42101e
    classDef ok fill:#d1e7dd,stroke:#0a3622
    class X1 block
    class OK ok
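Three of the gates above (unique IDs, status enum, non-circular `depends_on`) are easy to sketch over already-parsed task dictionaries. This is not the content of `agent_queue_validate.sh`; the `validate` function, its error strings, and the choice to ignore unknown dependency IDs are assumptions.

```python
from typing import List

STATES = {"todo", "in_progress", "blocked", "done"}


def validate(tasks: List[dict]) -> List[str]:
    """Return a list of error strings; an empty list means the checks passed."""
    errors = []
    seen = set()
    for task in tasks:
        if task["id"] in seen:
            errors.append(f"duplicate id: {task['id']}")
        seen.add(task["id"])
        if task["status"] not in STATES:
            errors.append(f"{task['id']}: status {task['status']!r} not in enum")

    # Cycle detection: depth-first search with three-colour marking.
    graph = {t["id"]: t.get("depends_on", []) for t in tasks}
    WHITE, GREY, BLACK = 0, 1, 2
    colour = dict.fromkeys(graph, WHITE)

    def visit(node: str) -> bool:
        """Return True if a cycle is reachable from `node`."""
        colour[node] = GREY
        for dep in graph[node]:
            state = colour.get(dep, BLACK)  # unknown ids are ignored in this sketch
            if state == GREY or (state == WHITE and visit(dep)):
                return True
        colour[node] = BLACK
        return False

    for node in graph:
        if colour[node] == WHITE and visit(node):
            errors.append(f"dependency cycle involving {node}")
            break  # report the first cycle only
    return errors
```

A real gate would probably also reject `depends_on` entries that point at IDs missing from the queue; that check is left out here to keep the sketch short.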

Why the current model has limits

The structured-store doc lists the failure modes that motivate the evolution:

mindmap
  root((Current YAML pain points))
    Hand-edit drift
      owner / completed_at / commit
      easily mis-edited
      validator catches most but not all
    Validator breakage
      from queue-format mistakes
      every fix is operator work
    Archive sweeps
      recurring operational burden
      done tasks pile up in 16k-line file
    Weak historical trace
      claims and blockers not preserved
      handoffs not auditable
    Rising merge friction
      multiple planning updates touch one YAML
      conflict prone

→ Source: Agent_Queue_Structured_Store_v1.md §1 Problem Statement

Designed evolution: split YAML + SQLite

The decision is to split definition from state:

flowchart LR
    classDef def fill:#e3f2fd,stroke:#1565c0
    classDef state fill:#fff3e0,stroke:#e65100
    classDef tool fill:#d1e7dd,stroke:#0a3622

    subgraph DEF[YAML — durable definition]
        Y[Agent_Work_Queue.yaml<br/>id, title, role, priority,<br/>depends_on, acceptance_checks,<br/>notes]:::def
    end
    subgraph STATE[SQLite — mutable execution state + history]
        S[task_state, task_history,<br/>claim, blocker, handoff,<br/>completed_at, commit]:::state
    end

    Y --> CMD[Operator commands unchanged:<br/>make queue-claim<br/>make queue-set-status<br/>make queue-validate]:::tool
    S --> CMD

    note[Operators see the same CLI.<br/>YAML is what; SQLite is where.]

Properties of the split model:

| Property | YAML | SQLite |
|---|---|---|
| Holds | Task definitions and planning | Live execution state + history |
| Mutation frequency | Low (when work is planned) | High (every claim, status change) |
| Source of truth for | What the work is | Where the work stands |
| Format | Human-readable | SQL, can be exported |
| History | Git log on YAML | Append-only `task_history` table |
| Conflict surface | Planning PRs | None (single-writer per row) |

This is designed but not yet implemented; today everything is in the YAML.

→ Source: Agent_Queue_Structured_Store_v1.md
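Because the store is not implemented yet, the following is only a sketch of what the state side could look like. The table names `task_state` and `task_history` come from the design; every column name, the triggers that enforce append-only history, and the `open_store` and `claim` helpers are assumptions for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE task_state (
    task_id      TEXT PRIMARY KEY,
    status       TEXT NOT NULL
                 CHECK (status IN ('todo', 'in_progress', 'blocked', 'done')),
    owner        TEXT,
    completed_at TEXT,
    commit_sha   TEXT
);
CREATE TABLE task_history (
    seq     INTEGER PRIMARY KEY AUTOINCREMENT,
    task_id TEXT NOT NULL,
    event   TEXT NOT NULL,   -- e.g. claim, blocker, handoff
    detail  TEXT,
    at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Append-only: reject UPDATE and DELETE on history rows.
CREATE TRIGGER task_history_no_update BEFORE UPDATE ON task_history
BEGIN SELECT RAISE(ABORT, 'task_history is append-only'); END;
CREATE TRIGGER task_history_no_delete BEFORE DELETE ON task_history
BEGIN SELECT RAISE(ABORT, 'task_history is append-only'); END;
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create a fresh store; in-memory by default so the sketch is side-effect free."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def claim(conn: sqlite3.Connection, task_id: str, owner: str) -> None:
    """Record a claim: the state row and its history row commit together."""
    with conn:  # single transaction, rolled back on error
        conn.execute(
            "INSERT OR REPLACE INTO task_state (task_id, status, owner) "
            "VALUES (?, 'in_progress', ?)",
            (task_id, owner),
        )
        conn.execute(
            "INSERT INTO task_history (task_id, event, detail) "
            "VALUES (?, 'claim', ?)",
            (task_id, owner),
        )
```

Writing the state change and the history row in one transaction is what would give the "preserved claims and handoffs" property that the YAML model lacks.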

Three-way reconciliation

The queue is reconciled against git and the execution ledger:

flowchart TB
    Q[Agent_Work_Queue.yaml<br/>406 tasks] --> R[Three-way reconciliation]
    G[Git history<br/>commits by branch] --> R
    EP[Execution_Progress.md<br/>commit-level ledger] --> R

    R --> RULE[All three must agree on:<br/>1. task.commit matches a real commit<br/>2. task.completed_at appears in ledger<br/>3. acceptance_checks ran green]

    R -.fail.-> ALERT[Inconsistency surfaces<br/>via make queue-git-check]

    classDef src fill:#fff3cd,stroke:#332701
    class Q,G,EP src
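The first two agreement rules can be sketched as a pure function over already-loaded data. The input shapes are assumptions: `git_commits` is a set of commit hashes, and `ledger` maps a commit hash to the completion date recorded in Execution_Progress.md. This is not the implementation behind `make queue-git-check`, and the third rule (acceptance checks ran green) needs CI results that are not modelled here.

```python
from typing import Dict, Iterable, List, Set


def reconcile(tasks: Iterable[dict],
              git_commits: Set[str],
              ledger: Dict[str, str]) -> List[str]:
    """Return inconsistencies between queue, git, and ledger for done tasks."""
    problems = []
    for task in tasks:
        if task.get("status") != "done":
            continue  # only completed tasks carry commit and completed_at
        commit = task.get("commit")
        if commit not in git_commits:
            problems.append(f"{task['id']}: commit {commit!r} not found in git")
        if ledger.get(commit) != task.get("completed_at"):
            problems.append(f"{task['id']}: completed_at not mirrored in ledger")
    return problems
```

An empty result means the three sources agree on every `done` task under these two rules.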

Archive sweeps

Tasks in the `done` state accumulate in the queue file. Periodic archive sweeps move them to:

doc/governance/archive/Agent_Work_Queue_<date>.yaml

The archive is reference-only: never re-claim work from an archive file. Sweeps keep the active queue at a manageable size. The structured-store design moves history into SQLite so that these sweeps stop being an operational burden.
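A sweep is essentially a partition of the task list. The sketch below assumes tasks are already parsed into dictionaries and that `<date>` in the archive filename is an ISO date; both are assumptions, and `sweep` and `archive_path` are hypothetical helpers rather than existing tooling.

```python
from datetime import date
from typing import List, Tuple


def sweep(tasks: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Split tasks into (active, archived); only done tasks are archived."""
    active = [t for t in tasks if t["status"] != "done"]
    archived = [t for t in tasks if t["status"] == "done"]
    return active, archived


def archive_path(day: date) -> str:
    """Build the archive filename; the ISO date format is an assumption."""
    return f"doc/governance/archive/Agent_Work_Queue_{day.isoformat()}.yaml"
```

One thing a real sweep has to handle is an active task whose `depends_on` names an archived ID; the sketch does not address that.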

Queue ↔ phases ↔ trackers

flowchart LR
    Q[Agent Work Queue] --> P[Phases<br/>Implementation_Roadmap.md]
    Q --> T[Gap trackers<br/>App Platform, GPU Slice, V3 Migration]
    Q --> X[Execution Progress ledger]

    P -. task id maps to phase .-> Q
    T -. gap maps to one or more tasks .-> Q
    X -. commit and completed_at mirror .-> Q

    classDef k fill:#d1e7dd,stroke:#0a3622
    class P,T,X k

Operator playbook (current)

# 1. validate (always before claim)
make queue-validate

# 2. find eligible work for your role
make queue-list ROLE=A STATUS=todo

# 3. claim a task
make queue-claim ID=A-PROV-SCHED-001 OWNER=A
# → status flips to in_progress, branch agent/A required

# 4. switch to lane worktree
cd .claude/worktrees/A-backend

# 5. do the work; commit normally

# 6. mark complete with commit reference
make queue-set-status ID=A-PROV-SCHED-001 \
                       STATUS=done \
                       COMMIT=$(git rev-parse HEAD)

# 7. verify queue ↔ git consistency
make queue-git-check

Where to look next

Active work queue (live status) — current counts + recent commits
Multi-agent orchestration — coordinator that consumes the queue
Lane worktrees — branch + worktree topology
Policy & enforcement — gates around queue mutations
Source: Agent_Work_Queue.yaml, Agent_Queue_Structured_Store_v1.md, Task_Authoring_Standard.md