Queue system
Implemented Designed
doc/governance/Agent_Work_Queue.yaml (16,178 lines, 406 tasks) · doc/governance/Agent_Queue_Structured_Store_v1.md · doc/governance/Task_Authoring_Standard.md · scripts/ci/agent_queue_validate.sh · scripts/ci/agent_queue_git_consistency.sh
The platform's work queue is a single YAML file today, with a designed evolution to a split YAML-plus-SQLite store. This page covers the file format, the task lifecycle, validation, and the evolution path.
Current state: single YAML
doc/governance/Agent_Work_Queue.yaml:
- 16,178 lines
- 406 tasks
- 4 states: todo, in_progress, blocked, done
- 3 roles: A-backend, B-ui, C-ops (plus D-arch / E-governance review lanes)
- Schema-validated by scripts/ci/agent_queue_validate.sh
- Git-consistency-checked by scripts/ci/agent_queue_git_consistency.sh
File shape
erDiagram
queue ||--o{ task : "contains"
task ||--o{ acceptance_check : "must pass"
task ||--o{ depends_on : "blocked by"
queue {
int version
date updated_at
list states "todo, in_progress, blocked, done"
list roles "A-backend, B-ui, C-ops"
}
task {
text id PK "X-AREA-FEATURE-NNN"
text title
text role
int priority "1 highest"
text status
text owner
text branch "agent/<lane>"
date completed_at
text commit
list depends_on
list acceptance_checks "machine-runnable commands"
text notes "block scalar with mandatory sections"
}
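The task entity above can be read as a plain record type. The following is a minimal sketch, not the project's actual loader: field names follow the diagram, while the defaults, the `Optional` typing, and the `problems()` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Enumerations as listed in the queue header of the diagram.
STATES = ("todo", "in_progress", "blocked", "done")
ROLES = ("A-backend", "B-ui", "C-ops")


@dataclass
class Task:
    """One queue entry; field names mirror the diagram, defaults are assumed."""
    id: str                               # X-AREA-FEATURE-NNN
    title: str
    role: str
    priority: int                         # 1 is highest
    status: str = "todo"
    owner: Optional[str] = None
    branch: Optional[str] = None          # agent/<lane>
    completed_at: Optional[str] = None    # date string, set on completion
    commit: Optional[str] = None
    depends_on: List[str] = field(default_factory=list)
    acceptance_checks: List[str] = field(default_factory=list)
    notes: str = ""

    def problems(self) -> List[str]:
        """Hypothetical helper: list enum and range violations for this record."""
        found = []
        if self.role not in ROLES:
            found.append(f"unknown role: {self.role}")
        if self.status not in STATES:
            found.append(f"unknown status: {self.status}")
        if self.priority < 1:
            found.append("priority must be >= 1")
        return found
```

A record built with only the required fields, such as `Task(id="A-PROV-SCHED-001", title="...", role="A-backend", priority=1)`, starts in `todo` and reports no problems.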
Required notes sections (per Task Authoring Standard)
Every task's notes block must include these sections, in order:
mindmap
  root((notes block))
    Context
      why the task exists
      what problem it solves
      links to source docs
    Pre-read required
      AGENTS.md
      Coding_Standards.md sections
      affected source files
      relevant ADRs
    Invariants must not break
      contracts that cannot change
      tests that must stay green
      security properties
      scoped boundaries
    Scope
      In: explicit list
      Out: explicit exclusions
    Acceptance evidence
      what proves done
      commands that verify
→ Source: Task_Authoring_Standard.md
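The "in order" requirement can be checked mechanically. The sketch below assumes that each section begins on its own line and that the line starts with the section title; the actual heading syntax is defined in Task_Authoring_Standard.md and may differ. The function name `missing_or_misordered` is hypothetical.

```python
# Section titles from the list above, in the required order.
REQUIRED_SECTIONS = [
    "Context",
    "Pre-read required",
    "Invariants",
    "Scope",
    "Acceptance evidence",
]


def missing_or_misordered(notes: str) -> list:
    """Return required sections that are absent or appear out of order.

    Assumption: a section starts on a line whose text begins with its title
    (case-insensitive). Each section must be found after the previous one.
    """
    lines = [line.strip().lower() for line in notes.splitlines()]
    problems = []
    cursor = 0  # first line index still eligible to match
    for section in REQUIRED_SECTIONS:
        wanted = section.lower()
        hit = next(
            (i for i in range(cursor, len(lines)) if lines[i].startswith(wanted)),
            None,
        )
        if hit is None:
            problems.append(section)
        else:
            cursor = hit + 1
    return problems
```

An empty result means all five sections were found in sequence. A section that exists but sits before its predecessor is reported just like a missing one, which is usually enough for a review gate.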
Task ID conventions
flowchart LR
ID[Task ID: X-AREA-FEATURE-NNN] --> X[X = lane]
X --> XA[A — backend]
X --> XB[B — UI]
X --> XC[C — ops]
ID --> AREA[AREA tag<br/>PROV, V3, IAM, BILL,<br/>APPS, SLICE, TERM, STORAGE,<br/>OPS, CLEAN, RUNBOOK, etc.]
ID --> FEAT[Feature/component slug]
ID --> NNN[3-digit serial]
Real examples already in the queue:
| ID | Meaning |
|---|---|
| A-PROV-SCHED-001 | A-backend, Provisioning scheduler, #1 |
| B-V3-ACK-SUPPRESS-CONTROL-001 | B-ui, V3 redesign Ack/Suppress, #1 |
| C-OPS-002 | C-ops, second item (PKI step-ca staging) |
| A-CLEAN-001 | A-backend, cleanup sprint, #1 |
Task lifecycle
stateDiagram-v2
[*] --> todo: queued by author
todo --> in_progress: queue-claim by role
in_progress --> blocked: depends_on unresolved<br/>or external waiting
blocked --> in_progress: dependency clears
in_progress --> done: acceptance checks pass<br/>+ commit recorded
done --> [*]
note right of in_progress
Only one owner at a time
Branch is agent/<lane>
Status mutations are atomic
via queue-claim / queue-set-status
end note
note right of done
Cannot be done without:
- acceptance_checks all green
- completed_at + commit set
- PR merged
end note
note right of blocked
Coordinator can re-rank blocked tasks
and surface unblocking dependencies
end note
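The state diagram translates directly into a transition table. The sketch below is illustrative only: `set_status` is a hypothetical function, not the code behind `make queue-set-status`, and it models just two of the rules shown above (the allowed edges, and the requirement that `done` carries both `commit` and `completed_at`).

```python
# Allowed edges, copied from the state diagram.
ALLOWED = {
    "todo": {"in_progress"},
    "in_progress": {"blocked", "done"},
    "blocked": {"in_progress"},
    "done": set(),
}


def set_status(task: dict, new_status: str, commit=None, completed_at=None) -> dict:
    """Return a copy of the task in its new state, or raise ValueError."""
    current = task["status"]
    if new_status not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new_status}")
    updated = dict(task, status=new_status)
    if new_status == "done":
        # A task cannot be done without a recorded commit and completion date.
        if not (commit and completed_at):
            raise ValueError("done requires both commit and completed_at")
        updated.update(commit=commit, completed_at=completed_at)
    return updated
```

Returning a new dict rather than mutating in place keeps a rejected transition from leaving a half-updated record, which loosely mirrors the "status mutations are atomic" note. Acceptance checks and PR merge status are outside what this sketch verifies.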
Validation gates
flowchart TB
PR[PR modifying queue YAML]
PR --> G1[agent_queue_validate.sh]
G1 --> C1{Schema valid?}
C1 -- no --> X1[Block PR]
C1 -- yes --> C2{Task IDs unique?}
C2 -- no --> X1
C2 -- yes --> C3{Status in enum?}
C3 -- no --> X1
C3 -- yes --> C4{depends_on non-circular?}
C4 -- no --> X1
C4 -- yes --> G2[agent_queue_git_consistency.sh]
G2 --> C5{Queue ↔ git agree?<br/>branch exists,<br/>status matches commit state}
C5 -- no --> X1
C5 -- yes --> OK[Pass]
classDef block fill:#f8d7da,stroke:#42101e
classDef ok fill:#d1e7dd,stroke:#0a3622
class X1 block
class OK ok
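Three of the checks in the gate above (unique IDs, status in enum, non-circular depends_on) can be sketched in a few lines. This is not the logic of agent_queue_validate.sh, whose internals are not shown here. It assumes tasks are already parsed into dicts, skips the schema check, and does not follow dependencies on unknown IDs; `validate_queue` is a hypothetical name.

```python
STATES = {"todo", "in_progress", "blocked", "done"}


def validate_queue(tasks: list) -> list:
    """Return a list of error strings; an empty list means the checks passed."""
    errors = []
    seen = set()
    for t in tasks:
        if t["id"] in seen:
            errors.append(f"duplicate id: {t['id']}")
        seen.add(t["id"])
        if t["status"] not in STATES:
            errors.append(f"{t['id']}: status not in enum: {t['status']}")

    # Cycle detection: depth-first search with three colours.
    graph = {t["id"]: list(t.get("depends_on", [])) for t in tasks}
    WHITE, GREY, BLACK = 0, 1, 2
    colour = dict.fromkeys(graph, WHITE)

    def visit(node) -> bool:
        colour[node] = GREY                      # on the current DFS path
        for dep in graph[node]:
            state = colour.get(dep, BLACK)       # unknown ids are not followed
            if state == GREY:                    # back edge: cycle found
                return True
            if state == WHITE and visit(dep):
                return True
        colour[node] = BLACK                     # fully explored, no cycle
        return False

    for node in graph:
        if colour[node] == WHITE and visit(node):
            errors.append(f"depends_on cycle reachable from {node}")
            break
    return errors
```

The recursive search is adequate for a queue of a few hundred tasks. A much larger graph would call for an iterative traversal to stay clear of Python's recursion limit.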
Why the current model has limits
The structured-store doc lists the failure modes that motivate evolution:
mindmap
  root((Current YAML pain points))
    Hand-edit drift
      owner / completed_at / commit
      easily mis-edited
      validator catches most but not all
    Validator breakage
      from queue-format mistakes
      every fix is operator work
    Archive sweeps
      recurring operational burden
      done tasks pile up in 16k-line file
    Weak historical trace
      claims and blockers not preserved
      handoffs not auditable
    Rising merge friction
      multiple planning updates touch one YAML
      conflict prone
→ Source: Agent_Queue_Structured_Store_v1.md §1 Problem Statement
Designed evolution: split YAML + SQLite
The decision is to split definition from state:
flowchart LR
classDef def fill:#e3f2fd,stroke:#1565c0
classDef state fill:#fff3e0,stroke:#e65100
classDef tool fill:#d1e7dd,stroke:#0a3622
subgraph DEF[YAML — durable definition]
Y[Agent_Work_Queue.yaml<br/>id, title, role, priority,<br/>depends_on, acceptance_checks,<br/>notes]:::def
end
subgraph STATE[SQLite — mutable execution state + history]
S[task_state, task_history,<br/>claim, blocker, handoff,<br/>completed_at, commit]:::state
end
Y --> CMD[Operator commands unchanged:<br/>make queue-claim<br/>make queue-set-status<br/>make queue-validate]:::tool
S --> CMD
note[Operators see the same CLI.<br/>YAML is what; SQLite is where.]
Properties of the split model:
| Property | YAML | SQLite |
|---|---|---|
| Holds | Task definitions and planning | Live execution state + history |
| Mutation frequency | Low (when work is planned) | High (every claim, status change) |
| Source of truth for | What the work is | Where the work stands |
| Format | Human-readable | SQL, can be exported |
| History | Git log on YAML | Append-only task_history table |
| Conflict surface | Planning PRs | None (single-writer per row) |
This is designed but not yet implemented; today everything is in the YAML.
→ Source: Agent_Queue_Structured_Store_v1.md
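To make the state side of the split concrete, here is a rough sketch using Python's built-in sqlite3 module. Only the table names task_state and task_history come from the design above. Every column name, the `commit_sha` spelling (chosen to avoid the SQL keyword COMMIT), and the `open_store` and `claim` helpers are assumptions, since the design is not yet implemented.

```python
import sqlite3

# Assumed schema: one mutable row per task plus an insert-only history log.
SCHEMA = """
CREATE TABLE task_state (
    task_id      TEXT PRIMARY KEY,
    status       TEXT NOT NULL
                 CHECK (status IN ('todo', 'in_progress', 'blocked', 'done')),
    owner        TEXT,
    completed_at TEXT,
    commit_sha   TEXT
);
CREATE TABLE task_history (
    seq         INTEGER PRIMARY KEY AUTOINCREMENT,
    task_id     TEXT NOT NULL,
    event       TEXT NOT NULL,
    detail      TEXT,
    recorded_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a store and create the assumed schema."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def claim(db: sqlite3.Connection, task_id: str, owner: str) -> None:
    """Flip a todo task to in_progress and log the claim in one transaction."""
    with db:  # commits on success, rolls back if an exception escapes
        cur = db.execute(
            "UPDATE task_state SET status = 'in_progress', owner = ? "
            "WHERE task_id = ? AND status = 'todo'",
            (owner, task_id),
        )
        if cur.rowcount != 1:
            raise ValueError(f"{task_id} is not claimable")
        db.execute(
            "INSERT INTO task_history (task_id, event, detail) "
            "VALUES (?, 'claim', ?)",
            (task_id, owner),
        )
```

The conditional UPDATE is what makes a second claim on the same task fail instead of silently overwriting the owner. In this sketch the history table is append-only by convention only; a real store might enforce that with triggers.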
Three-way reconciliation
The queue is reconciled against git and the execution ledger:
flowchart TB
Q[Agent_Work_Queue.yaml<br/>406 tasks] --> R[Three-way reconciliation]
G[Git history<br/>commits by branch] --> R
EP[Execution_Progress.md<br/>commit-level ledger] --> R
R --> RULE[All three must agree on:<br/>1. task.commit matches a real commit<br/>2. task.completed_at appears in ledger<br/>3. acceptance_checks ran green]
R -.fail.-> ALERT[Inconsistency surfaces<br/>via make queue-git-check]
classDef src fill:#fff3cd,stroke:#332701
class Q,G,EP src
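The three agreement rules can be sketched as a single pass over completed tasks. The input shapes here are assumptions: git history reduced to a set of commit SHAs, the ledger reduced to a mapping from commit to completion date, and acceptance results reduced to a set of task IDs whose checks passed. The real check runs through `make queue-git-check`; `reconcile` is a hypothetical stand-in.

```python
def reconcile(tasks: list, git_commits: set, ledger: dict, green_tasks: set) -> list:
    """Return (task_id, problem) pairs for done tasks where the sources disagree.

    tasks:        parsed queue entries (dicts)
    git_commits:  SHAs known to git                     (assumed shape)
    ledger:       commit SHA -> completed_at date       (assumed shape)
    green_tasks:  ids whose acceptance_checks ran green (assumed shape)
    """
    problems = []
    for task in tasks:
        if task["status"] != "done":
            continue  # only completed tasks carry commit and completed_at
        tid, sha = task["id"], task.get("commit")
        if sha not in git_commits:
            problems.append((tid, "commit not found in git"))
        if ledger.get(sha) != task.get("completed_at"):
            problems.append((tid, "completed_at missing from ledger"))
        if tid not in green_tasks:
            problems.append((tid, "acceptance_checks not green"))
    return problems
```

An empty result means the queue, git, and the ledger agree for every done task under these simplified inputs.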
Archive sweeps
Tasks in the done state accumulate in the active queue. Periodic archive sweeps move them into archive files so that the active queue stays manageable. The archive is reference-only: never re-claim work from an archive file. The structured-store design moves history to SQLite so that these sweeps stop being an operational burden.
Queue ↔ phases ↔ trackers
flowchart LR
Q[Agent Work Queue] --> P[Phases<br/>Implementation_Roadmap.md]
Q --> T[Gap trackers<br/>App Platform, GPU Slice, V3 Migration]
Q --> X[Execution Progress ledger]
P -. task id maps to phase .-> Q
T -. gap maps to one or more tasks .-> Q
X -. commit and completed_at mirror .-> Q
classDef k fill:#d1e7dd,stroke:#0a3622
class P,T,X k
Operator playbook (current)
# 1. validate (always before claim)
make queue-validate
# 2. find eligible work for your role
make queue-list ROLE=A STATUS=todo
# 3. claim a task
make queue-claim ID=A-PROV-SCHED-001 OWNER=A
# → status flips to in_progress, branch agent/A required
# 4. switch to lane worktree
cd .claude/worktrees/A-backend
# 5. do the work; commit normally
# 6. mark complete with commit reference
make queue-set-status ID=A-PROV-SCHED-001 \
STATUS=done \
COMMIT=$(git rev-parse HEAD)
# 7. verify queue ↔ git consistency
make queue-git-check
Where to look next
- Active work queue (live status) — current counts + recent commits
- Multi-agent orchestration — coordinator that consumes the queue
- Lane worktrees — branch + worktree topology
- Policy & enforcement — gates around queue mutations
- Source: Agent_Work_Queue.yaml, Agent_Queue_Structured_Store_v1.md, Task_Authoring_Standard.md