Evidence-First Change Protocol

Status:

  • governance guidance
  • applies to both human contributors and coding agents

Goal

This project prefers evidence over intuition when changing behavior.

The purpose of this protocol is to:

  • reduce regressions from assumption-driven changes
  • make debugging and rollout decisions traceable
  • keep changes small enough to verify quickly
  • require explicit proof before marking work complete

This is a pragmatic protocol for GPUaaS. It is not intended to force a full-repo test run for every minor edit.

Core Rule

If you do not have evidence to justify a code or operational change, your next action must produce that evidence.

Examples of acceptable evidence:

  • an existing unit or integration test result
  • a focused build or typecheck result
  • a live API or UI read-model check
  • a rollout or runtime probe
  • a schema or contract inspection
  • a reproducible failing case and a verified post-fix result

Examples of weak evidence:

  • “this should work”
  • “the code looks right”
  • “it probably uses the same pattern as elsewhere”

Required Working Cycle

For any non-trivial change, follow this loop:

  1. Measure the current state.
     • Identify the smallest relevant baseline.
     • Prefer targeted scope over full-repo ritual.
     • Record what currently passes, fails, or is missing.

  2. Define the smallest verifiable change.
     • Keep the change boundary narrow.
     • State what you expect to observe if the change is correct.

  3. Make one coherent change.
     • Avoid mixing unrelated fixes in the same verification cycle.
     • If multiple layers must change together, keep the set minimal and explicit.

  4. Verify against the same baseline.
     • Re-run the same checks that established the baseline.
     • Add one new check that proves the intended behavior changed.

  5. Compare before and after.
     • Note what improved.
     • Note what stayed unchanged.
     • If anything regressed, stop and fix the regression before moving on.
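The loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `run_checks`, `compare`, the `Comparison` type, and the toy `state` dictionary are hypothetical names invented here, not project tooling, and each check is modeled as a plain function returning `True` or `False`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Check = Callable[[], bool]


def run_checks(checks: Dict[str, Check]) -> Dict[str, bool]:
    """Run every named check and record whether it passed."""
    return {name: bool(check()) for name, check in checks.items()}


@dataclass
class Comparison:
    improved: List[str]
    unchanged: List[str]
    regressed: List[str]


def compare(before: Dict[str, bool], after: Dict[str, bool]) -> Comparison:
    """Classify checks by how their result moved between two runs."""
    improved = sorted(n for n in after if after[n] and not before.get(n, False))
    regressed = sorted(n for n in before if before[n] and not after.get(n, False))
    unchanged = sorted(n for n in before if n in after and before[n] == after[n])
    return Comparison(improved, unchanged, regressed)


# Toy system under change (hypothetical): which key a list is ordered by.
state = {"order_key": "uuid"}
checks = {
    "list_returns_rows": lambda: True,
    "orders_by_requested_at": lambda: state["order_key"] == "requested_at",
}

before = run_checks(checks)          # step 1: measure the current state
state["order_key"] = "requested_at"  # step 3: make one coherent change
after = run_checks(checks)           # step 4: re-run the same checks
result = compare(before, after)      # step 5: compare before and after

if result.regressed:
    raise SystemExit(f"regression, stop and fix: {result.regressed}")
```

Running the same `checks` mapping before and after the change is what makes the comparison meaningful; a check that exists only in the second run is counted as improved when it passes.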

Baseline Expectations

The baseline must match the risk of the change.

Examples:

  • contract or schema change:
     • relevant OpenAPI/SQL inspection
     • targeted service tests
     • integration path if persistence changed
  • authz or billing change:
     • targeted unit/integration coverage is required
     • explicit regression proof is required
  • UI-only change:
     • targeted web tests and/or page-level verification
  • MAAS or provisioning change:
     • targeted service tests
     • integration check where practical
     • live read-model or workflow verification before claiming success

Do not default to “run everything” unless the scope truly justifies it.
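One way to keep the baseline proportional to risk is to look it up from the kind of change instead of deciding ad hoc. The sketch below assumes a hypothetical `BASELINES` table and `baseline_for` helper; the keys and check labels are paraphrased from the examples above and are not names the project defines.

```python
from typing import Dict, Iterable, List

# Hypothetical labels, paraphrased from the examples in this section.
BASELINES: Dict[str, List[str]] = {
    "contract_or_schema": [
        "openapi_or_sql_inspection",
        "targeted_service_tests",
        "integration_path_if_persistence_changed",
    ],
    "authz_or_billing": [
        "targeted_unit_or_integration_tests",
        "explicit_regression_proof",
    ],
    "ui_only": [
        "targeted_web_tests_or_page_verification",
    ],
    "maas_or_provisioning": [
        "targeted_service_tests",
        "integration_check_where_practical",
        "live_read_model_or_workflow_verification",
    ],
}


def baseline_for(change_kinds: Iterable[str]) -> List[str]:
    """Return the de-duplicated checks for the kinds of change being made."""
    checks: List[str] = []
    for kind in change_kinds:
        # An unknown kind raises KeyError on purpose: classify the change first.
        for check in BASELINES[kind]:
            if check not in checks:
                checks.append(check)
    return checks


plan = baseline_for(["ui_only"])
combined = baseline_for(["contract_or_schema", "maas_or_provisioning"])
```

A change that spans two kinds gets the union of both baselines, which is still narrower than running everything.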

Prediction Requirement

Before running verification, state the expected outcome in concrete terms.

Good:

  • “the MAAS site policy response should now include fabric_mode”
  • “the scheduler page should return 200 for a workspace owner”
  • “the decommission list should order by requested_at, not UUID”

Bad:

  • “this should be better”
  • “the fix should work now”
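A concrete prediction is one that can be written down as a testable condition before the check runs. The following is a minimal sketch, assuming a hypothetical `Prediction` type and an `observed` dictionary that stands in for whatever a real probe would return.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Prediction:
    statement: str                          # the expectation, in plain words
    holds: Callable[[Dict[str, Any]], bool]  # the same expectation, testable


def evaluate(prediction: Prediction, observed: Dict[str, Any]) -> str:
    """Compare an observation against a prediction stated beforehand."""
    verdict = "PASS" if prediction.holds(observed) else "FAIL"
    return f"{verdict}: {prediction.statement} (observed: {observed})"


# Stated before verification runs.
prediction = Prediction(
    statement="the scheduler page should return 200 for a workspace owner",
    holds=lambda obs: obs.get("status") == 200,
)

# Stand-in for a real probe result; the field names are assumptions.
observed = {"role": "workspace_owner", "status": 200}
report = evaluate(prediction, observed)
```

A statement such as “this should be better” cannot be given a `holds` function at all, which is a quick test for whether a prediction is concrete enough.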

Failure Handling

Unexpected failures are evidence.

Required response:

  • record what was expected
  • record what actually happened
  • identify the owning layer if possible
  • either fix that layer or mark the work blocked on it

Do not explain away failing evidence because it contradicts a theory.
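The required response maps naturally onto a small record that refuses to be created without the expected and actual outcomes. This is a sketch only: `FailureRecord` and its field names are hypothetical, and the example values are illustrative, not taken from a real incident.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FailureRecord:
    expected: str
    actual: str
    owning_layer: Optional[str]  # None when the owner is not yet identified
    resolution: str              # "fixed" or "blocked"

    def __post_init__(self) -> None:
        if not self.expected.strip() or not self.actual.strip():
            raise ValueError("record both what was expected and what happened")
        if self.resolution not in ("fixed", "blocked"):
            raise ValueError("resolution must be 'fixed' or 'blocked'")

    def summary(self) -> str:
        layer = self.owning_layer or "unidentified layer"
        return (
            f"expected {self.expected!r}, got {self.actual!r}; "
            f"{layer}: {self.resolution}"
        )


# Illustrative values only.
record = FailureRecord(
    expected="decommission list ordered by requested_at",
    actual="decommission list ordered by UUID",
    owning_layer="list query layer (hypothetical)",
    resolution="blocked",
)
line = record.summary()
```

Restricting `resolution` to two values leaves no slot for "ignored" or "probably unrelated", which is the point of the rule above.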

Completion Standard

Do not mark work complete unless:

  • the baseline checks for the touched area still pass
  • the changed behavior has at least one direct proof
  • any residual risk or unverified area is explicitly stated

“Compiled” is not enough.

“Looks correct” is not enough.

“Passed the targeted checks and demonstrated the changed behavior” is the minimum bar.
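The completion standard can be read as a gate with three conditions. The sketch below assumes a hypothetical `completion_gate` function; passing `None` for `residual_risks` models "nobody stated the risk", while an empty list models "stated, none known".

```python
from typing import Dict, List, Optional, Tuple


def completion_gate(
    baseline_results: Dict[str, bool],
    direct_proof_passed: bool,
    residual_risks: Optional[List[str]],
) -> Tuple[bool, List[str]]:
    """Return (ready, blockers) for marking a piece of work complete."""
    blockers: List[str] = []

    failing = sorted(name for name, ok in baseline_results.items() if not ok)
    if failing:
        blockers.append("baseline checks failing: " + ", ".join(failing))

    if not direct_proof_passed:
        blockers.append("no direct proof of the changed behavior")

    if residual_risks is None:
        blockers.append("residual risk not stated (use [] for 'none known')")

    return (not blockers, blockers)


# Targeted checks pass, the new behavior was demonstrated, risk was stated.
ready, blockers = completion_gate({"targeted_service_tests": True}, True, [])

# "Compiled" alone: the typecheck passes but nothing proves the new behavior.
compiled_only, reasons = completion_gate({"typecheck": True}, False, None)
```

The second call returns two blockers even though every recorded check passed, which mirrors the statement that “Compiled” is not enough.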

Relationship To Other Governance Docs

Explicit Non-Goals

This protocol does not require:

  • running the full repo test suite for every tiny change
  • deliberate breakage as a ritual step
  • rewriting approach or architecture without evidence

The project goal is disciplined, incremental, verifiable change, not ceremony.