Tech debt register¶
Designed
Source: doc/governance/Fallback_Tech_Debt_Register.md
The platform tracks every runtime fallback that could mask defects or create security/operability risk. Each entry has an explicit retirement target.
Policy¶
```mermaid
flowchart LR
    BUG[Defect or risk] --> FIX{Root-cause fix<br/>in owning layer<br/>possible NOW?}
    FIX -- yes --> ROOT[Apply root-cause fix<br/>no debt incurred]
    FIX -- no --> FALL[Explicit, bounded<br/>fallback]
    FALL --> REG[Register in<br/>Fallback_Tech_Debt_Register.md]
    REG --> PLAN[Retirement plan:<br/>owner + target phase or date]
    PLAN --> CI[CI ensures fallback<br/>cannot expand silently]
    classDef ok fill:#d1e7dd,stroke:#0a3622
    classDef risk fill:#fff3cd,stroke:#332701
    class ROOT ok
    class FALL risk
```
Rules from the register:
- A root-cause fix in the owning layer is mandatory.
- Any remaining fallback must be explicit, bounded, and tracked here with owner + target date/phase.
- No new fallbacks land without an entry.
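The "no entry, no fallback" rule lends itself to a mechanical check. The following is a minimal sketch in Go, assuming the register has already been parsed into rows and that discovery yields a list of file paths. `Entry`, `unregistered`, and the second path in `main` are hypothetical names for illustration, not the platform's actual CI code.

```go
package main

import (
	"fmt"
	"sort"
)

// Entry is a hypothetical, simplified register row.
type Entry struct {
	Type     string // "config-default", "runtime-compat", or "risk"
	Location string // file that contains the fallback
	Owner    string
	Target   string // target date or phase
}

// unregistered returns, sorted and de-duplicated, every discovered
// fallback location that has no matching register entry.
func unregistered(discovered []string, register []Entry) []string {
	known := make(map[string]bool, len(register))
	for _, e := range register {
		known[e.Location] = true
	}
	seen := make(map[string]bool)
	var missing []string
	for _, loc := range discovered {
		if !known[loc] && !seen[loc] {
			seen[loc] = true
			missing = append(missing, loc)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	register := []Entry{{
		Type:     "risk",
		Location: "packages/services/terminal/proxy.go",
		Owner:    "Backend (A)",
		Target:   "Pre-MVP cleanup sprint",
	}}
	discovered := []string{
		"packages/services/terminal/proxy.go",
		"packages/example/new_fallback.go", // hypothetical path
	}
	for _, loc := range unregistered(discovered, register) {
		fmt.Println("unregistered fallback:", loc)
	}
}
```

A CI job built on this idea would exit non-zero whenever the returned slice is non-empty, which is one way to keep fallbacks from expanding silently.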
Discovery + triage¶
The register is kept honest by a discovery command and a triage rule:
```mermaid
flowchart TB
    DISC["Discovery command<br/>rg 'fallback|legacy|noop' in packages cmd<br/>excluding *_test.go"]
    DISC --> TRI{Triage}
    TRI -- config-default --> T1[OK if fail-closed<br/>semantics remain]
    TRI -- runtime-compat --> T2[Allowed if explicitly temporary<br/>and documented]
    TRI -- risk --> T3[Fallback can hide failures<br/>or weaken security posture<br/>MUST be retired]
    classDef ok fill:#d1e7dd,stroke:#0a3622
    classDef temp fill:#fff3cd,stroke:#332701
    classDef risk fill:#f8d7da,stroke:#42101e
    class T1 ok
    class T2 temp
    class T3 risk
```
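The discovery and triage steps can be approximated in Go for readers who want to embed them in tooling. This is a sketch under stated assumptions: `scanSource`, `Hit`, and `verdict` are hypothetical names, the matching is case-sensitive like `rg` without flags, and the "not ok" verdicts are an interpretation of the diagram rather than wording taken from the register.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// pattern mirrors the search expression of the discovery command.
var pattern = regexp.MustCompile(`fallback|legacy|noop`)

// Hit is one matching source line.
type Hit struct {
	Path string
	Line int
	Text string
}

// scanSource returns the matching lines of one file. Files ending in
// _test.go are skipped, as in the discovery command.
func scanSource(path, src string) []Hit {
	if strings.HasSuffix(path, "_test.go") {
		return nil
	}
	var hits []Hit
	sc := bufio.NewScanner(strings.NewReader(src))
	for n := 1; sc.Scan(); n++ {
		if pattern.MatchString(sc.Text()) {
			hits = append(hits, Hit{Path: path, Line: n, Text: strings.TrimSpace(sc.Text())})
		}
	}
	return hits
}

// Triage is one of the three classes from the diagram.
type Triage string

const (
	ConfigDefault Triage = "config-default"
	RuntimeCompat Triage = "runtime-compat"
	Risk          Triage = "risk"
)

// verdict encodes the triage rule for a single finding.
func verdict(t Triage, failClosed, temporaryAndDocumented bool) string {
	switch t {
	case ConfigDefault:
		if failClosed {
			return "ok"
		}
		return "not ok: fail-closed semantics lost"
	case RuntimeCompat:
		if temporaryAndDocumented {
			return "allowed"
		}
		return "not ok: must be explicitly temporary and documented"
	case Risk:
		return "must be retired"
	default:
		return "untriaged"
	}
}

func main() {
	src := "x := cfg.Value\n// legacy path kept for old clients\n"
	for _, h := range scanSource("cmd/api/main.go", src) {
		fmt.Printf("%s:%d: %s\n", h.Path, h.Line, h.Text)
	}
	fmt.Println(verdict(Risk, false, false))
}
```

Keeping the scan a pure function over file contents makes it easy to test; a real tool would feed it files found by walking `packages` and `cmd`.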
Active high-priority debt¶
```mermaid
flowchart TB
    classDef risk fill:#f8d7da,stroke:#42101e
    classDef compat fill:#fff3cd,stroke:#332701
    D1[1. Terminal legacy SSH key-source<br/>compatibility chain]:::risk
    D2[2. Provisioning worker lazy POSIX<br/>identity creation]:::compat
    D3[3. API runbook catalog fallback bundle]:::compat
```
1. Terminal legacy SSH key-source compatibility chain¶
| Property | Value |
|---|---|
| Type | risk |
| Location | packages/services/terminal/proxy.go |
| Why risky | Multiple env fallback paths (TERMINAL_* → PROVISIONING_*) and legacy key loading increase misconfiguration surface |
| Target state | Single terminal credential source contract for the active mode only |
| Owner | Backend (A) |
| Target | Pre-MVP cleanup sprint (A-CLEAN-001 / follow-up) |
2. Provisioning worker lazy POSIX identity creation¶
| Property | Value |
|---|---|
| Type | runtime-compat |
| Location | packages/services/provisioning/worker/service.go |
| Why risky | Worker writes identity if onboarding path misses it. Useful as guardrail but can hide upstream onboarding regression |
| Target state | Auth onboarding is primary creator; worker guardrail retained with metric/alert |
| Owner | Backend (A) |
| Target | Keep as guardrail; add alert + runbook in ops hardening |
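A guardrail that is retained should at least be observable. The sketch below shows one way to count lazy creations in Go so that an alert can fire when the upstream onboarding path regresses. `IdentityStore`, `ensureIdentity`, and the counter `lazyIdentityCreates` are hypothetical names, and the in-memory map stands in for whatever the worker really writes to.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// lazyIdentityCreates counts how often the worker guardrail had to
// create an identity that onboarding should have created already.
var lazyIdentityCreates atomic.Int64

// IdentityStore is a hypothetical in-memory stand-in: username -> POSIX UID.
type IdentityStore map[string]int

// ensureIdentity returns the existing UID, or creates one with nextUID
// and records the event so the fallback cannot stay invisible.
func ensureIdentity(store IdentityStore, user string, nextUID int) (int, bool) {
	if uid, ok := store[user]; ok {
		return uid, false
	}
	store[user] = nextUID
	lazyIdentityCreates.Add(1)
	return nextUID, true
}

func main() {
	store := IdentityStore{"alice": 1000}
	for _, u := range []string{"alice", "bob"} {
		uid, created := ensureIdentity(store, u, 2000)
		fmt.Printf("%s uid=%d lazily-created=%v\n", u, uid, created)
	}
	fmt.Println("lazy creations:", lazyIdentityCreates.Load())
}
```

Any non-zero rate on such a counter points at a missed onboarding write, which is the signal the alert and runbook in the target column would act on.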
3. API runbook catalog fallback bundle¶
| Property | Value |
|---|---|
| Type | runtime-compat |
| Location | cmd/api/main.go, cmd/api/admin_runbooks.go |
| Why risky | Fallback catalog can drift from real runbook set and hide config/package errors |
| Target state | Single source of truth for runbook catalog; no in-process fallback |
| Owner | Backend (A) |
| Target | Tracked as separate task; retirement when packaging path stabilises |
Debt-retirement lifecycle¶
```mermaid
stateDiagram-v2
    [*] --> introduced: explicit fallback added in PR
    introduced --> tracked: entry added to register
    tracked --> hardening: owner adds alert/metric/runbook
    hardening --> retiring: root-cause fix lands
    retiring --> retired: fallback code removed
    retired --> [*]: register entry archived
    note right of tracked
        Entry required:
        type, location, why risky,
        target state, owner, target phase
    end note
    note right of retired
        Must verify:
        - no production references
        - no test references
        - register entry moved to "retired" section
    end note
```
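The lifecycle is a strictly linear state machine, which makes it straightforward to validate in tooling. The following Go sketch, with hypothetical names (`State`, `advance`, `RetirementCheck`), encodes the allowed transitions and the three "must verify" conditions from the diagram.

```go
package main

import "fmt"

// State is a lifecycle stage from the diagram.
type State string

const (
	Introduced State = "introduced"
	Tracked    State = "tracked"
	Hardening  State = "hardening"
	Retiring   State = "retiring"
	Retired    State = "retired"
)

// next lists the single legal successor of each non-terminal state.
var next = map[State]State{
	Introduced: Tracked,
	Tracked:    Hardening,
	Hardening:  Retiring,
	Retiring:   Retired,
}

// advance reports an error for any transition the diagram does not allow.
func advance(from, to State) error {
	if want, ok := next[from]; !ok || want != to {
		return fmt.Errorf("illegal transition %s -> %s", from, to)
	}
	return nil
}

// RetirementCheck mirrors the "must verify" note on the retired state.
type RetirementCheck struct {
	ProductionRefs int  // remaining references in production code
	TestRefs       int  // remaining references in tests
	EntryArchived  bool // entry moved to the "retired" section
}

// Passed is true only when all three conditions hold.
func (c RetirementCheck) Passed() bool {
	return c.ProductionRefs == 0 && c.TestRefs == 0 && c.EntryArchived
}

func main() {
	fmt.Println(advance(Tracked, Hardening))  // <nil>
	fmt.Println(advance(Introduced, Retired)) // illegal transition introduced -> retired
	fmt.Println(RetirementCheck{TestRefs: 2, EntryArchived: true}.Passed())
}
```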
Why fail-closed matters¶
```mermaid
flowchart LR
    INCIDENT[Production incident<br/>upstream service unavailable] --> Q{Fallback semantics}
    Q -- fail-open --> FO[Continue running with<br/>silently degraded behavior<br/>tenant sees inconsistent state]
    Q -- fail-closed --> FC[Stop / 503 / reject<br/>operator sees the failure<br/>fixes upstream]
    classDef bad fill:#f8d7da,stroke:#42101e
    classDef good fill:#d1e7dd,stroke:#0a3622
    class FO bad
    class FC good
```
Every fallback in the register must be fail-closed. A fallback that lets bad data through (for example, one that silently rounds money, accepts unsigned task parameters, or skips audit) is rejected at review.
Adjacent disciplines¶
These standards work together with the debt register:
| Source | Rule that interacts with debt |
|---|---|
| Coding_Standards.md §12 | Root-cause-first remediation: no symptom-only fixes |
| Coding_Standards.md §14 | 5xx classification: distinguish upstream vs local defect |
| Testing_Standards.md §Evidence-First | Every change has direct proof; previously-passing checks failing = regression |
| Assumptions register | Same precept: explicit + re-validation triggered |
Where to look next¶
- Assumptions register — the related "explicit non-defaults" tracker
- Coding patterns — root-cause-first remediation rule
- Threat model — security side of fail-closed
- Source: Fallback_Tech_Debt_Register.md