Status
- Historical comparison artifact retained for context.
- Many gaps listed here have been closed by the current implementation.
- Use this file as a prototype-to-target delta reference, not as the live implementation status source.
- For current state, use Execution_Progress.md, Implementation_Roadmap.md, and governance/Agent_Work_Queue.yaml.
A. Architecture and Boundaries
- Current: a single large backend module mixes auth, billing, provisioning, storage, Stripe integration, and WebSocket (WS) handling.
- Gap: needs explicit domain boundaries and a clearer service/module split.
B. Data Durability and Consistency
- Current: JSON file persistence with backup snapshots.
- Gap: migrate to relational DB with transactions, constraints, indexes, and migrations.
C. Security
- Current:
  - localStorage token storage
  - query token for SSH key download
  - plaintext credentials in env files
- Gap:
  - secure session strategy
  - token scope and TTL tightening
  - KMS/secret manager and key rotation
  - audited admin/provisioning actions
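
The "token scope and TTL tightening" item can be illustrated with a minimal sketch. It assumes an HMAC-signed token that is valid for exactly one scope and expires after a short TTL. The function names (issueToken, verifyToken), the scope string, and the token layout are hypothetical and do not come from the prototype; a real implementation would more likely use a reviewed session or token library.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: issue a signed token bound to one scope with a short TTL.
function issueToken(secret, { sub, scope, ttlSeconds }, nowMs = Date.now()) {
  const payload = { sub, scope, exp: Math.floor(nowMs / 1000) + ttlSeconds };
  const body = Buffer.from(JSON.stringify(payload)).toString("base64url");
  const sig = crypto.createHmac("sha256", secret).update(body).digest("base64url");
  return `${body}.${sig}`;
}

// Hypothetical helper: verify signature first, then expiry, then scope.
function verifyToken(secret, token, requiredScope, nowMs = Date.now()) {
  const [body, sig] = String(token).split(".");
  if (!body || !sig) return { ok: false, reason: "malformed" };

  const expected = crypto.createHmac("sha256", secret).update(body).digest();
  const given = Buffer.from(sig, "base64url");
  // Constant-time comparison; lengths must match before timingSafeEqual is called.
  if (given.length !== expected.length || !crypto.timingSafeEqual(given, expected)) {
    return { ok: false, reason: "bad_signature" };
  }

  // The payload is parsed only after the signature has been verified.
  const payload = JSON.parse(Buffer.from(body, "base64url").toString("utf8"));
  if (payload.exp <= Math.floor(nowMs / 1000)) return { ok: false, reason: "expired" };
  if (payload.scope !== requiredScope) return { ok: false, reason: "wrong_scope" };
  return { ok: true, sub: payload.sub };
}
```

Under these assumptions, a token issued for an SSH key download would be rejected by the terminal endpoint and would stop working once its TTL elapses, which limits the damage if it leaks.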
D. Billing and Payments Integrity
- Current:
  - periodic cost deduction mutates the user balance directly
  - refund route issues an internal credit, not a true Stripe refund
- Gap:
  - immutable ledger-based accounting
  - reconciliation jobs
  - strict idempotency and compensation paths
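
As a rough illustration of the ledger and idempotency items, the sketch below uses an in-memory, append-only ledger in which the balance is derived from entries rather than mutated in place. The Ledger class, its field names, and the key format are hypothetical; a real system would back this with a database table and a unique constraint on the idempotency key.

```javascript
// Hypothetical in-memory stand-in for a ledger table.
class Ledger {
  constructor() {
    this.entries = [];      // append-only; entries are frozen and never edited
    this.byKey = new Map(); // idempotencyKey -> entry
  }

  // Appends one entry. Replaying the same idempotencyKey returns the original
  // entry instead of charging or crediting twice.
  append({ userId, amountCents, kind, idempotencyKey }) {
    if (!Number.isInteger(amountCents)) {
      throw new TypeError("amountCents must be an integer");
    }
    const existing = this.byKey.get(idempotencyKey);
    if (existing) return existing;

    const entry = Object.freeze({
      seq: this.entries.length + 1,
      userId,
      amountCents, // negative for charges, positive for credits
      kind,
      idempotencyKey,
    });
    this.entries.push(entry);
    this.byKey.set(idempotencyKey, entry);
    return entry;
  }

  // The balance is always recomputed from the entries.
  balanceCents(userId) {
    return this.entries
      .filter((e) => e.userId === userId)
      .reduce((sum, e) => sum + e.amountCents, 0);
  }
}
```

In this model a compensation path is a new reversing entry rather than an edit to an existing one, so a reconciliation job can compare the full entry history against provider records.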
E. Realtime and WS Contract
- Current:
  - one WS endpoint used for the terminal
  - frontend also tries to use the same endpoint for global notifications without allocId
- Gap:
  - separate channels/protocols for terminal and notifications
  - explicit schemas and heartbeat/reconnect strategy
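
One way to picture "explicit schemas" and a reconnect strategy is the sketch below. It assumes a JSON envelope with a channel discriminator, where terminal messages must carry allocId and notification messages must not depend on it. The channel names, field names, message types, and backoff parameters are all hypothetical; whether the two channels live on separate endpoints or share one socket is left open here.

```javascript
// Hypothetical validator for incoming client messages.
function parseClientMessage(raw) {
  let msg;
  try {
    msg = JSON.parse(raw);
  } catch {
    return { ok: false, error: "invalid_json" };
  }
  if (msg === null || typeof msg !== "object") {
    return { ok: false, error: "invalid_envelope" };
  }

  switch (msg.channel) {
    case "terminal":
      // Terminal traffic is always tied to one allocation.
      if (typeof msg.allocId !== "string" || msg.allocId === "") {
        return { ok: false, error: "missing_alloc_id" };
      }
      if (typeof msg.data !== "string") {
        return { ok: false, error: "invalid_payload" };
      }
      return { ok: true, message: { channel: "terminal", allocId: msg.allocId, data: msg.data } };

    case "notifications":
      // Global notifications need no allocId; only a small set of types is accepted.
      if (msg.type !== "subscribe" && msg.type !== "ping") {
        return { ok: false, error: "invalid_type" };
      }
      return { ok: true, message: { channel: "notifications", type: msg.type } };

    default:
      return { ok: false, error: "unknown_channel" };
  }
}

// Client-side reconnect delay: exponential backoff with an upper bound.
function reconnectDelayMs(attempt, baseMs = 500, capMs = 30000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

Rejecting a terminal message that lacks allocId at the schema level would make the current notification misuse fail loudly instead of silently.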
F. Provisioning Reliability
- Current:
  - DB updates and SSH actions can diverge on partial failure
- Gap:
  - state machine + job queue + retries + compensation rules
  - idempotent provisioning commands
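
The state machine, retry, and compensation items could be sketched as follows. The state names, the transition table, and the job shape (attempts, maxAttempts) are assumptions made for illustration, not the target design. The sketch covers only the bookkeeping a queue worker would do after each provisioning attempt; the SSH step itself is out of scope.

```javascript
// Hypothetical allocation states and the transitions allowed between them.
const TRANSITIONS = {
  requested: ["provisioning"],
  provisioning: ["active", "failed"],
  failed: ["provisioning", "compensating"],
  compensating: ["rolled_back"],
  active: [],
  rolled_back: [],
};

// Returns a new job object in the next state, or throws on an illegal move.
function transition(job, next) {
  const allowed = TRANSITIONS[job.state] || [];
  if (!allowed.includes(next)) {
    throw new Error(`illegal transition: ${job.state} -> ${next}`);
  }
  return { ...job, state: next };
}

// Called by a worker after each provisioning attempt. A failure either
// re-queues the job for another attempt or, once attempts are exhausted,
// hands it to the compensation path.
function recordAttempt(job, succeeded) {
  if (succeeded) return transition(job, "active");
  const failed = transition({ ...job, attempts: job.attempts + 1 }, "failed");
  const next = failed.attempts < failed.maxAttempts ? "provisioning" : "compensating";
  return transition(failed, next);
}
```

Because every state change goes through transition, a partial failure leaves the job in an explicit failed or compensating state rather than letting the stored record and the node silently diverge.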
G. Observability and Operations
- Current: console logging and /healthz only.
- Gap:
  - structured logs, metrics, tracing
  - alerting and SLOs
  - audit event streams
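
For the structured logging item, a minimal sketch might emit one JSON object per line with a fixed set of reserved fields. The function name, the field names (ts, level, event), and the redaction list are hypothetical; metrics, tracing, and alerting are not covered.

```javascript
// Hypothetical list of field names whose values should never reach logs.
const REDACTED_KEYS = new Set(["password", "token", "authorization"]);

// Builds one JSON log line. Reserved fields (ts, level, event) are written
// last so caller-supplied fields cannot overwrite them.
function formatLogLine(level, event, fields = {}, now = new Date()) {
  const safe = {};
  for (const [key, value] of Object.entries(fields)) {
    safe[key] = REDACTED_KEYS.has(key.toLowerCase()) ? "[redacted]" : value;
  }
  return JSON.stringify({ ...safe, ts: now.toISOString(), level, event });
}
```

A call such as formatLogLine("info", "allocation.released", { allocId: "a1" }) yields a line that log tooling can filter by event name, and the same shape could plausibly feed an audit event stream.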
H. UX and Product Completeness
- Current:
  - scheduler and storage purchase flows are placeholders
  - a single long UI component (App.jsx) with mixed concerns
- Gap:
  - finalized UX specs and reusable UI architecture
  - end-to-end states and copy standards
I. Code-Verified Prototype Defects (Do Not Replicate)
- Duplicate function definition:
  - releaseLinuxUserAccess is defined twice in backend/src/server.js (shadowing risk, migration noise).
- Terminal privilege model coupling:
  - terminal access depends on shared node-admin SSH credentials and sudo -iu <user> behavior on nodes.
  - this creates operational fragility around credential rotation and sudo policy drift.
- Sync file I/O data access pattern:
  - prototype getDb() reads JSON synchronously on request paths.
  - any equivalent full-state-per-request pattern is non-viable in production.
- API payload contamination:
  - SKU API responses currently leak UI styling fields (color, bg), forcing weak schema contracts.
- Naming/behavior mismatch:
  - prototype admin refund route performs internal balance credit, not provider-side Stripe refund.
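
The payload contamination defect could be avoided with an explicit allowlist at the API boundary, sketched below. Only color and bg are named in the prototype; the allowlisted field names (id, name, pricePerHourCents) and the function name are hypothetical.

```javascript
// Hypothetical allowlist of fields that belong in a SKU API response.
const SKU_API_FIELDS = ["id", "name", "pricePerHourCents"];

// Copies only allowlisted fields, so styling fields such as color and bg
// never reach the response, and fails fast if a required field is missing.
function toSkuResponse(sku) {
  const out = {};
  for (const field of SKU_API_FIELDS) {
    if (!(field in sku)) {
      throw new Error(`SKU is missing required field: ${field}`);
    }
    out[field] = sku[field];
  }
  return out;
}
```

An allowlist, rather than a blocklist of known styling fields, keeps the contract strict even if new presentation fields are later added to the stored SKU records.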
Recommended Hardening Order
1. Data model + Postgres migration + ledger model.
2. Provisioning/billing state machines and background workers.
3. Auth/session/security hardening.
4. WS split and realtime contract cleanup.
5. Observability + audit logging.
6. UX cleanup and production-ready interaction patterns.