Skip to content

Gap Analysis: Prototype -> Production-Ready Platform

Status

  • Historical comparison artifact retained for context.
  • Many gaps listed here have been closed by the current implementation.
  • Use this file as a prototype-to-target delta reference, not as the live implementation status source.
  • For current state, use Execution_Progress.md, Implementation_Roadmap.md, and governance/Agent_Work_Queue.yaml.

A. Architecture and Boundaries

  • Current: single large backend module mixing auth, billing, provisioning, storage, Stripe, WS.
  • Gap: needs domain boundaries and clearer service/module split.

B. Data Durability and Consistency

  • Current: JSON file persistence with backup snapshots.
  • Gap: migrate to relational DB with transactions, constraints, indexes, and migrations.

C. Security

  • Current:
  • localStorage token storage
  • query token for SSH key download
  • plaintext credentials in env files
  • Gap:
  • secure session strategy
  • token scope and TTL tightening
  • KMS/secret manager and key rotation
  • audited admin/provisioning actions

D. Billing and Payments Integrity

  • Current:
  • periodic cost deduction mutates user balance directly
  • refund route is internal credit, not true Stripe refund
  • Gap:
  • immutable ledger-based accounting
  • reconciliation jobs
  • strict idempotency and compensation paths

E. Realtime and WS Contract

  • Current:
  • one WS endpoint used for terminal
  • frontend also tries to use same endpoint for global notifications without allocId
  • Gap:
  • separate channels/protocols for terminal and notifications
  • explicit schemas and heartbeat/reconnect strategy

F. Provisioning Reliability

  • Current:
  • DB updates and SSH actions can diverge on partial failure
  • Gap:
  • state machine + job queue + retries + compensation rules
  • idempotent provisioning commands

G. Observability and Operations

  • Current: console logging and /healthz only.
  • Gap:
  • structured logs, metrics, tracing
  • alerting and SLOs
  • audit event streams

H. UX and Product Completeness

  • Current:
  • scheduler and storage purchase flows are placeholders
  • long single component UI (App.jsx) with mixed concerns
  • Gap:
  • finalized UX specs and reusable UI architecture
  • end-to-end states and copy standards

I. Code-Verified Prototype Defects (Do Not Replicate)

  • Duplicate function definition:
  • releaseLinuxUserAccess is defined twice in backend/src/server.js (shadowing risk, migration noise).
  • Terminal privilege model coupling:
  • terminal access depends on shared node-admin SSH credentials and sudo -iu <user> behavior on nodes.
  • this creates operational fragility around credential rotation and sudo policy drift.
  • Sync file I/O data access pattern:
  • prototype getDb() reads JSON synchronously on request paths.
  • any equivalent full-state-per-request pattern is non-viable in production.
  • API payload contamination:
  • SKU API responses currently leak UI styling fields (color, bg), forcing weak schema contracts.
  • Naming/behavior mismatch:
  • prototype admin refund route performs internal balance credit, not provider-side Stripe refund.
  1. Data model + Postgres migration + ledger model.
  2. Provisioning/billing state machines and background workers.
  3. Auth/session/security hardening.
  4. WS split and realtime contract cleanup.
  5. Observability + audit logging.
  6. UX cleanup and production-ready interaction patterns.