
Contract surface

Contract Implemented

Source: doc/api/openapi.draft.yaml (33,132 lines) · doc/api/asyncapi.draft.yaml (2,296 lines) · doc/architecture/db_schema_v1.sql (2,574 lines)

GPUaaS is contract-first. Every API or event change starts in the contract under doc/api/. Code follows the spec, never the reverse. This page surfaces the contract inventory at a glance.

Contract files

| Surface | Path | Lines | Authority |
| --- | --- | --- | --- |
| REST API | doc/api/openapi.draft.yaml | 33,132 | Source of truth for all HTTP endpoints |
| Domain OpenAPI fragments | doc/api/openapi/* | | Per-domain authoring manifests |
| AsyncAPI (events) | doc/api/asyncapi.draft.yaml | 2,296 | Source of truth for NATS subjects + payloads |
| Domain AsyncAPI fragments | doc/api/asyncapi/* | | Per-domain authoring manifests |
| Physical schema | doc/architecture/db_schema_v1.sql | 2,574 | Source of truth for tables, indexes, constraints |
| API surface index | doc/api/API_Surface.md | | Human-readable endpoint catalog |

REST API — top-level groupings

mindmap
  root((REST surface))
    Public user
      ("auth/oidc/*")
      ("auth/personal/*")
      ("auth/token/refresh")
      catalog
      allocations
      ("allocations/{id}/terminal-token")
      ("allocations/{id}/release")
      ("billing/balance")
      ("billing/usage")
      ("billing/refunds")
      ("payments/checkout")
      ("storage/*")
      ("apps/instances")
      ("ssh-keys")
      projects
    Internal
      ("internal/v1/nodes/{id}/tasks/wait")
      ("internal/v1/nodes/{id}/tasks/{id}/result")
      ("internal/v1/guest-telemetry")
    Admin
      ("admin/users")
      ("admin/nodes")
      ("admin/nodes/{id}/slice-topology/discovery")
      ("admin/allocations")
      ("admin/allocations/{id}/force-release")
      ("admin/refunds")
      ("admin/payment-sessions")
      ("admin/audit-logs")
      ("admin/ops/overview")
      ("admin/runbooks")
      ("admin/os-images")
    Stripe
      ("payments/webhook")
    V3 read models
      ("v3/compute/*")
      ("v3/access/*")
      ("v3/platform/*")
      ("v3/workloads/*")

AsyncAPI — streams and subjects

NATS JetStream streams and the subjects each one captures (see InitStreams() in packages/shared/events):

flowchart LR
    subgraph S1[PROVISIONING stream]
      P1[provisioning.requested]
      P2[provisioning.active]
      P3[provisioning.failed]
      P4[provisioning.releasing.requested]
      P5[provisioning.releasing.completed]
      P6[provisioning.release_failed]
      P7[provisioning.force_release_requested]
    end

    subgraph S2[BILLING stream]
      B1[billing.low_balance_warning]
      B2[billing.auto_release_pending]
      B3[billing.balance_depleted]
    end

    subgraph S3[PAYMENTS stream]
      Y1[payments.balance_credited]
    end

    subgraph S4[DLQ stream]
      D1["dlq.*"]
    end

    classDef stream fill:#e3f2fd,stroke:#1565c0
    class S1,S2,S3,S4 stream

| Stream | Subject pattern | Primary producers | Primary consumers |
| --- | --- | --- | --- |
| PROVISIONING | provisioning.> | orchestrator, provisioning-worker, billing-worker | provisioning-worker, billing-worker, notification-relay |
| BILLING | billing.> | billing-worker | notification-relay, billing-worker |
| PAYMENTS | payments.> | payments service / webhook-worker | billing-worker, notification-relay |
| DLQ | dlq.> | (consumers on poison messages) | ops alerts |

Full per-subject consumer matrix: NATS subjects reference.
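To make the subject patterns above concrete, here is a minimal sketch of NATS-style wildcard matching, where `*` matches exactly one token and `>` matches one or more trailing tokens. The helper names `subject_matches` and `stream_for` are hypothetical and are not part of packages/shared/events; the patterns are copied from the stream table.

```python
from typing import Optional

# Stream patterns copied from the stream table on this page.
STREAM_PATTERNS = {
    "PROVISIONING": "provisioning.>",
    "BILLING": "billing.>",
    "PAYMENTS": "payments.>",
    "DLQ": "dlq.>",
}


def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if a subject matches a NATS-style wildcard pattern.

    '*' matches exactly one token; '>' matches one or more trailing tokens.
    """
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' is only valid as the last token and needs at least one token left.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if p != "*" and p != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)


def stream_for(subject: str) -> Optional[str]:
    """Return the name of the first stream whose pattern captures the subject."""
    for stream, pattern in STREAM_PATTERNS.items():
        if subject_matches(pattern, subject):
            return stream
    return None
```

Under these assumptions, `stream_for("provisioning.releasing.requested")` resolves to PROVISIONING, while a bare `provisioning` subject matches no stream because `>` requires at least one further token.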

Envelope shape

Every event uses this envelope (defined in AsyncAPI):

{
  "event_id": "uuid",
  "event_type": "domain.name",
  "occurred_at": "RFC3339 timestamp",
  "version": "1.0",
  "correlation_id": "uuid",
  "payload": { /* type-specific */ }
}

Source: doc/api/asyncapi.draft.yaml #/components/messages/Envelope.
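As an illustration of the envelope, the following sketch builds and loosely checks one in Python. It assumes all six fields are required and that version stays at "1.0"; the authoritative rules live in the AsyncAPI file. The function names and the `allocation_id` payload key used in the usage note are hypothetical.

```python
import uuid
from datetime import datetime, timezone

# Assumption: every field shown in the envelope is required.
REQUIRED_FIELDS = (
    "event_id", "event_type", "occurred_at",
    "version", "correlation_id", "payload",
)


def new_envelope(event_type, payload, correlation_id=None):
    """Build an envelope dict; a fresh correlation_id is generated if none is given."""
    return {
        "event_id": str(uuid.uuid4()),
        "event_type": event_type,
        # UTC timestamp with an explicit offset, e.g. 2024-01-01T12:00:00.000000+00:00
        "occurred_at": datetime.now(timezone.utc).isoformat(),
        "version": "1.0",
        "correlation_id": correlation_id or str(uuid.uuid4()),
        "payload": payload,
    }


def validate_envelope(envelope):
    """Return a list of problems; an empty list means the envelope looks well-formed."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in envelope]
    if problems:
        return problems
    for name in ("event_id", "correlation_id"):
        try:
            uuid.UUID(envelope[name])
        except (ValueError, AttributeError, TypeError):
            problems.append(f"{name} is not a UUID")
    try:
        # Approximate check: fromisoformat is more lenient than strict RFC 3339.
        datetime.fromisoformat(envelope["occurred_at"].replace("Z", "+00:00"))
    except (ValueError, AttributeError):
        problems.append("occurred_at is not a parseable timestamp")
    if not isinstance(envelope["payload"], dict):
        problems.append("payload must be an object")
    return problems
```

For example, `new_envelope("provisioning.requested", {"allocation_id": "a-1"})` produces an envelope that passes `validate_envelope`, and removing any field is reported as a missing field.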

Error envelope

Every REST error response uses ErrorResponse from packages/shared/errors:

{
  "code": "<catalog_code>",
  "message": "human text",
  "correlation_id": "uuid",
  "details": { /* required for validation_error */ }
}

correlation_id is required in every error response. The full catalog of valid code values: Error codes reference.
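The two rules stated here (correlation_id always present, details required for validation_error) can be sketched as a small constructor. This is an illustration only, not the ErrorResponse type in packages/shared/errors; the function name and the shape of `details` are hypothetical, and it does not check `code` against the catalog.

```python
def error_response(code, message, correlation_id, details=None):
    """Build an error body, enforcing the two rules stated on this page."""
    if not correlation_id:
        raise ValueError("correlation_id is required in every error response")
    if code == "validation_error" and not details:
        raise ValueError("details is required when code is validation_error")
    body = {
        "code": code,
        "message": message,
        "correlation_id": correlation_id,
    }
    # details is omitted entirely when the caller does not supply it.
    if details is not None:
        body["details"] = details
    return body
```

A handler would typically pass through the correlation ID of the incoming request, so that the error body can be matched to traces and logs.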

Contract-first workflow

flowchart LR
    A[Open issue /<br/>need new endpoint or event] --> B[Edit openapi/<br/>asyncapi draft]
    B --> C[Run spectral lint<br/>+ contract breaking-change script]
    C --> D[Generate Go/TS types<br/>scripts/codegen.sh]
    D --> E[Write unit + handler tests<br/>before service code]
    E --> F[Implement handler<br/>+ service function]
    F --> G[Integration test<br/>against real Postgres / Redis / NATS]
    G --> H[CI gates<br/>contracts_validate, audit_presence,<br/>canonical_error, observability_trace]
    H --> I[Review + merge]

CI gates run before any contract change merges:

| Script | What it enforces |
| --- | --- |
| scripts/ci/contracts_validate.sh | Spectral lint + structural validation of OpenAPI/AsyncAPI |
| scripts/ci/contracts_breaking_change.sh | Diff against the previous spec; blocks unannotated breakage |
| scripts/ci/canonical_error_guard.sh | Every error path uses a code from the catalog |
| scripts/ci/audit_mandatory_guard.sh | Privileged mutation handlers write audit_logs |
| scripts/ci/audit_presence_guard.sh | Audit rows present in integration tests |
| scripts/ci/observability_trace_gate.sh | Tracing middleware is wired on every router |

Code generation

scripts/codegen.sh produces:

  • Go: packages/shared/gen/openapigen/, used only at HTTP boundaries (request decode / response encode). Internal services use hand-written domain models with explicit mapping.
  • TypeScript: packages/web/src/types/openapi.d.ts for the Next.js frontend.
  • Python: sdk/python/, the generated client for the Python SDK.

Conventions

| Rule | Why |
| --- | --- |
| Bearer tokens never in query string | Tokens in URL leak to logs/proxies |
| Browser WS auth uses Sec-WebSocket-Protocol | No ?token= allowed; one-time exception is the documented approved transport |
| Mutations are idempotent via X-Idempotency-Key | Safe retries on network failure |
| Stripe webhook: raw body first | Signature verification needs exact bytes |
| Money in minor units (integer) + explicit currency | No float drift |
| Outbox row in same DB tx as domain change | Never lose an event; never emit one that didn't happen |

Open questions / non-frozen areas

These are documented as DESIGNED and have not yet shipped into the public contract:

  • Multi-tenant org hierarchy UX surfaces — schema exists, contract pending
  • User-managed API keys for programmatic auth — deferred per PRD §6
  • Enterprise invoicing / subscriptions — design in Billing_Platform_Overhaul_v1.md
  • Resize-in-place for allocations — explicit non-goal for current pass

Where to look next