
GPUaaS Review Portal

A fact-based review surface for the Core42 GPUaaS platform. Curated for product, architecture, operations, security, and developer teams.

Every claim on this site is traceable to code (cmd/, packages/), contract (openapi.draft.yaml, asyncapi.draft.yaml), schema (db_schema_v1.sql), runbook (doc/operations/runbooks/), or RCA (doc/rca/). The site carries no roadmap and no aspirational content: only what exists today and what has been formally designed.
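The traceability rule can be pictured as a lookup from a repository path to the kind of evidence it represents. The following is a minimal sketch, not the portal's actual tooling: the function name `evidence_kind`, the prefix-matching approach, and the assumption that the contract and schema files sit at the repository root are all hypothetical. Only the paths themselves come from the list above.

```python
from typing import Optional

# Path prefixes and file names taken from the evidence list above.
# The lookup logic is a hypothetical illustration, and it assumes
# repo-relative paths with the contract and schema files at the root.
EVIDENCE_SOURCES = [
    ("cmd/", "code"),
    ("packages/", "code"),
    ("openapi.draft.yaml", "contract"),
    ("asyncapi.draft.yaml", "contract"),
    ("db_schema_v1.sql", "schema"),
    ("doc/operations/runbooks/", "runbook"),
    ("doc/rca/", "rca"),
]


def evidence_kind(path: str) -> Optional[str]:
    """Return the evidence kind for a repo-relative path, or None if untraceable."""
    for prefix, kind in EVIDENCE_SOURCES:
        if path.startswith(prefix):
            return kind
    return None
```

Under these assumptions, any path beneath cmd/ or packages/ resolves to "code", while a path outside the listed locations resolves to None and would not qualify as backing for a claim.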

Pick your entry point

Status legend

Every detail page carries one or more of these badges so reviewers instantly know what level of evidence backs the page.

Implemented: Backed by code in cmd/ or packages/. Callable today.

Contract: Defined in openapi.draft.yaml or asyncapi.draft.yaml.

Designed: A spec exists in doc/ but is not yet end-to-end in code.

Decided: Decision recorded; implementation may be partial.

Runbook: Has an operational runbook.

RCA: Documented post-incident analysis.

Deprecated: Being retired.
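For readers who prefer to see the legend as a data structure, here is a small sketch of the seven badges as an enumeration, plus a check for the "one or more badges" rule. The names `Badge`, `validate_badges`, and `is_callable_today` are hypothetical and are not taken from the platform's code. The descriptions paraphrase the legend.

```python
from enum import Enum
from typing import Sequence


class Badge(Enum):
    """Status badges from the legend; values paraphrase the legend text."""
    IMPLEMENTED = "Backed by code in cmd/ or packages/; callable today"
    CONTRACT = "Defined in openapi.draft.yaml or asyncapi.draft.yaml"
    DESIGNED = "Spec exists in doc/ but is not yet end-to-end in code"
    DECIDED = "Decision recorded; implementation may be partial"
    RUNBOOK = "Has an operational runbook"
    RCA = "Documented post-incident analysis"
    DEPRECATED = "Being retired"


def validate_badges(badges: Sequence[Badge]) -> None:
    """Enforce the 'one or more badges' rule for a detail page (hypothetical check)."""
    if not badges:
        raise ValueError("a detail page must carry at least one badge")


def is_callable_today(badges: Sequence[Badge]) -> bool:
    """Per the legend, only the Implemented badge promises something callable today."""
    return Badge.IMPLEMENTED in badges
```

A page tagged with Contract and Designed only would pass `validate_badges` but return False from `is_callable_today`, which matches the legend's distinction between what is specified and what is built.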

Portal map

flowchart TB
    classDef section fill:#e3f2fd,stroke:#1565c0,color:#0d47a1
    classDef trail   fill:#fff3e0,stroke:#e65100,color:#bf360c
    classDef ref     fill:#f3e5f5,stroke:#6a1b9a,color:#4a148c

    LP[Landing]
    LP --> NOW[What exists today]:::section
    LP --> ARCH[Architecture as-built]:::section
    LP --> PROD[Product]:::section
    LP --> OPS[Operations]:::section
    LP --> DEV[Developers]:::section
    LP --> SEC[Security & Governance]:::section

    LP --> TR[Cross-cutting trails]:::trail
    TR --> TR1[GPU Slice]:::trail
    TR --> TR2[App platform]:::trail
    TR --> TR3[Billing & payments]:::trail
    TR --> TR4[IAM & tenancy]:::trail
    TR --> TR5[Node & MAAS]:::trail
    TR --> TR6[Terminal & sessions]:::trail

    LP --> REF[Reference]:::ref
    REF --> R1[Glossary]:::ref
    REF --> R2[Error codes]:::ref
    REF --> R3[Policy keys]:::ref
    REF --> R4[NATS subjects]:::ref
    REF --> R5[REST API explorer]:::ref
    REF --> R6[AsyncAPI explorer]:::ref

What the platform is

Core42 GPUaaS is a contract-first GPU cloud control plane. Users discover GPU capacity, provision allocations (full bare-metal nodes or GPU slice VMs), open browser terminals or SSH into them, run platform apps (Jupyter, vLLM, Slurm, RKE2), and pay per usage. Admins manage inventory, audit, and refunds.

flowchart LR
    USER([End User]):::actor
    ADMIN([Admin]):::actor
    OPS_USER([Billing Operator]):::actor

    subgraph EDGE[Public Edge]
        WAF[WAF + API Gateway]
    end

    subgraph CP[Control Plane]
        BFF[cmd/api<br/>BFF + all REST routes]
        TG[cmd/terminal-gateway<br/>WS terminal]
        ORCH[Provisioning orchestrator]
        BILL[Billing + ledger]
        PAY[Payments / Stripe]
        AUTH[OIDC / Keycloak]
        ADM[Admin]
    end

    subgraph WK[Workers]
        PW[provisioning-worker]
        BW[billing-worker]
        WW[webhook-worker]
        NR[notification-relay]
        OR[outbox-relay]
        ARW[app-runtime-worker]
    end

    subgraph DATA[Data Plane]
        PG[(PostgreSQL)]
        REDIS[(Redis)]
        NATS[(NATS JetStream)]
        S3[(Object Storage)]
        VAULT[(Vault / KMS)]
    end

    subgraph FLEET[Fleet]
        NA[node-agent on each host]
        MAAS[MAAS bare-metal]
        STRIPE[Stripe]
    end

    USER --> WAF
    ADMIN --> WAF
    OPS_USER --> WAF
    WAF --> BFF
    WAF --> TG

    BFF --> AUTH
    BFF --> ORCH
    BFF --> BILL
    BFF --> PAY
    BFF --> ADM
    BFF --> PG
    BFF --> REDIS
    BFF --> S3

    ORCH --> PG
    ORCH --> NATS
    BILL --> PG
    PAY --> STRIPE

    PW <--> NATS
    BW <--> NATS
    WW <--> NATS
    OR --> NATS
    NR --> NATS
    ARW <--> NATS

    PW <--> NA
    TG <--> NA
    NA --> MAAS

    classDef actor fill:#fff8e1,stroke:#f57f17
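
The flowchart above can also be queried as a graph. The sketch below transcribes its edges and answers "which components have an arrow pointing at X". It is illustrative only: the helper names `build_graph` and `sources_of` are hypothetical, and arrows are read simply as "connects to", an interpretation the diagram does not spell out.

```python
from collections import defaultdict

# Edges transcribed from the flowchart above.
# "-->" is recorded as one-way, "<-->" as two-way.
ONE_WAY = [
    ("USER", "WAF"), ("ADMIN", "WAF"), ("OPS_USER", "WAF"),
    ("WAF", "BFF"), ("WAF", "TG"),
    ("BFF", "AUTH"), ("BFF", "ORCH"), ("BFF", "BILL"), ("BFF", "PAY"),
    ("BFF", "ADM"), ("BFF", "PG"), ("BFF", "REDIS"), ("BFF", "S3"),
    ("ORCH", "PG"), ("ORCH", "NATS"), ("BILL", "PG"), ("PAY", "STRIPE"),
    ("OR", "NATS"), ("NR", "NATS"), ("NA", "MAAS"),
]
TWO_WAY = [
    ("PW", "NATS"), ("BW", "NATS"), ("WW", "NATS"), ("ARW", "NATS"),
    ("PW", "NA"), ("TG", "NA"),
]


def build_graph():
    """Build a source -> set-of-targets adjacency map from the edge lists."""
    graph = defaultdict(set)
    for src, dst in ONE_WAY:
        graph[src].add(dst)
    for a, b in TWO_WAY:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def sources_of(target):
    """Return the sorted components that have an arrow pointing at `target`."""
    graph = build_graph()
    return sorted(src for src, dsts in graph.items() if target in dsts)
```

With this encoding, `sources_of("NA")` returns `["PW", "TG"]`: in the diagram, only provisioning-worker and terminal-gateway connect to node-agent. `sources_of("VAULT")` returns an empty list, because the diagram draws Vault / KMS without any edges.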

See full system context →
Browse what exists today →

Reading suggestions

Audience | Suggested first pages
Product manager | PRD distilled, Personas & journeys, V3 redesign status
Architect | System context, Allocation lifecycle, GPU slice as-built
SRE / Ops | Runbook index, Observability stack, Incident severity model
Developer | Quick start, Coding patterns, Contract workflow
Security reviewer | Threat model, Sanitize-first rules, Audit & compliance
External reviewer | What exists today, Position vs other clouds, GPU slice trail