
CLI — gpuaas

Implemented

Source: cmd/gpuaas-cli/ (3,687 lines, 19 command groups) · doc/architecture/CLI_v2_Command_Matrix.md · doc/architecture/CLI_Browser_OIDC_PKCE_Login_v1.md

The gpuaas binary is the CLI for operators and users. It uses the same authentication path as the web UI (OIDC PKCE in the browser), covers the full REST API, and renders output as a table, JSON, or YAML.

Install

# from the repo
make build-cli              # outputs ./bin/gpuaas

# or build for a specific OS/arch
GOOS=linux  GOARCH=amd64 make build-cli
GOOS=darwin GOARCH=arm64 make build-cli

The CLI is a single static binary with no runtime dependencies.

Authenticate

gpuaas auth login                                  # browser OIDC PKCE (primary)
gpuaas auth login --provider huggingface           # provider hint
gpuaas auth login --tenant-hint acme               # tenant hint for federation
gpuaas auth login --identity-hint user@acme.example
gpuaas auth login --no-browser                     # print URL, paste code

# dev-only password flows
gpuaas auth dev-login --username dev-user --password dev123
gpuaas auth keycloak-login --username dev-user --password dev123

# automation
gpuaas auth service-account-token \
    --service-account-id ... --key-id ... --client-secret ...

Browser-OIDC PKCE flow (detail)

sequenceDiagram
    autonumber
    participant CLI as gpuaas CLI
    participant LO as localhost:<ephemeral>/callback
    participant BR as Browser
    participant API as cmd/api
    participant IDP as IdP via Keycloak

    CLI->>CLI: generate PKCE verifier + S256 challenge
    CLI->>LO: start ephemeral http listener
    CLI->>API: GET /api/v1/auth/oidc/authorize<br/>?redirect_uri&code_challenge&method=S256&provider_hint&tenant_hint
    API-->>CLI: {authorize_url, opaque_state}
    CLI->>BR: open authorize_url
    BR->>IDP: user completes IdP login
    IDP-->>BR: redirect to localhost callback with code + state
    BR->>LO: GET /callback?code&state
    LO->>CLI: capture code, validate state
    CLI->>API: POST /api/v1/auth/oidc/exchange<br/>{code, verifier, redirect_uri}
    API->>API: verify PKCE, exchange with IdP
    API-->>CLI: access_token + refresh_token + exp
    CLI->>CLI: persist credentials in ~/.gpuaas/credentials.json
    CLI->>LO: shutdown ephemeral listener

The localhost listener binds to an ephemeral port and shuts down immediately after the callback. Because of PKCE, an intercepted authorization code is useless without the verifier, which the CLI holds in memory.

→ Source: CLI_Browser_OIDC_PKCE_Login_v1.md
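
The ephemeral localhost listener and the state check from the flow above might look roughly like this. It is a sketch under assumptions: the handler, the `make_callback_server` name, and the sample `state-123` and `abc` values are hypothetical, and the real CLI may implement this differently.

```python
import hmac
import http.client
import threading
import urllib.parse
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_callback_server(expected_state: str):
    """Bind a loopback listener on an OS-assigned port; return (server, result)."""
    result = {}

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            url = urllib.parse.urlparse(self.path)
            params = urllib.parse.parse_qs(url.query)
            code = params.get("code", [""])[0]
            state = params.get("state", [""])[0]
            # Constant-time comparison of the returned state with the expected one.
            ok = (url.path == "/callback" and bool(code)
                  and hmac.compare_digest(state.encode(), expected_state.encode()))
            if ok:
                result["code"] = code
            self.send_response(200 if ok else 400)
            self.end_headers()

        def log_message(self, *args):  # keep the demo quiet
            pass

    # Port 0 asks the OS for a free ephemeral port.
    return HTTPServer(("127.0.0.1", 0), Handler), result


server, result = make_callback_server("state-123")
port = server.server_address[1]
worker = threading.Thread(target=server.handle_request)  # serve exactly one request
worker.start()

# Stand-in for the browser redirect back to the localhost callback.
conn = http.client.HTTPConnection("127.0.0.1", port)
conn.request("GET", "/callback?code=abc&state=state-123")
status = conn.getresponse().status
conn.close()

worker.join()
server.server_close()  # shut the listener down once the callback is handled
print(status, result)
```

Serving a single request and then closing the socket keeps the listener's lifetime as short as the flow requires.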

Command groups (19 total)

mindmap
  root((gpuaas))
    auth
      login
      dev-login
      keycloak-login
      service-account-token
      whoami
      logout
    catalog
      list
      get
    allocations
      list
      get
      create
      release
      connect
    apps
      catalog
      instances
      artifacts
      entitlements
      shared-runtimes
    iam
      tenants
      projects
      memberships
      service-accounts
    storage
      list
      upload
      download
      mkdir
      rename
      delete
    billing
      balance
      usage
      sessions
      refunds
    nodes
      list
      get
    ops
      overview
      runbooks
    projects
      list
      create
      memberships
    context
      use
      show

Command matrix (13 of the 19 groups are listed here):

| Group | Source file | Highlights |
| --- | --- | --- |
| auth | cmd_auth.go | OIDC login, dev-login, service-account-token, whoami, logout |
| allocations | cmd_allocations.go | list / get / create / release / connect |
| apps | cmd_apps*.go | catalog, instances, artifacts, entitlements, shared-runtimes |
| billing | cmd_billing.go | balance, usage, payment sessions, refunds |
| catalog | cmd_catalog.go | SKU listing |
| context | cmd_context.go | switch tenant/project context |
| iam | cmd_iam.go | tenant/project membership ops |
| introspection | cmd_introspection.go | dump effective scope/role |
| nodes | cmd_nodes.go | (admin) node list/inspect |
| ops | cmd_ops.go | (admin) ops overview + runbooks |
| projects | cmd_projects.go | create/list/memberships |
| service_accounts | cmd_service_accounts.go | create/list/rotate |
| storage | cmd_storage.go | object storage CRUD |

→ Source: CLI_v2_Command_Matrix.md

Typical workflows

Provision and connect (most common path)

# 1. log in (opens browser)
gpuaas auth login

# 2. browse SKUs
gpuaas catalog list

# 3. create a 1-GPU H200 slice allocation
gpuaas allocations create \
    --sku h200-sxm-slice \
    --gpus 1 \
    --region us-buffalo-1 \
    --ssh-key-id 7f2e... \
    --idempotency-key my-run-001

# 4. wait until active
gpuaas allocations get <allocation_id> --watch

# 5. open browser terminal (mints token, opens default browser)
gpuaas allocations connect <allocation_id> --mode terminal

# 6. or get SSH command (uses your registered public key)
gpuaas allocations connect <allocation_id> --mode ssh

# 7. release when done
gpuaas allocations release <allocation_id>

Output formats

gpuaas allocations list                  # human-readable table
gpuaas allocations list --output json    # machine-readable JSON
gpuaas allocations get <id> --output yaml
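
For scripting, the JSON form is the one to parse. The sketch below filters a list payload by status. The payload shape and the `id` and `status` field names are assumptions made for illustration, since this page does not document the schema.

```python
import json

# Hypothetical payload: the real field names are not given in this document.
SAMPLE_STDOUT = (
    '[{"id": "alloc-1", "status": "active"},'
    ' {"id": "alloc-2", "status": "pending"}]'
)


def ids_with_status(stdout: str, status: str) -> list[str]:
    """Return the ids of allocations whose (assumed) status field matches."""
    return [item["id"] for item in json.loads(stdout) if item.get("status") == status]


print(ids_with_status(SAMPLE_STDOUT, "active"))  # ['alloc-1']
```

In a real script, `SAMPLE_STDOUT` would be replaced by the captured stdout of `gpuaas allocations list --output json`.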

Context switching

flowchart LR
    A[gpuaas auth login] --> B[default tenant + project from claims]
    B --> C[gpuaas context use --tenant acme --project ml-research]
    C --> D[subsequent commands scoped to acme/ml-research]
    D --> E[gpuaas context show]
    E --> F["current: tenant=acme project=ml-research<br/>token expires in 14m"]

Context is stored in ~/.gpuaas/context.json. Switching context is a local operation: no API call is made until the next command runs.
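
A local-only context switch amounts to rewriting one small file. The sketch below shows the idea; the `tenant` and `project` keys and both function names are hypothetical, because the actual layout of context.json is not specified here.

```python
import json
import tempfile
from pathlib import Path


def use_context(path: Path, tenant: str, project: str) -> None:
    """Write the selected context to disk; no network call is involved."""
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"tenant": tenant, "project": project}))
    tmp.replace(path)  # rename over the old file so readers never see a partial write


def show_context(path: Path) -> dict:
    """Read the current context back from disk."""
    return json.loads(path.read_text())


# Demo against a temporary directory instead of the real ~/.gpuaas.
demo_path = Path(tempfile.mkdtemp()) / ".gpuaas" / "context.json"
use_context(demo_path, "acme", "ml-research")
print(show_context(demo_path))  # {'tenant': 'acme', 'project': 'ml-research'}
```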

How the CLI talks to the API

sequenceDiagram
    autonumber
    participant CLI
    participant CF as ~/.gpuaas/credentials.json
    participant API
    participant REF as token refresh

    CLI->>CF: load access_token + refresh_token + exp
    alt token expiring within 60s
        CLI->>REF: POST /api/v1/auth/token/refresh
        REF-->>CLI: new access_token + refresh_token (rotated)
        CLI->>CF: persist
    end
    CLI->>API: Authorization: Bearer <token><br/>X-Idempotency-Key (for mutations)
    API->>API: verify + execute
    API-->>CLI: response with correlation_id
    CLI->>CLI: render output (table | json | yaml)
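
The "expiring within 60s" branch in the diagram above reduces to a small time comparison. This sketch assumes `exp` is stored as a Unix timestamp in seconds, which the page does not state; the function name is hypothetical.

```python
import time
from typing import Optional

REFRESH_SKEW_SECONDS = 60  # refresh window taken from the diagram above


def needs_refresh(exp: float, now: Optional[float] = None,
                  skew: float = REFRESH_SKEW_SECONDS) -> bool:
    """True when the token has expired or will expire within `skew` seconds."""
    current = time.time() if now is None else now
    return exp - current <= skew


print(needs_refresh(exp=1000, now=950))  # True: 50s left
print(needs_refresh(exp=1000, now=900))  # False: 100s left
```

Refreshing slightly before expiry avoids sending a token that lapses while the request is in flight.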

Idempotency: mutation commands (allocations create, refunds create, payments checkout) accept --idempotency-key, which is sent as the X-Idempotency-Key header. Reuse the same key when retrying a command so that the retry is safe.
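
Why reusing a key makes a retry safe can be shown with a toy model. The `ToyServer` class below is a hypothetical in-memory stand-in, assuming the API replays the stored result for a key it has already seen; it is not the real service's implementation.

```python
import uuid


class ToyServer:
    """In-memory stand-in for an API that deduplicates by idempotency key."""

    def __init__(self):
        self._results = {}
        self.created = 0

    def create_allocation(self, idempotency_key: str, request: dict) -> dict:
        # A key seen before returns the stored result instead of creating again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.created += 1
        result = {"allocation_id": f"alloc-{self.created}", **request}
        self._results[idempotency_key] = result
        return result


server = ToyServer()
key = str(uuid.uuid4())
request = {"sku": "h200-sxm-slice", "gpus": 1}

first = server.create_allocation(key, request)
retry = server.create_allocation(key, request)  # e.g. after a client-side timeout
print(first == retry, server.created)  # True 1
```

A fresh key, by contrast, is treated as a new request and creates a second allocation.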

Agent-operable design

The CLI is designed so an agent (or operator) can drive it from a script. Specifically:

  • Deterministic exit codes (0 on success; non-zero values map to error codes)
  • Structured JSON output for every command
  • Errors are written to stderr in the same envelope as the REST API ({code, message, correlation_id, details})
  • No interactive prompts unless --interactive is passed
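
A script or agent can build on these properties with a thin wrapper. The sketch below assumes stderr carries only the JSON envelope on failure; `CliError`, `parse_error_envelope`, and `run_gpuaas` are hypothetical names, and the sample error values in the usage lines are made up.

```python
import json
import subprocess


class CliError(Exception):
    """A failed CLI call, carrying the exit code and the error envelope fields."""

    def __init__(self, exit_code: int, envelope: dict):
        super().__init__(envelope.get("message", "unknown error"))
        self.exit_code = exit_code
        self.code = envelope.get("code")
        self.correlation_id = envelope.get("correlation_id")
        self.details = envelope.get("details")


def parse_error_envelope(stderr: str) -> dict:
    """Parse the envelope from stderr, falling back to the raw text."""
    try:
        envelope = json.loads(stderr)
    except json.JSONDecodeError:
        return {"message": stderr.strip()}
    return envelope if isinstance(envelope, dict) else {"message": stderr.strip()}


def run_gpuaas(args: list, binary: str = "gpuaas"):
    """Run a command with JSON output; raise CliError on a non-zero exit."""
    proc = subprocess.run([binary, *args, "--output", "json"],
                          capture_output=True, text=True)
    if proc.returncode != 0:
        raise CliError(proc.returncode, parse_error_envelope(proc.stderr))
    return json.loads(proc.stdout)


# Parsing a made-up envelope, without invoking the binary.
sample = '{"code": "not_found", "message": "no such allocation", "correlation_id": "c-1", "details": {}}'
error = CliError(4, parse_error_envelope(sample))
print(error.exit_code, error.code, error.correlation_id)  # 4 not_found c-1
```

Keeping the correlation_id on the exception makes it easy to quote in a support request or match against server logs.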

→ Source: CLI_Agent_Operable_Control_Plane_v2.md

Runbook

| Issue | Runbook |
| --- | --- |
| Login redirect fails / browser doesn't open | CLI Incident and Support Triage |
| Refresh token rotation rejected | Same runbook §"Refresh anomalies" |
| allocations connect terminal hangs | Terminal Gateway Incident |

Where to look next