CLI — gpuaas
Implemented
cmd/gpuaas-cli/ (3,687 lines, 19 command groups) · doc/architecture/CLI_v2_Command_Matrix.md · doc/architecture/CLI_Browser_OIDC_PKCE_Login_v1.md
The gpuaas binary is the operator/user CLI. It uses the same authentication path as the web UI (OIDC with PKCE in the browser), offers full REST coverage, and renders output as a table, JSON, or YAML.
Install
```bash
# from the repo
make build-cli  # outputs ./bin/gpuaas

# or build for a specific OS/arch
GOOS=linux GOARCH=amd64 make build-cli
GOOS=darwin GOARCH=arm64 make build-cli
```
The CLI is a single static binary with no runtime dependencies.
Authenticate
```bash
gpuaas auth login                                   # browser OIDC PKCE (primary)
gpuaas auth login --provider huggingface            # provider hint
gpuaas auth login --tenant-hint acme                # tenant hint for federation
gpuaas auth login --identity-hint user@acme.example
gpuaas auth login --no-browser                      # print URL, paste code

# dev-only password flows
gpuaas auth dev-login --username dev-user --password dev123
gpuaas auth keycloak-login --username dev-user --password dev123

# automation
gpuaas auth service-account-token \
  --service-account-id ... --key-id ... --client-secret ...
```
Browser-OIDC PKCE flow (detail)
```mermaid
sequenceDiagram
    autonumber
    participant CLI as gpuaas CLI
    participant LO as localhost:<ephemeral>/callback
    participant BR as Browser
    participant API as cmd/api
    participant IDP as IdP via Keycloak
    CLI->>CLI: generate PKCE verifier + S256 challenge
    CLI->>LO: start ephemeral http listener
    CLI->>API: GET /api/v1/auth/oidc/authorize<br/>?redirect_uri&code_challenge&method=S256&provider_hint&tenant_hint
    API-->>CLI: {authorize_url, opaque_state}
    CLI->>BR: open authorize_url
    BR->>IDP: user completes IdP login
    IDP-->>BR: redirect to localhost callback with code + state
    BR->>LO: GET /callback?code&state
    LO->>CLI: capture code, validate state
    CLI->>API: POST /api/v1/auth/oidc/exchange<br/>{code, verifier, redirect_uri}
    API->>API: verify PKCE, exchange with IdP
    API-->>CLI: access_token + refresh_token + exp
    CLI->>CLI: persist credentials in ~/.gpuaas/credentials.json
    CLI->>LO: shutdown ephemeral listener
```
The localhost listener uses an ephemeral port and shuts down immediately after the callback. PKCE means the code is useless without the verifier the CLI holds in memory.
→ Source: CLI_Browser_OIDC_PKCE_Login_v1.md
Command groups (19 total)
```mermaid
mindmap
  root((gpuaas))
    auth
      login
      dev-login
      keycloak-login
      service-account-token
      whoami
      logout
    catalog
      list
      get
    allocations
      list
      get
      create
      release
      connect
    apps
      catalog
      instances
      artifacts
      entitlements
      shared-runtimes
    iam
      tenants
      projects
      memberships
      service-accounts
    storage
      list
      upload
      download
      mkdir
      rename
      delete
    billing
      balance
      usage
      sessions
      refunds
    nodes
      list
      get
    ops
      overview
      runbooks
    projects
      list
      create
      memberships
    context
      use
      show
```
Full command matrix:
| Group | Source file | Highlights |
|---|---|---|
| `auth` | `cmd_auth.go` | OIDC login, dev-login, service-account-token, whoami, logout |
| `allocations` | `cmd_allocations.go` | list / get / create / release / connect |
| `apps` | `cmd_apps*.go` | catalog, instances, artifacts, entitlements, shared-runtimes |
| `billing` | `cmd_billing.go` | balance, usage, payment sessions, refunds |
| `catalog` | `cmd_catalog.go` | SKU listing |
| `context` | `cmd_context.go` | switch tenant/project context |
| `iam` | `cmd_iam.go` | tenant/project membership ops |
| `introspection` | `cmd_introspection.go` | dump effective scope/role |
| `nodes` | `cmd_nodes.go` | (admin) node list/inspect |
| `ops` | `cmd_ops.go` | (admin) ops overview + runbooks |
| `projects` | `cmd_projects.go` | create/list/memberships |
| `service_accounts` | `cmd_service_accounts.go` | create/list/rotate |
| `storage` | `cmd_storage.go` | object storage CRUD |
→ Source: CLI_v2_Command_Matrix.md
Typical workflows
Provision and connect (most common path)
```bash
# 1. log in (opens browser)
gpuaas auth login

# 2. browse SKUs
gpuaas catalog list

# 3. create a 1-GPU H200 slice allocation
gpuaas allocations create \
  --sku h200-sxm-slice \
  --gpus 1 \
  --region us-buffalo-1 \
  --ssh-key-id 7f2e... \
  --idempotency-key my-run-001

# 4. wait until active
gpuaas allocations get <allocation_id> --watch

# 5. open browser terminal (mints token, opens default browser)
gpuaas allocations connect <allocation_id> --mode terminal

# 6. or get SSH command (uses your registered public key)
gpuaas allocations connect <allocation_id> --mode ssh

# 7. release when done
gpuaas allocations release <allocation_id>
```
Output formats
```bash
gpuaas allocations list                 # human-readable table
gpuaas allocations list --output json   # machine-readable JSON
gpuaas allocations get <id> --output yaml
```
Context switching
```mermaid
flowchart LR
    A[gpuaas auth login] --> B[default tenant + project from claims]
    B --> C[gpuaas context use --tenant acme --project ml-research]
    C --> D[subsequent commands scoped to acme/ml-research]
    D --> E[gpuaas context show]
    E --> F["current: tenant=acme project=ml-research<br/>token expires in 14m"]
```
Context is stored in ~/.gpuaas/context.json. Switching context is a local operation: no API call is made until the next command runs.
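To illustrate what "local" means here, the sketch below switches context by rewriting a JSON file and nothing else. The field names `tenant` and `project`, the helper names, and the atomic-rename approach are all assumptions for illustration; the real schema of context.json is not documented on this page.

```python
import json
import os
import tempfile
from pathlib import Path


def use_context(path, tenant, project):
    """Write the active tenant/project to the context file (hypothetical schema)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the target,
    # so a concurrent reader never sees a half-written file.
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=".context-")
    with os.fdopen(fd, "w") as fh:
        json.dump({"tenant": tenant, "project": project}, fh)
    os.replace(tmp, path)


def show_context(path):
    """Read the active context back from disk."""
    return json.loads(Path(path).read_text())
```

No network access is involved in either function, which matches the behaviour described above: the API only sees the new scope when the next command sends a request.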
How the CLI talks to the API
```mermaid
sequenceDiagram
    autonumber
    participant CLI
    participant CF as ~/.gpuaas/credentials.json
    participant API
    participant REF as token refresh
    CLI->>CF: load access_token + refresh_token + exp
    alt token expiring within 60s
        CLI->>REF: POST /api/v1/auth/token/refresh
        REF-->>CLI: new access_token + refresh_token (rotated)
        CLI->>CF: persist
    end
    CLI->>API: Authorization: Bearer <token><br/>X-Idempotency-Key (for mutations)
    API->>API: verify + execute
    API-->>CLI: response with correlation_id
    CLI->>CLI: render output (table | json | yaml)
```
Idempotency: mutation commands (allocations create, refunds create, payments checkout) accept --idempotency-key. Reuse the same key to retry safely.
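The refresh check and header construction from the diagram can be sketched as follows. The credential keys `access_token` and `exp` come from the diagram; treating `exp` as a Unix timestamp in seconds, generating a UUID when no key is supplied, and the helper names are assumptions.

```python
import time
import uuid

# The diagram refreshes when the token expires within 60 seconds.
REFRESH_SKEW_SECONDS = 60


def needs_refresh(creds, now=None):
    """True if the access token expires within the refresh window.

    Assumes creds["exp"] is a Unix timestamp in seconds.
    """
    now = time.time() if now is None else now
    return creds["exp"] - now <= REFRESH_SKEW_SECONDS


def build_headers(creds, mutation=False, idempotency_key=None):
    """Build request headers; mutations also carry X-Idempotency-Key."""
    headers = {"Authorization": f"Bearer {creds['access_token']}"}
    if mutation:
        # Reuse the caller's key on retries; fall back to a fresh UUID otherwise.
        headers["X-Idempotency-Key"] = idempotency_key or str(uuid.uuid4())
    return headers
```

A retrying caller would pass the same `idempotency_key` on every attempt, for example the `my-run-001` value used in the provisioning workflow.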
Agent-operable design
The CLI is designed so an agent (or operator) can drive it from a script. Specifically:
- Deterministic exit codes (`0` = success, non-zero = mapped error code)
- Structured JSON output for every command
- Errors return the same envelope as the REST API (`{code, message, correlation_id, details}`) on stderr
- No interactive prompts unless `--interactive` is passed
→ Source: CLI_Agent_Operable_Control_Plane_v2.md
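A script or agent could rely on these properties roughly as sketched below. The envelope fields are the ones listed above; the `CliError` class, the helper names, and the fallback for non-JSON stderr are assumptions, and the shape of each command's JSON output is not specified on this page.

```python
import json
import subprocess


class CliError(Exception):
    """A non-zero exit, carrying the error envelope read from stderr."""

    def __init__(self, exit_code, envelope):
        super().__init__(f"{envelope.get('code')}: {envelope.get('message')}")
        self.exit_code = exit_code
        self.envelope = envelope


def interpret(returncode, stdout, stderr):
    """Turn a finished CLI invocation into parsed JSON or a CliError."""
    if returncode == 0:
        return json.loads(stdout)
    try:
        envelope = json.loads(stderr)
    except json.JSONDecodeError:
        envelope = None
    if not isinstance(envelope, dict):
        # Fallback (assumption): keep the raw stderr text if it is not an envelope.
        envelope = {"code": "unknown", "message": stderr.strip()}
    raise CliError(returncode, envelope)


def run_gpuaas(*args):
    """Run `gpuaas <args> --output json` and interpret the result."""
    proc = subprocess.run(
        ["gpuaas", *args, "--output", "json"],
        capture_output=True,
        text=True,
    )
    return interpret(proc.returncode, proc.stdout, proc.stderr)
```

Keeping `interpret` separate from the subprocess call makes the parsing logic testable without a gpuaas binary on the path. A caller would typically log `err.envelope.get("correlation_id")` when reporting a failure.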
Runbook
| Issue | Runbook |
|---|---|
| Login redirect fails / browser doesn't open | CLI Incident and Support Triage |
| Refresh token rotation rejected | Same runbook §"Refresh anomalies" |
| `allocations connect` terminal hangs | Terminal Gateway Incident |
Where to look next
- Python SDK — same functionality from Python code
- Direct REST API — when you need surfaces the CLI doesn't expose yet
- End-to-end quick start — CLI + curl combined
- Error codes — every code the CLI may surface