Direct REST API
Contract
doc/api/openapi.draft.yaml (33,132 lines) · doc/api/asyncapi.draft.yaml (2,296 lines) · packages/shared/errors
When you don't have a language SDK, talk to the REST API directly. It uses the same authentication model and the same error envelope. This page shows the minimum patterns.
Authentication overview
flowchart TB
Q{Auth mode}
Q --> A1[Human user — browser OIDC PKCE]
Q --> A2[Dev / local — password grant]
Q --> A3[Automation — service account signed assertion]
A1 --> P1[Best for users + their own scripts]
A2 --> P2[Local dev only; never production]
A3 --> P3[CI / agents / cron]
classDef ok fill:#d1e7dd,stroke:#0a3622
class P1,P2,P3 ok
Dev login (fastest for testing locally)
TOKEN=$(curl -s -X POST http://localhost:8080/realms/gpuaas/protocol/openid-connect/token \
-d "grant_type=password" \
-d "client_id=gpuaas-api" \
-d "client_secret=dev-client-secret" \
-d "username=dev-user" \
-d "password=dev123" | jq -r .access_token)
curl -sH "Authorization: Bearer $TOKEN" http://localhost:8443/api/v1/me | jq .
Dev users: dev-user / dev123 (role: user) · dev-admin / admin123 (roles: user + admin).
Browser OIDC PKCE (production human flow)
sequenceDiagram
autonumber
participant C as Your client
participant LH as localhost callback
participant BR as Browser
participant API as cmd/api
participant IDP as IdP
C->>C: generate verifier + S256 challenge
C->>LH: start ephemeral callback listener
C->>API: GET /auth/oidc/authorize?redirect_uri&code_challenge&method=S256
API-->>C: {authorize_url, state}
C->>BR: open authorize_url
BR->>IDP: user logs in
IDP-->>BR: redirect to localhost with code+state
BR->>LH: GET /callback?code&state
LH->>C: capture code, validate state
C->>API: POST /auth/oidc/exchange {code, verifier, redirect_uri}
API-->>C: access_token + refresh_token + exp
C->>LH: stop listener
Endpoints: GET /api/v1/auth/oidc/authorize, POST /api/v1/auth/oidc/exchange.
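The first step of the diagram, generating the verifier and its S256 challenge, can be sketched with the standard library alone. The function names here are illustrative rather than part of any SDK; the encoding follows RFC 7636 (unpadded base64url of the SHA-256 digest).

```python
import base64
import hashlib
import secrets


def make_verifier(n_bytes: int = 32) -> str:
    """Return a random URL-safe code verifier (43 characters for 32 bytes)."""
    raw = secrets.token_bytes(n_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def s256_challenge(verifier: str) -> str:
    """Return BASE64URL(SHA256(verifier)) without padding, per RFC 7636."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


verifier = make_verifier()            # keep private until the exchange step
challenge = s256_challenge(verifier)  # sent as code_challenge with method=S256
```

The verifier stays in your client and is only sent in the final exchange call; the challenge is what travels in the authorize request.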
Service-account flow (automation)
# 1. admin pre-provisions the SA and gives you (sa_id, key_id, signing_key.pem)
# 2. assert + exchange every TTL (default 900s)
ASSERTION=$(python3 -c "
import jwt, time
payload = {
'iss': 'sa-1234', 'sub': 'sa-1234',
'aud': 'gpuaas-api',
'iat': int(time.time()),
'exp': int(time.time()) + 60,
'kid': 'key-2024-01'
}
key = open('/etc/gpuaas/sa.key').read()
print(jwt.encode(payload, key, algorithm='RS256', headers={'kid': 'key-2024-01'}))
")
TOKEN=$(curl -s -X POST https://api.gpuaas.example.com/api/v1/auth/sa/token \
-H 'Content-Type: application/json' \
-d "{\"sa_id\":\"sa-1234\",\"assertion\":\"$ASSERTION\"}" \
| jq -r .access_token)
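Because the token has to be re-minted every TTL, long-running automation usually wraps the exchange in a small cache. The sketch below assumes a hypothetical `fetch` callable that performs the assert + exchange shown above and returns the access token; the 30-second early-refresh margin is an assumption, not a platform rule.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Caches an access token and refreshes it shortly before it expires."""

    def __init__(self, fetch: Callable[[], str], ttl_s: float = 900.0,
                 skew_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch      # hypothetical: does the assert + exchange call
        self._ttl_s = ttl_s      # default TTL of 900s, as stated above
        self._skew_s = skew_s    # assumption: refresh 30s before expiry
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a valid token, re-minting it when it is close to expiry."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._skew_s:
            self._token = self._fetch()
            self._expires_at = now + self._ttl_s
        return self._token
```

Callers then use `cache.get()` wherever `$TOKEN` appears in the curl examples, and never handle expiry themselves.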
Request shape
# Standard authenticated GET
curl -s "https://api.gpuaas.example.com/api/v1/catalog" \
-H "Authorization: Bearer $TOKEN" \
-H "X-Correlation-Id: my-trace-001"
# Idempotent mutation
curl -s -X POST "https://api.gpuaas.example.com/api/v1/allocations" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "X-Idempotency-Key: my-run-001" \
-d '{
"sku": "h200-sxm-slice",
"gpus_total": 1,
"region_code": "us-buffalo-1",
"ssh_key_ids": ["7f2e..."]
}'
Required / important headers:
| Header | When | Why |
|---|---|---|
| `Authorization: Bearer <token>` | Every authenticated request | JWT verified against cached JWKS |
| `X-Correlation-Id` | Optional | Carries through audit + traces + logs; useful for debugging |
| `X-Idempotency-Key` | All mutations | Safe retries; exception: terminal token mint (single-use) |
| `Content-Type: application/json` | Mutations with body | Standard |
| `Stripe-Signature` | Stripe webhook | Verified on raw body before parse |
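A client can centralize these header rules in one helper. This is a minimal sketch with a hypothetical `build_headers` function; the UUID fallback for the idempotency key is an assumption about what makes a reasonable key, not a documented requirement.

```python
import uuid
from typing import Dict, Optional


def build_headers(token: str, *, mutation: bool = False,
                  idempotency_key: Optional[str] = None,
                  correlation_id: Optional[str] = None) -> Dict[str, str]:
    """Assemble request headers following the table above."""
    headers = {"Authorization": f"Bearer {token}"}
    if correlation_id:
        headers["X-Correlation-Id"] = correlation_id
    if mutation:
        headers["Content-Type"] = "application/json"
        # Pick the key once per logical operation and reuse it on every retry.
        headers["X-Idempotency-Key"] = idempotency_key or str(uuid.uuid4())
    return headers


headers = build_headers("example-token", mutation=True,
                        idempotency_key="my-run-001",
                        correlation_id="my-trace-001")
```

For retries to be safe, generate the key outside the retry loop and pass it in explicitly, so each attempt carries the same value.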
Error envelope
Every error response follows this shape:
{
"code": "<catalog_code>",
"message": "Human-readable text",
"correlation_id": "uuid-of-this-request",
"details": { /* required for validation_error */ }
}
flowchart LR
R[REST call] --> ERR{Status code}
ERR -- 200/201/202/204 --> OK[Success body]
ERR -- 400/401/403/404/409/429 --> CLIENT[Client error<br/>code from catalog]
ERR -- 500/502/503/504 --> SERVER[Server error<br/>code = internal_error<br/>or upstream_error]
CLIENT --> CAT[Match against<br/>doc/architecture/Error_Code_Catalog.md]
SERVER --> CAT
classDef ok fill:#d1e7dd,stroke:#0a3622
classDef warn fill:#fff3cd,stroke:#332701
classDef bad fill:#f8d7da,stroke:#42101e
class OK ok
class CLIENT warn
class SERVER bad
→ Full catalog: Error codes
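On the client side, the envelope maps naturally onto a small typed value. The sketch below uses hypothetical names (`ErrorEnvelope`, `parse_error`) and assumes the body is always JSON with the four fields shown; `details` is treated as optional outside `validation_error`.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class ErrorEnvelope:
    status: int
    code: str
    message: str
    correlation_id: str
    details: Dict[str, Any] = field(default_factory=dict)

    @property
    def is_server_error(self) -> bool:
        """True for the 5xx branch of the flowchart above."""
        return self.status >= 500


def parse_error(status: int, body: str) -> ErrorEnvelope:
    """Parse an error response body into an ErrorEnvelope."""
    data = json.loads(body)
    return ErrorEnvelope(
        status=status,
        code=data["code"],
        message=data["message"],
        correlation_id=data["correlation_id"],
        details=data.get("details") or {},
    )


err = parse_error(400, '{"code": "validation_error", "message": "bad sku", '
                       '"correlation_id": "c-1", "details": {"field": "sku"}}')
```

Logging `err.correlation_id` alongside `err.code` gives support enough to find the request in audit and trace data.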
Rate limit headers
429 responses include:
| Header | Meaning |
|---|---|
| `X-RateLimit-Limit` | Configured limit in this window |
| `X-RateLimit-Remaining` | Calls left in the current window |
| `Retry-After` | Seconds to wait before retrying |
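while true; do
  hdrs=$(mktemp)
  # One request: body and status code on stdout, response headers into $hdrs
  resp=$(curl -s -D "$hdrs" -w '\n%{http_code}' "$URL" -H "Authorization: Bearer $TOKEN")
  code=$(printf '%s\n' "$resp" | tail -n 1)
  body=$(printf '%s\n' "$resp" | sed '$d')
  if [ "$code" = "429" ]; then
    # Read Retry-After from the 429 response itself instead of sending a second request
    delay=$(awk 'tolower($1) == "retry-after:" {print $2}' "$hdrs" | tr -d '\r')
    rm -f "$hdrs"
    sleep "${delay:-5}"
  else
    rm -f "$hdrs"
    echo "$body"
    break
  fi
done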
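The same retry loop can be sketched in Python with the transport injected, which keeps it testable without a network. `send` is a hypothetical callable returning `(status, headers, body)`; the 5-second fallback mirrors the shell loop, and the attempt cap is an assumption.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def retry_delay(headers: Mapping[str, str], default_s: float = 5.0) -> float:
    """Seconds to wait, read from Retry-After (case-insensitive), else default."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                break  # the HTTP-date form is not handled in this sketch
    return default_s


def call_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            break
        sleep(retry_delay(headers))
    return status, headers, body
```

If the final attempt is still a 429, the caller receives that response and can decide whether to give up or back off further.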
Provisioning flow (end-to-end via curl)
sequenceDiagram
autonumber
participant U as your script
participant API as cmd/api
U->>API: POST /api/v1/allocations<br/>{sku, gpus_total, region, ssh_key_ids}<br/>X-Idempotency-Key
API-->>U: 201 {id, status=requested}
loop poll
U->>API: GET /api/v1/allocations/{id}
API-->>U: {status, connection?}
end
Note over U,API: status=active → ready
U->>API: POST /api/v1/allocations/{id}/terminal-token
API-->>U: {token, ws_url, expires_in: 300}
U->>API: connect WS to ws_url<br/>Sec-WebSocket-Protocol: token
Note over U,API: terminal session
U->>API: POST /api/v1/allocations/{id}/release<br/>X-Idempotency-Key
API-->>U: 202 {status=releasing}
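The poll loop in the diagram can be sketched as a helper that waits for `status=active`. `get_allocation` is a hypothetical callable wrapping `GET /api/v1/allocations/{id}`; the timeout, the interval and the `failed` status name are assumptions, since this page only shows `requested`, `active` and `releasing`.

```python
import time
from typing import Any, Callable, Dict, Sequence


def wait_until_active(get_allocation: Callable[[], Dict[str, Any]],
                      timeout_s: float = 600.0, interval_s: float = 5.0,
                      failed_statuses: Sequence[str] = ("failed",),
                      clock: Callable[[], float] = time.monotonic,
                      sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Poll until the allocation is active, fails, or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        alloc = get_allocation()
        status = alloc["status"]
        if status == "active":
            return alloc
        if status in failed_statuses:  # assumed terminal failure states
            raise RuntimeError(f"allocation entered status {status!r}")
        if clock() >= deadline:
            raise TimeoutError("allocation did not become active in time")
        sleep(interval_s)
```

Once the helper returns, the response carries the `connection` details and the terminal-token call can follow.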
WebSocket terminal (browser-style)
The auth token rides in the WebSocket subprotocol, never in the URL:
const ws = new WebSocket(tokenResponse.ws_url, [tokenResponse.token]);
// ↑ second arg = Sec-WebSocket-Protocol — that's how auth is delivered
ws.onmessage = (e) => xterm.write(e.data);
xterm.onData((d) => ws.send(d));
Server-side rules:
- No `?token=` in URL: it would leak through logs/proxies.
- Token is single-use, 300s TTL, deleted on first validation.
- Rate-limited per user via `rate_limit.terminal_token_requests_per_minute` (default 10).
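To make the single-use rule concrete, here is an in-memory sketch of a token store with those semantics. It is not the platform's implementation; the class and method names are hypothetical, and a real service would keep this state in a shared store rather than a dict.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple


class TerminalTokenStore:
    """Single-use tokens with a TTL, deleted on first validation."""

    def __init__(self, ttl_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        # token -> (allocation_id, expires_at)
        self._tokens: Dict[str, Tuple[str, float]] = {}

    def mint(self, allocation_id: str) -> str:
        """Create a fresh token bound to one allocation."""
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (allocation_id, self._clock() + self._ttl_s)
        return token

    def redeem(self, token: str) -> Optional[str]:
        """Return the allocation id once; any later attempt returns None."""
        entry = self._tokens.pop(token, None)  # removed even if expired
        if entry is None:
            return None
        allocation_id, expires_at = entry
        return allocation_id if self._clock() < expires_at else None
```

This is also why clients should mint a new token for each WebSocket connection attempt rather than reusing one after a failed handshake.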
Stripe webhook (raw-body-first)
Receiving end (if you build something webhook-shaped):
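# Sketch of the inbound side of a Stripe-style webhook on YOUR system (Flask + stripe-python).
import os

import stripe
from flask import Flask, request

app = Flask(__name__)
secret = os.environ["STRIPE_WEBHOOK_SECRET"]  # assumed env var name

@app.route("/webhook", methods=["POST"])
def webhook():
    raw_body = request.get_data()  # capture BEFORE any JSON parse
    signature = request.headers.get("Stripe-Signature", "")
    try:
        # verifies the signature on the EXACT bytes, then parses the event
        event = stripe.Webhook.construct_event(raw_body, signature, secret)
    except (ValueError, stripe.error.SignatureVerificationError):
        return ("bad signature", 400)
    # dedupe by event["id"] before any side effect
    ...
    return ("", 204)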
This is what cmd/api's /api/v1/payments/webhook does; see Coding_Standards.md §7.
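To show why the raw bytes matter, the signature check can be sketched with the standard library alone. This follows Stripe's documented scheme (HMAC-SHA256 over `timestamp.payload`, carried as `t=...,v1=...`); the function name is hypothetical and the 300-second tolerance is an assumed default you may want to tune.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_stripe_signature(raw_body: bytes, header: str, secret: str,
                            tolerance_s: int = 300,
                            now: Optional[float] = None) -> bool:
    """Check a Stripe-Signature header against the exact request bytes."""
    parts = [p.split("=", 1) for p in header.split(",") if "=" in p]
    timestamp = next((v for k, v in parts if k.strip() == "t"), None)
    candidates = [v for k, v in parts if k.strip() == "v1"]
    if timestamp is None or not candidates:
        return False
    if not (timestamp.isascii() and timestamp.isdigit()):
        return False
    # The signed payload is "<timestamp>.<raw body>"; re-serialized JSON would differ.
    signed = timestamp.encode("ascii") + b"." + raw_body
    expected = hmac.new(secret.encode("utf-8"), signed, hashlib.sha256).hexdigest()
    current = time.time() if now is None else now
    fresh = abs(current - int(timestamp)) <= tolerance_s  # limits replay of old events
    return fresh and any(
        hmac.compare_digest(expected.encode("ascii"), c.encode("utf-8"))
        for c in candidates
    )
```

Parsing the JSON and serializing it again can change whitespace or key order, which is enough to make the digest mismatch; hence the rule to verify before any parse.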
NATS / event subscription (server-to-server)
If you operate a downstream system that needs platform events, subscribe via the AsyncAPI contract (provisioning.*, billing.*, payments.*):
flowchart LR
PROD[GPUaaS outbox-relay] --> NATS[(NATS JetStream)]
NATS --> YC[Your consumer]
YC --> YC2[Process envelope:<br/>event_id, event_type, occurred_at,<br/>version, correlation_id, payload]
YC2 --> YC3[Dedupe by event_id<br/>idempotent handler]
classDef plat fill:#e3f2fd,stroke:#1565c0
classDef yours fill:#e8f5e9,stroke:#2e7d32
class PROD,NATS plat
class YC,YC2,YC3 yours
→ Full subject + payload reference: NATS subjects. Live spec render: AsyncAPI explorer.
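The consumer side of the diagram, processing the envelope and deduping by `event_id`, can be sketched independently of any NATS client. The class below is hypothetical and keeps seen IDs in memory only; a real consumer would persist them, and the event type used in the usage lines is a made-up example rather than a subject from the AsyncAPI contract.

```python
import json
from typing import Any, Callable, Dict, Set


class IdempotentConsumer:
    """Runs a handler at most once per event_id."""

    def __init__(self, handler: Callable[[str, Dict[str, Any]], None]):
        self._handler = handler
        self._seen: Set[str] = set()  # in-memory only in this sketch

    def on_message(self, data: bytes) -> bool:
        """Process one envelope; return False if it was a duplicate."""
        envelope = json.loads(data)
        event_id = envelope["event_id"]
        if event_id in self._seen:
            return False
        self._handler(envelope["event_type"], envelope["payload"])
        self._seen.add(event_id)  # mark only after the handler succeeds
        return True


received = []
consumer = IdempotentConsumer(lambda etype, payload: received.append((etype, payload)))
message = json.dumps({
    "event_id": "e-1", "event_type": "provisioning.example",  # hypothetical type
    "occurred_at": "2024-01-01T00:00:00Z", "version": 1,
    "correlation_id": "c-1", "payload": {"allocation_id": "a-1"},
}).encode()
first = consumer.on_message(message)
second = consumer.on_message(message)  # redelivery of the same event
```

Because the ID is recorded only after the handler returns, a handler that raises leaves the event eligible for redelivery, which is why the handler itself should also be idempotent.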
Live API explorer
The full REST API is rendered with Swagger UI at:
- REST API explorer — interactive
- Raw OpenAPI: openapi.draft.yaml
Where to look next
- CLI — pre-built tool, same surface
- Python SDK — same operations, typed
- Error codes — every code your client may receive
- NATS subjects — for event-driven integrations
- Policy keys — the runtime knobs you'd encounter