Python SDK — gpuaas-sdk¶
Implemented
Source: sdk/python/ · package gpuaas_sdk · doc/architecture/CLI_PythonSDK_v1_Plan.md
Strict-typed (mypy --strict) Python client built on httpx. Same authentication paths as the CLI; same OpenAPI contract; same error envelope.
Install¶
Requires Python 3.10+. Single runtime dependency: httpx.
Module map¶
mindmap
  root((gpuaas_sdk))
    client.py
      Client class
      sync + async
      retries
      idempotency keys
      pluggable auth
    auth.py
      OIDC PKCE
      dev-login
      service-account-token
      refresh
    catalog.py
      list_skus
      get_sku
    allocations.py
      list / get
      create / release / connect
      terminal_token
    apps_catalog.py
      list / get manifests
    apps_instances.py
      create / start / stop / release
      members
      events
    apps_artifacts.py
      list / promote
    apps_entitlements.py
      list / grant
    billing.py
      balance
      usage
      payment_sessions
      refunds
    storage.py
      list / upload / download
      rename / delete
      mkdir
    iam.py
      tenants / projects
      memberships
    service_accounts.py
      create / rotate / revoke
    nodes.py
      list / get (admin)
    projects.py
      list / create / memberships
    ops.py
      overview (admin)
    errors.py
      typed exceptions per error_code
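The module map lists retries and idempotency keys side by side under client.py. The sketch below illustrates why the two belong together: the key is chosen once and reused on every attempt, so a server that deduplicates on it does not create a second resource when a retry follows a lost response. `send_with_retries`, `TransientError`, and the backoff parameters are hypothetical names chosen for illustration; this is not the SDK's internal implementation.

```python
import random
import time
import uuid
from typing import Callable, Optional


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (timeout, 503, ...)."""


def send_with_retries(
    send: Callable[[str], dict],
    idempotency_key: Optional[str] = None,
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call send(key) until it succeeds, reusing one idempotency key throughout."""
    # Pick the key once, outside the loop, so every attempt carries the same value.
    key = idempotency_key or str(uuid.uuid4())
    for attempt in range(1, max_attempts + 1):
        try:
            return send(key)
        except TransientError:
            if attempt == max_attempts:
                raise
            # Exponential backoff with full jitter: 0 .. base_delay * 2^(attempt-1) seconds.
            sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
    raise AssertionError("unreachable when max_attempts >= 1")


# Demo: a fake transport that fails twice, then succeeds.
seen_keys: list[str] = []


def flaky_send(key: str) -> dict:
    seen_keys.append(key)
    if len(seen_keys) < 3:
        raise TransientError("simulated 503")
    return {"id": "alloc-1", "idempotency_key": key}


result = send_with_retries(flaky_send, idempotency_key="my-run-001", sleep=lambda s: None)
print(result, seen_keys)
```

Passing an explicit key, as the provisioning example on this page does with idempotency_key="my-run-001", extends the same guarantee across process restarts, which a key generated inside the call cannot do.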
Authentication¶
from pathlib import Path

from gpuaas_sdk import Client
from gpuaas_sdk.auth import BrowserOIDCAuth, DevPasswordAuth, ServiceAccountAuth

# Browser PKCE (primary for humans)
client = Client(
    base_url="https://api.gpuaas.example.com",
    auth=BrowserOIDCAuth(tenant_hint="acme"),
)

# Dev-only password flow
client = Client(
    base_url="http://localhost:8080",
    auth=DevPasswordAuth(username="dev-user", password="dev123"),
)

# Automation: service-account signing key
client = Client(
    base_url="https://api.gpuaas.example.com",
    auth=ServiceAccountAuth(
        sa_id="sa-1234",
        key_id="key-2024-01",
        # read_text() closes the key file after reading
        signing_key_pem=Path("/etc/gpuaas/sa.key").read_text(),
    ),
)
Auth handlers transparently:
- Refresh tokens within 60s of expiry
- Rotate refresh tokens
- Surface token_expired / token_invalid as typed exceptions
Auth refresh flow¶
sequenceDiagram
autonumber
participant App as Your code
participant SDK as gpuaas_sdk.Client
participant AUTH as auth handler
participant API as cmd/api
App->>SDK: client.allocations.list()
SDK->>AUTH: get_access_token()
AUTH->>AUTH: check exp - now < 60s?
alt token near expiry
AUTH->>API: POST /auth/token/refresh
API-->>AUTH: new tokens (rotated refresh)
AUTH->>AUTH: persist
end
AUTH-->>SDK: access_token
SDK->>API: GET /allocations Authorization: Bearer
API-->>SDK: response
SDK-->>App: typed model: list[Allocation]
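The refresh decision in the diagram can be sketched in a few lines. The example below is an illustration under stated assumptions, not the SDK's auth handler: `RefreshingAuth`, `TokenSet`, and the injected `refresh` callable (standing in for POST /auth/token/refresh) are hypothetical, and the persist step is omitted.

```python
import time
from dataclasses import dataclass
from typing import Callable

REFRESH_MARGIN_S = 60  # refresh when fewer than 60 seconds remain, as in the diagram


@dataclass
class TokenSet:
    access_token: str
    refresh_token: str
    expires_at: float  # absolute expiry, seconds since the epoch


class RefreshingAuth:
    """Hypothetical auth handler; not the SDK's actual class."""

    def __init__(
        self,
        tokens: TokenSet,
        refresh: Callable[[str], TokenSet],
        now: Callable[[], float] = time.time,
    ) -> None:
        self._tokens = tokens
        self._refresh = refresh  # stands in for POST /auth/token/refresh
        self._now = now

    def get_access_token(self) -> str:
        if self._tokens.expires_at - self._now() < REFRESH_MARGIN_S:
            # The response carries a rotated refresh token, so replace the whole set.
            self._tokens = self._refresh(self._tokens.refresh_token)
        return self._tokens.access_token


# Demo with a fake clock and a fake refresh endpoint.
clock = [1000.0]
refresh_calls: list[str] = []


def fake_refresh(refresh_token: str) -> TokenSet:
    refresh_calls.append(refresh_token)
    n = len(refresh_calls)
    return TokenSet(f"access-{n}", f"refresh-{n}", clock[0] + 900)


auth = RefreshingAuth(TokenSet("access-0", "refresh-0", 1900.0), fake_refresh, now=lambda: clock[0])
first = auth.get_access_token()   # 900 s left: no refresh
clock[0] = 1850.0
second = auth.get_access_token()  # 50 s left: refresh, tokens rotated
clock[0] = 2700.0
third = auth.get_access_token()   # new expiry is 2750, 50 s left: refresh with the rotated token
print(first, second, third, refresh_calls)
```

Because the refresh token rotates, the second refresh must present the token returned by the first one; reusing the original refresh token would fail against a server that enforces rotation.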
Quick examples¶
Provision and wait for active¶
from gpuaas_sdk import Client
client = Client.from_default_credentials()
alloc = client.allocations.create(
    sku="h200-sxm-slice",
    gpus_total=1,
    region_code="us-buffalo-1",
    ssh_key_ids=["7f2e..."],
    idempotency_key="my-run-001",
)
print(f"allocation {alloc.id} status={alloc.status}")
# poll until active or terminal
final = client.allocations.wait_for(alloc.id, states={"active", "failed"}, timeout=600)
print(f"final state: {final.status}, private_ip={final.connection.private_ip}")
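For readers curious what a wait_for-style helper involves, here is a minimal polling sketch. It assumes nothing about the SDK's internals: `wait_for_state`, its `interval` parameter, and the injectable `clock` and `sleep` are hypothetical, and the fake allocation is a plain dict.

```python
import time
from typing import Callable, Collection, TypeVar

T = TypeVar("T")


def wait_for_state(
    fetch: Callable[[], T],
    get_state: Callable[[T], str],
    states: Collection[str],
    timeout: float,
    interval: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Poll fetch() until get_state(result) is in states, or raise TimeoutError."""
    deadline = clock() + timeout
    while True:
        resource = fetch()
        if get_state(resource) in states:
            return resource
        # Give up if the next poll would start after the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"state not in {sorted(states)} after {timeout}s")
        sleep(interval)


# Demo: a fake allocation that becomes active on the third poll.
responses = iter(["provisioning", "provisioning", "active"])
fake_now = [0.0]


def fake_sleep(seconds: float) -> None:
    fake_now[0] += seconds  # advance the fake clock instead of blocking


final = wait_for_state(
    fetch=lambda: {"id": "alloc-1", "status": next(responses)},
    get_state=lambda a: a["status"],
    states={"active", "failed"},
    timeout=600,
    clock=lambda: fake_now[0],
    sleep=fake_sleep,
)
print(final, fake_now[0])
```

Since "failed" is one of the accepted states, a successful return does not mean the allocation is usable; check the returned status before reading connection details.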
Mint a terminal token + connect¶
tok = client.allocations.mint_terminal_token(alloc.id)
print(f"ws_url={tok.ws_url} expires_in={tok.expires_in}s")
# Then open the WS yourself, or use the connect helper:
# client.allocations.connect(alloc.id, mode="terminal") # opens browser
Streaming usage and balance¶
balance = client.billing.balance()
print(f"balance: {balance.amount_minor / 100:.2f} {balance.currency}")
for record in client.billing.usage(since="2026-04-01T00:00:00Z"):
    print(record.interval_start, record.allocation_id, record.cost_minor)
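The balance line above divides amount_minor by 100, which is only right for currencies with two decimal places. A hedged sketch of a more general conversion follows; `format_minor` and the small exponent table are hypothetical helpers, not part of the SDK, and the table covers only a few ISO 4217 currencies.

```python
from decimal import Decimal

# Minor-unit exponents per ISO 4217 for a few currencies; extend as needed.
MINOR_UNIT_EXPONENT = {"USD": 2, "EUR": 2, "JPY": 0, "KWD": 3}


def format_minor(amount_minor: int, currency: str) -> str:
    """Render an integer amount in minor units as a major-unit string."""
    exponent = MINOR_UNIT_EXPONENT[currency]
    # scaleb shifts the decimal point exactly, with no binary floating-point rounding.
    major = Decimal(amount_minor).scaleb(-exponent)
    return f"{major:.{exponent}f} {currency}"


print(format_minor(12345, "USD"))
print(format_minor(500, "JPY"))
print(format_minor(1500, "KWD"))
```

Keeping the arithmetic in Decimal also avoids float artifacts when summing many cost_minor values before display.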
App-instance lifecycle¶
inst = client.apps.instances.create(
    app_slug="jupyter-cuda-dev",
    project_id=client.context.project_id,
    target_allocation_id=alloc.id,
)
client.apps.instances.wait_for(inst.id, states={"running", "failed"})

# add a worker member (e.g. Slurm)
# alloc2 is a second allocation, created the same way as alloc above
client.apps.instances.members.add(
    instance_id=inst.id,
    allocation_id=alloc2.id,
    role="worker",
)

# stream events
for ev in client.apps.instances.events(inst.id, follow=True):
    print(ev.event_type, ev.payload)
Storage CRUD¶
client.storage.mkdir(path="/datasets/imagenet/")
client.storage.upload(local_path="train.parquet", remote_path="/datasets/imagenet/train.parquet")
for obj in client.storage.list(prefix="/datasets/"):
    print(obj.path, obj.size_bytes, obj.modified_at)
Error handling¶
The SDK maps the API error catalog to typed exceptions:
flowchart LR
API[API error_code] --> EX{Map to exception}
EX --> E1[TokenExpired<br/>TokenInvalid<br/>InsufficientPermissions]
EX --> E2[ValidationError<br/>InvalidRequest]
EX --> E3[AllocationNotFound<br/>AllocationNotActive<br/>SKUUnavailable]
EX --> E4[InsufficientBalance<br/>RefundWindowExceeded]
EX --> E5[RateLimitExceeded]
EX --> E6[UpstreamError<br/>ServiceUnavailable<br/>InternalError]
classDef tx fill:#f8d7da,stroke:#42101e
class E1,E2,E3,E4,E5,E6 tx
from gpuaas_sdk.errors import (
    SDKError, TokenExpired, AllocationNotFound, RateLimitExceeded,
    InsufficientBalance,
)

try:
    client.allocations.release(alloc_id)
except AllocationNotFound:
    print("already gone")
except InsufficientBalance as e:
    print(f"balance too low: {e.message}")
except RateLimitExceeded as e:
    print(f"slow down, retry after {e.retry_after}s")
except SDKError as e:
    # last-resort fallback; includes e.code, e.correlation_id, e.details
    raise
Every exception carries code, message, correlation_id, details — the full API error envelope.
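One way such a mapping can work is a lookup table from error_code to exception class, with the base class as the fallback for codes the client does not recognize. The sketch below is illustrative only: the envelope key names, the `allocation_not_found` and `rate_limit_exceeded` code strings, the location of retry_after inside details, and the `error_from_envelope` helper are all assumptions rather than the SDK's actual wire format.

```python
from typing import Any, Optional


class SDKError(Exception):
    """Base class carrying the full error envelope."""

    def __init__(
        self,
        code: str,
        message: str,
        correlation_id: Optional[str] = None,
        details: Optional[dict[str, Any]] = None,
    ) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.correlation_id = correlation_id
        self.details = details or {}


class TokenExpired(SDKError):
    pass


class AllocationNotFound(SDKError):
    pass


class RateLimitExceeded(SDKError):
    @property
    def retry_after(self) -> Optional[int]:
        # Assumption: the server reports the delay inside details.
        return self.details.get("retry_after")


# Assumed error_code strings; only token_expired appears on this page.
_BY_CODE: dict[str, type[SDKError]] = {
    "token_expired": TokenExpired,
    "allocation_not_found": AllocationNotFound,
    "rate_limit_exceeded": RateLimitExceeded,
}


def error_from_envelope(envelope: dict[str, Any]) -> SDKError:
    """Build the most specific exception for an error envelope."""
    code = envelope.get("error_code", "unknown")
    cls = _BY_CODE.get(code, SDKError)  # unknown codes fall back to the base class
    return cls(
        code=code,
        message=envelope.get("message", ""),
        correlation_id=envelope.get("correlation_id"),
        details=envelope.get("details"),
    )


expired = error_from_envelope(
    {"error_code": "token_expired", "message": "access token expired", "correlation_id": "req-123"}
)
limited = error_from_envelope(
    {"error_code": "rate_limit_exceeded", "message": "too many requests", "details": {"retry_after": 30}}
)
unknown = error_from_envelope({"error_code": "brand_new_code", "message": "?"})
print(type(expired).__name__, limited.retry_after, type(unknown).__name__)
```

Falling back to the base class keeps older clients working when the server adds a new error_code, which is why catching SDKError last, as in the example above, is a sensible default.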
Async usage¶
import asyncio
from gpuaas_sdk import AsyncClient
async def main():
    async with AsyncClient.from_default_credentials() as client:
        allocs = await client.allocations.list()
        results = await asyncio.gather(*(client.allocations.get(a.id) for a in allocs))
        for r in results:
            print(r.id, r.status)

asyncio.run(main())
AsyncClient mirrors the module layout of Client; every call has an async def twin.
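The gather call in the async example starts one request per allocation at once. With a long list, that fan-out may run into the API's rate limit, so a bounded variant can be useful. The sketch below uses asyncio.Semaphore from the standard library; `gather_limited` and `fake_get` are hypothetical, with `fake_get` standing in for client.allocations.get.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def gather_limited(
    items: Iterable[T],
    worker: Callable[[T], Awaitable[R]],
    limit: int = 8,
) -> list[R]:
    """Run worker(item) for every item with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(item: T) -> R:
        async with semaphore:
            return await worker(item)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(run_one(item) for item in items))


# Demo: track how many fake requests overlap.
stats = {"in_flight": 0, "peak": 0}


async def fake_get(alloc_id: str) -> dict:
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    stats["in_flight"] -= 1
    return {"id": alloc_id, "status": "active"}


ids = [f"alloc-{i}" for i in range(10)]
results = asyncio.run(gather_limited(ids, fake_get, limit=3))
print(len(results), stats["peak"])
```

With the real client, the same helper would be called as `await gather_limited(allocs, lambda a: client.allocations.get(a.id))` inside the `async with` block.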
Testing your code¶
The SDK ships with respx fixtures for tests:
import pytest
from gpuaas_sdk.testing import mock_client
def test_provisioning(mock_client):
    mock_client.allocations.fake_create("alloc-1", status="active")
    result = mock_client.allocations.create(sku="h200-sxm-slice", gpus_total=1)
    assert result.id == "alloc-1"
    assert result.status == "active"
The SDK's own tests live under sdk/python/tests/ — test_allocations.py, test_apps.py, test_auth.py, test_billing.py, test_catalog.py, test_errors.py, test_iam.py, test_service_accounts.py, test_storage.py.
Where to look next¶
- CLI — same surface from the shell
- App SDK — for building apps that run on GPUaaS
- Direct REST API — for languages without an SDK
- End-to-end quick start — Python flow end-to-end
- Source: CLI_PythonSDK_v1_Plan.md