App Developer Starter Pack v1
As of: March 30, 2026
Purpose
Provide one entrypoint for an app developer who wants to build against the GPUaaS App Platform without reverse-engineering the rest of the repo.
This document answers:
1. what to read first,
2. which APIs are the source of truth,
3. how IAM and resource hierarchy work,
4. what SDK and CLI support exists today,
5. what is implemented now versus still directional.
This is the shortest document set that should be handed to:
- an internal app team,
- an external platform-app team,
- an agent building or operating an app on behalf of a team.
Start Here
Read these in order:
- API contract:
  - doc/api/openapi.draft.yaml
  - doc/api/asyncapi.draft.yaml
- build path:
  - doc/architecture/Build_an_App_for_GPUaaS_v1.md
- external integration boundary:
  - doc/architecture/External_App_Team_Integration_Guide_v1.md
- app control plane:
  - doc/architecture/App_Control_Plane_v1.md
- app worker contract direction:
  - doc/architecture/App_Runtime_External_Worker_Contract_v1.md
- quickstart:
  - doc/architecture/App_Platform_Quickstart_v1.md
- UI integration:
  - doc/architecture/App_UI_Extension_Model_v1.md
- manifest and version onboarding:
  - doc/architecture/App_Manifest_Registration_Guide_v1.md
If you are building a clustered or scheduler-style app, also read:
- doc/architecture/Example_App_Developer_Reference_Workflow_v1.md
- doc/architecture/Slurm_Tenant_Scope_Semantics_v1.md
- doc/architecture/App_Tenant_Shared_Attachment_Model_v1.md
What The Platform Gives You
GPUaaS gives an app developer:
- identity and IAM
- project and tenant resource hierarchy
- app catalog and entitlement surfaces
- app instance and shared runtime resource models
- allocation and placement primitives
- service-account and delegated machine identity
- access-credential custody and delivery
- app-managed bootstrap SSH trust reconcile for app-instance-bound node bootstrap
- billing attribution primitives
- audit and correlation surfaces
- CLI and SDK clients over the same public API
The platform does not expect the app developer to:
- access the database directly
- patch platform-core code for every new runtime
- rely on undocumented routes
- build against internal Go package behavior as the contract
- require operators to edit authorized_keys manually as part of the normal app bootstrap path
Platform Mental Model
GPUaaS is a control plane for infrastructure and platform apps.
That means:
- the platform owns identity, IAM, resource ownership, allocation lifecycle, billing attribution, audit, and common UX shells
- the app developer owns runtime-specific controller logic, runtime-specific bootstrap, runtime-specific health, and runtime-specific operational behavior
The easiest way to think about GPUaaS is:
Infrastructure control plane
  -> capacity, allocations, identity, billing, audit
App control plane
  -> catalog, entitlements, app instances, shared runtimes, operations
App-owned runtime logic
  -> install, configure, bootstrap, reconcile, recover, report
The app platform is not asking the app developer to invent tenancy, auth, billing, or secure credential delivery. It is asking the app developer to implement runtime intelligence on top of those primitives.
Resource Hierarchy
The core ownership hierarchy is:
Organization (tenant)
  -> Project
    -> Users and memberships
    -> Service accounts
    -> Project-owned app instances
For tenant-shared runtimes, the hierarchy extends to:
Organization (tenant)
  -> Shared app runtime
    -> Shared runtime attachments
      -> Attached consumer projects
    -> Shared workers
    -> Shared worker operations
Read:
- doc/architecture/App_Control_Plane_v1.md
- doc/architecture/App_Tenant_Shared_Attachment_Model_v1.md
- doc/architecture/App_Tenant_Shared_Runtime_API_Direction_v1.md
Implementation Model
An app on GPUaaS is not just a UI card in a catalog.
The implementation model has four layers:
1. Catalog layer
Defines:
- app slug
- published versions
- entitlement rules
- optional version metadata and artifact references
2. Control-plane resource layer
Defines operator-facing resources such as:
- project app instances
- tenant-shared runtimes
- attachments
- workers
- operations
These resources are what humans, agents, SDKs, and the platform shell interact with.
3. Worker/operator layer
This is app-owned runtime logic.
The worker/operator:
- reads runtime state from public APIs
- consumes placement and credentials
- reconciles any app-owned bootstrap trust it needs onto the selected node user through the supported platform path
- bootstraps or reconfigures the runtime
- reports progress and failure back through public APIs
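The worker/operator responsibilities above can be sketched as a single reconcile pass. This is a minimal illustration, not the platform's worker contract: the `WorkerSteps` callables, the `reconcile_once` function, and the status strings (`in_progress`, `succeeded`, `failed`) are all hypothetical names chosen for the example. The real request and report shapes live in the OpenAPI document.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class WorkerSteps:
    """Hypothetical callables supplied by the app team; none are platform APIs."""
    read_state: Callable[[], Dict]            # read runtime state from public APIs
    get_placement: Callable[[Dict], Dict]     # consume placement and credentials
    reconcile_trust: Callable[[Dict], None]   # reconcile bootstrap trust via the supported path
    bootstrap: Callable[[Dict, Dict], None]   # bootstrap or reconfigure the runtime
    report: Callable[[str, str], None]        # report progress or failure back


def reconcile_once(steps: WorkerSteps) -> str:
    """Run one pass of the worker loop and return the final reported status."""
    try:
        state = steps.read_state()
        steps.report("in_progress", "state read")
        placement = steps.get_placement(state)
        steps.reconcile_trust(placement)
        steps.bootstrap(state, placement)
    except Exception as exc:
        # Report the failure instead of letting the worker die silently.
        steps.report("failed", str(exc))
        return "failed"
    steps.report("succeeded", "runtime reconciled")
    return "succeeded"
```

Keeping each step behind a callable means the runtime-specific logic (Slurm, Ray, and so on) stays in app-owned code while the control flow stays the same.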
4. Runtime/data-plane layer
This is the actual software the app team cares about:
- Slurm
- Ray
- MLflow
- model gateways
- other distributed runtimes
GPUaaS does not want to absorb that runtime-specific SME logic into the platform core unless it proves to be a reusable primitive.
Three-Axis Runtime Model
Every serious app team needs to understand these three fields because they describe how the runtime is actually deployed and governed.
1. operating_mode
This says who operates the service shape.
Current values:
- tenant_dedicated
- platform_managed
Meaning:
- tenant_dedicated: the runtime is tenant-owned or tenant-isolated
- platform_managed: future shared platform-operated service model
2. control_plane_scope
This says where the runtime control plane lives.
Current values:
- project
- tenant
- platform
Meaning:
- project: one project owns and operates its own runtime
- tenant: one tenant-owned runtime may serve multiple attached projects
- platform: future shared platform-operated runtime
3. tenant_boundary_mode
This says what isolation guarantee the runtime is expected to provide.
Current values are documented in the API/model docs and should be read back from the effective resource shape.
This field exists because "tenant scope" and "shared substrate" are not the same thing.
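One way to keep the three axes explicit in app code is a small typed record. The sketch below assumes the three fields appear as top-level keys of a JSON resource already decoded into a dict; that placement, and the `RuntimeShape` and `parse_runtime_shape` names, are assumptions for illustration. `tenant_boundary_mode` is kept as an opaque string because its values are defined in the API/model docs, not here.

```python
from dataclasses import dataclass
from enum import Enum


class OperatingMode(str, Enum):
    TENANT_DEDICATED = "tenant_dedicated"
    PLATFORM_MANAGED = "platform_managed"


class ControlPlaneScope(str, Enum):
    PROJECT = "project"
    TENANT = "tenant"
    PLATFORM = "platform"


@dataclass(frozen=True)
class RuntimeShape:
    operating_mode: OperatingMode
    control_plane_scope: ControlPlaneScope
    tenant_boundary_mode: str  # opaque here; read it back from the effective resource


def parse_runtime_shape(resource: dict) -> RuntimeShape:
    """Build a RuntimeShape from a decoded resource (hypothetical layout).

    Raises ValueError for an unknown operating_mode or control_plane_scope,
    and KeyError if any of the three fields is missing.
    """
    return RuntimeShape(
        operating_mode=OperatingMode(resource["operating_mode"]),
        control_plane_scope=ControlPlaneScope(resource["control_plane_scope"]),
        tenant_boundary_mode=str(resource["tenant_boundary_mode"]),
    )
```

Failing loudly on an unknown value is deliberate: a worker that silently treats a new scope as `project` could act with the wrong ownership assumptions.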
Product-Facing Placement And Ownership Modes
For practical conversation with app teams, these combinations are the useful shorthand:
Project-scoped mode
Usually means:
- operating_mode = tenant_dedicated
- control_plane_scope = project
Use when:
- each project wants its own isolated runtime
- cross-project sharing is not wanted
- billing and ownership should stay simple
Tenant-owned shared mode
Usually means:
- operating_mode = tenant_dedicated
- control_plane_scope = tenant
Use when:
- one tenant-owned control plane should serve multiple projects
- sharing is explicit and policy-controlled
- worker contribution or job submission may come from attached projects
Platform-managed shared mode
Usually means:
- operating_mode = platform_managed
- control_plane_scope = platform
Use when:
- the runtime is eventually offered as a platform-operated shared service
This is modeled directionally but should not be treated as the default shipped path for new apps yet.
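The three shorthand modes in this section reduce to a lookup on two of the axes. The sketch below is illustrative only: the `product_mode` function and the label strings are hypothetical, and combinations the document does not name return `None` instead of guessing.

```python
from typing import Optional

# (operating_mode, control_plane_scope) -> product-facing shorthand
_MODES = {
    ("tenant_dedicated", "project"): "project-scoped",
    ("tenant_dedicated", "tenant"): "tenant-owned shared",
    ("platform_managed", "platform"): "platform-managed shared",
}


def product_mode(operating_mode: str, control_plane_scope: str) -> Optional[str]:
    """Return the shorthand mode name, or None for an unnamed combination."""
    return _MODES.get((operating_mode, control_plane_scope))
```

Because the section says "usually means", treat the result as a conversational label and keep reading the actual field values for any real decision.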
IAM And Machine Identity
App developers need two identity models today:
1. Project-scoped service account
Use for:
- project-owned app instances
- project-scoped automation
- project-scoped access-credential delivery
Read:
- doc/architecture/Service_Account_Model.md
- doc/architecture/Role_and_Policy_Lifecycle_Model.md
2. Tenant-shared runtime operator identity
Use for:
- tenant-owned shared runtimes
- shared runtime read/report flows
- shared worker and attachment reporting
Read:
- doc/architecture/Tenant_Scoped_App_Machine_Identity_v1.md
- doc/architecture/Shared_Runtime_Operator_Authz_Model_v1.md
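The choice between the two identity models follows from where the runtime's control plane lives. A minimal sketch, assuming the decision is keyed on `control_plane_scope`; the `identity_for` function and the returned labels are hypothetical and only restate the two headings in this section.

```python
_IDENTITY_BY_SCOPE = {
    "project": "project-scoped service account",
    "tenant": "tenant-shared runtime operator identity",
}


def identity_for(control_plane_scope: str) -> str:
    """Pick the identity model for a runtime scope.

    Raises ValueError for scopes with no identity model described in this
    document (for example the directional "platform" scope).
    """
    try:
        return _IDENTITY_BY_SCOPE[control_plane_scope]
    except KeyError:
        raise ValueError(
            f"no documented identity model for scope {control_plane_scope!r}"
        ) from None
```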
API Surfaces App Developers Should Use
The source of truth is always:
- doc/api/openapi.draft.yaml
- doc/api/asyncapi.draft.yaml
The main API families relevant to app developers are:
Catalog and entitlement
- GET /api/v1/apps/catalog
- GET /api/v1/apps/catalog/{app_slug}/versions
- GET /api/v1/projects/{project_id}/apps/entitlements
- PUT /api/v1/projects/{project_id}/apps/entitlements/{app_slug}
Project-owned app instances
- GET /api/v1/projects/{project_id}/app-instances
- POST /api/v1/projects/{project_id}/app-instances
- GET /api/v1/projects/{project_id}/app-instances/{app_instance_id}
- DELETE /api/v1/projects/{project_id}/app-instances/{app_instance_id}
- POST /api/v1/projects/{project_id}/app-instances/{app_instance_id}/upgrade
- POST /api/v1/projects/{project_id}/app-instances/{app_instance_id}/rollback
- POST /api/v1/projects/{project_id}/app-instances/{app_instance_id}/decommission
Generic clustered app member flows
- GET /api/v1/projects/{project_id}/app-instances/{app_instance_id}/members
- GET /api/v1/projects/{project_id}/app-instances/{app_instance_id}/members/{member_id}
- POST /api/v1/projects/{project_id}/app-instances/{app_instance_id}/member-operations
- GET /api/v1/projects/{project_id}/app-instances/{app_instance_id}/member-operations/{operation_id}
Tenant-shared runtimes
- GET /api/v1/orgs/{org_id}/shared-app-runtimes
- POST /api/v1/orgs/{org_id}/shared-app-runtimes
- GET /api/v1/orgs/{org_id}/shared-app-runtimes/{shared_runtime_id}
- DELETE /api/v1/orgs/{org_id}/shared-app-runtimes/{shared_runtime_id}
Tenant-shared attachments
- GET /api/v1/orgs/{org_id}/shared-app-runtimes/{shared_runtime_id}/attachments
- POST /api/v1/orgs/{org_id}/shared-app-runtimes/{shared_runtime_id}/attachments
- GET /api/v1/orgs/{org_id}/shared-app-runtimes/{shared_runtime_id}/attachments/{attachment_id}
- DELETE /api/v1/orgs/{org_id}/shared-app-runtimes/{shared_runtime_id}/attachments/{attachment_id}
Tenant-shared workers and operations
- GET /api/v1/orgs/{org_id}/shared-app-runtimes/{shared_runtime_id}/workers
- GET /api/v1/orgs/{org_id}/shared-app-runtimes/{shared_runtime_id}/workers/{worker_id}
- POST /api/v1/orgs/{org_id}/shared-app-runtimes/{shared_runtime_id}/worker-operations
- GET /api/v1/orgs/{org_id}/shared-app-runtimes/{shared_runtime_id}/worker-operations/{operation_id}
Project-contributed worker flow
- POST /api/v1/projects/{project_id}/shared-runtime-attachments/{attachment_id}/worker-operations
Reporting and credential/placement support
- service-account token mint
- shared-runtime operator token mint
- allocation reads
- access-credential delivery
The exact request and response bodies live in OpenAPI and remain authoritative.
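The split between the project-owned and tenant-shared API families shows up directly in the URL shape. The sketch below builds URLs for the two families and prepares a GET request without sending it, using only the Python standard library. The host name, the helper names, and the use of a bearer token in the `Authorization` header are assumptions for illustration; the authoritative auth scheme and paths are whatever the OpenAPI document specifies.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://gpuaas.example.invalid"  # hypothetical host
_RUNTIME_CHILDREN = {"attachments", "workers", "worker-operations"}


def app_instance_url(project_id: str, app_instance_id: str = "") -> str:
    """URL for the project-owned app instance collection or a single instance."""
    path = f"/api/v1/projects/{quote(project_id, safe='')}/app-instances"
    if app_instance_id:
        path += f"/{quote(app_instance_id, safe='')}"
    return BASE_URL + path


def shared_runtime_url(org_id: str, shared_runtime_id: str = "", child: str = "") -> str:
    """URL for the tenant-shared runtime collection, one runtime, or a child collection."""
    path = f"/api/v1/orgs/{quote(org_id, safe='')}/shared-app-runtimes"
    if child and not shared_runtime_id:
        raise ValueError("a child collection requires shared_runtime_id")
    if shared_runtime_id:
        path += f"/{quote(shared_runtime_id, safe='')}"
    if child:
        if child not in _RUNTIME_CHILDREN:
            raise ValueError(f"unknown child collection: {child!r}")
        path += f"/{child}"
    return BASE_URL + path


def authed_get(url: str, token: str) -> Request:
    """Build (but do not send) a GET request. Bearer auth is an assumption."""
    return Request(
        url,
        method="GET",
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )
```

A request built this way could be passed to `urllib.request.urlopen` once the real host and token are known; the existing SDKs and CLI cover the same public API and are usually the better starting point.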
SDK And CLI Support
Current operator-facing tooling is real, not placeholder-only.
Go SDK
Use:
- pkg/sdk
Current relevant support includes:
- shared runtimes
- attachments
- shared workers
- shared worker operations
Example:
- pkg/sdk/shared_runtimes.go
Python SDK
Use:
- packages/python-sdk
Current relevant support includes:
- catalog
- allocations
- billing
- terminal token minting
- shared runtimes
- attachments
- shared workers
- shared worker operations
Read:
- packages/python-sdk/README.md
CLI
Use:
- cmd/gpuaas-cli
Current relevant support includes:
- gpuaas apps shared-runtimes ...
- gpuaas schema <resource>
- gpuaas explain <command>
- gpuaas mcp serve
Read:
- doc/architecture/CLI_Agent_Operable_Control_Plane_v2.md
- doc/architecture/CLI_PythonSDK_v1_Plan.md
Implemented Now vs Still Directional
Implemented now
- app catalog and entitlement APIs
- project app instance lifecycle APIs
- tenant-shared runtime, attachment, worker, and worker-operation APIs
- service-account auth for project automation
- shared-runtime operator auth for tenant-shared runtime automation
- allocation read APIs for placement and bootstrap
- access-credential delivery APIs
- app shell extension seam
- Go SDK and Python SDK coverage for shared runtimes
- CLI coverage for shared runtimes and introspection
Still directional or incomplete
- fully externalized app-worker delivery model
- manifest-based third-party app registration flow
- schema-backed app manifest validation and deploy-form generation as the primary onboarding path
- final external app-worker packaging story
- final public transport choice for app-worker delivery if NATS is exposed directly
Minimum Package To Hand To An App Team Today
If you had to hand an app team only one short package today, it should be:
1. doc/api/openapi.draft.yaml
2. doc/api/asyncapi.draft.yaml
3. doc/architecture/App_Developer_Starter_Pack_v1.md
4. doc/architecture/Build_an_App_for_GPUaaS_v1.md
5. doc/architecture/External_App_Team_Integration_Guide_v1.md
6. doc/architecture/App_Runtime_External_Worker_Contract_v1.md
7. packages/python-sdk/README.md
8. doc/architecture/App_Manifest_Registration_Guide_v1.md
For UI-heavy apps, also include:
9. doc/architecture/App_UI_Extension_Model_v1.md
Dry-Run Questions This Package Should Answer
An app developer should be able to answer these from the docs:
- How do I authenticate my app worker?
- Which API family do I use for project-owned versus tenant-shared runtimes?
- How do I read placement and allocation data?
- How do I get bootstrap credentials securely?
- How do I report runtime status and operation outcomes?
- Which SDK or CLI surface already exists for these flows?
- Which parts are fully implemented versus still architectural direction?
If the package cannot answer one of those clearly, the docs are not ready yet.
Current Recommendation
Use this starter pack as the top-level handoff document for app-platform work until a richer external developer portal exists.
The next documentation step after this should be the canonical manifest schema and registration API/import contract, written once that contract is explicit enough to support real third-party onboarding.