App Runtime Billing Model v1
Goal
Define the minimum billing contract for platform apps so GPUaaS can meter tenant-dedicated and future platform-managed runtimes without rewriting billing ownership later.
This is a baseline attribution model, not the final pricing engine.
Core Rules
- Billing ownership remains anchored to tenant and project context.
- App runtime billing must not bypass the immutable ledger model.
- Billing contracts must work for both tenant_dedicated and platform_managed operating modes.
- Control-plane footprint and workload consumption must be explainable separately when needed.
- Internal reference apps and third-party apps use the same attribution model.
Attribution Anchors
Every billable app-runtime record must be attributable by:
1. org_id
2. project_id
3. app_instance_id
4. app_slug
5. operating_mode
6. control_plane_scope
7. runtime_backend
8. correlation_id
These fields are required even when the underlying runtime exports richer scheduler-specific signals.
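As a minimal sketch of how the required anchors could be enforced, the snippet below models them as an immutable record that rejects missing values. The class name and the allowed-value sets are assumptions made for illustration; in particular, the control_plane_scope values are inferred from the scopes this document mentions and are not a confirmed enum.

```python
from dataclasses import dataclass, fields

# Assumed value sets, inferred from the modes and scopes named in this document.
OPERATING_MODES = {"tenant_dedicated", "platform_managed"}
CONTROL_PLANE_SCOPES = {"project", "tenant", "platform"}


@dataclass(frozen=True)
class AttributionAnchors:
    """Hypothetical container for the eight required attribution fields."""
    org_id: str
    project_id: str
    app_instance_id: str
    app_slug: str
    operating_mode: str
    control_plane_scope: str
    runtime_backend: str
    correlation_id: str

    def __post_init__(self):
        # Every anchor is required, so reject empty or non-string values.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"missing attribution anchor: {f.name}")
        if self.operating_mode not in OPERATING_MODES:
            raise ValueError(f"unknown operating_mode: {self.operating_mode}")
        if self.control_plane_scope not in CONTROL_PLANE_SCOPES:
            raise ValueError(
                f"unknown control_plane_scope: {self.control_plane_scope}"
            )
```

Richer scheduler-specific signals would travel alongside this record rather than replace any of its fields.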
Billing Shapes by Operating Mode
1. tenant_dedicated
Likely billable components:
1. dedicated control-plane footprint
2. tenant-bounded worker or compute capacity
3. runtime-specific usage signals such as jobs, pods, or serving uptime
Attribution behavior:
1. control-plane overhead may be billed to the owning project or distributed within the tenant by policy
2. worker usage remains project-attributed whenever the workload originated from a project-owned app instance
3. project-scoped control planes are the clean default for dev/test/stage/prod style environments
2. platform_managed
Likely billable components:
1. shared managed-service consumption
2. per-request, per-job, or per-runtime usage
3. service-tier or quota-based overhead allocation
Attribution behavior:
1. shared-service runtime cost must still resolve to project and tenant usage records
2. platform overhead allocation rules must be policy-driven, not embedded in app-specific code
Default Baseline Direction
The initial baseline should assume:
1. tenant_dedicated
2. control_plane_scope = project
3. project-owned app instances are the billing anchor for both runtime usage and any directly attached control-plane cost
This preserves clean environment-level attribution and avoids hidden cross-project subsidy in the initial model.
Tenant-scoped shared control planes are still supported, but cost-sharing must be explicit and policy-driven.
See:
- doc/architecture/App_Tenant_Shared_Attachment_Model_v1.md
Usage Record Direction
App runtime billing should eventually emit usage records that can be reconciled into ledger entries with:
1. usage_source = app_runtime
2. usage_unit appropriate to the runtime backend
3. app_instance_id
4. optional control_plane_component = true|false
Examples:
1. Slurm job runtime attributed to a project-owned app instance
2. model-serving uptime plus request volume
3. Ray cluster head/control overhead plus worker execution time
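The Ray example can be sketched as two usage records for one app instance, one flagged as control-plane overhead and one as workload. This is an illustration only: the record class, the helper function, and the unit names head_node_seconds and worker_seconds are hypothetical, since this document does not fix a schema or a unit vocabulary.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional


@dataclass(frozen=True)
class AppRuntimeUsageRecord:
    """Hypothetical usage record carrying the fields listed above."""
    app_instance_id: str
    usage_unit: str                    # unit names here are assumptions
    quantity: Decimal
    usage_source: str = "app_runtime"
    control_plane_component: Optional[bool] = None  # optional per the list


def ray_cluster_records(
    app_instance_id: str, head_seconds: int, worker_seconds: int
) -> List[AppRuntimeUsageRecord]:
    """Split one Ray cluster's usage into control-plane and workload records."""
    return [
        AppRuntimeUsageRecord(
            app_instance_id=app_instance_id,
            usage_unit="head_node_seconds",
            quantity=Decimal(head_seconds),
            control_plane_component=True,
        ),
        AppRuntimeUsageRecord(
            app_instance_id=app_instance_id,
            usage_unit="worker_seconds",
            quantity=Decimal(worker_seconds),
            control_plane_component=False,
        ),
    ]
```

Keeping the two components in separate records is what lets later reconciliation explain control-plane footprint and workload consumption independently.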
Separation of Concerns
Core platform is responsible for:
1. usage record schema and ledger integration
2. tenant/project ownership enforcement
3. auditability and reconciliation
4. policy-driven thresholds and entitlement limits
App operators are responsible for:
1. mapping runtime-native signals into the billing contract
2. identifying which runtime signals are billable
3. preserving correlation and project context in those signals
Policy Direction
Future policy overlays should be able to constrain:
1. whether control-plane overhead is billable
2. whether tenant-scoped shared-control costs may be distributed across projects
3. which runtime usage units are enabled for a given app
4. per-project cost ceilings or quotas for app instances
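One way to picture such an overlay is a small immutable policy object consulted before a usage record is rated. The sketch below is an assumption-laden illustration: the class, its field names, its defaults, and both helper functions are hypothetical and do not describe an existing policy engine.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class AppBillingPolicy:
    """Hypothetical overlay mirroring the four constraints listed above."""
    control_plane_billable: bool = True
    allow_shared_control_cost_distribution: bool = False
    enabled_usage_units: FrozenSet[str] = frozenset()
    project_cost_ceiling: Optional[Decimal] = None  # None means no ceiling


def is_billable(
    policy: AppBillingPolicy, usage_unit: str, control_plane_component: bool
) -> bool:
    """Return True when policy allows this usage to be billed."""
    if usage_unit not in policy.enabled_usage_units:
        return False
    if control_plane_component and not policy.control_plane_billable:
        return False
    return True


def exceeds_ceiling(policy: AppBillingPolicy, project_spend: Decimal) -> bool:
    """Return True when a project's spend is above its configured ceiling."""
    ceiling = policy.project_cost_ceiling
    return ceiling is not None and project_spend > ceiling
```

Because the decision lives in the policy object, app-specific code never needs to embed allocation rules of its own.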
Reconciliation Requirements
Billing correctness for platform apps must support:
1. timeline reconstruction by correlation_id
2. separation of control-plane and workload consumption where applicable
3. deterministic attribution to tenant and project
4. audit-safe corrections through new ledger entries only
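These four requirements can be illustrated with a toy in-memory append-only ledger. It is a sketch under stated assumptions, not the platform's ledger: the class names, the method names, and the choice to express a correction as a signed delta entry are all hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LedgerEntry:
    """Immutable entry; corrections reference the entry they adjust."""
    entry_id: int
    correlation_id: str
    org_id: str
    project_id: str
    amount: Decimal
    control_plane_component: bool
    corrects_entry_id: Optional[int] = None


class AppendOnlyLedger:
    def __init__(self) -> None:
        self._entries: List[LedgerEntry] = []

    def append(self, correlation_id, org_id, project_id, amount,
               control_plane_component, corrects_entry_id=None) -> LedgerEntry:
        entry = LedgerEntry(
            len(self._entries) + 1, correlation_id, org_id, project_id,
            Decimal(amount), control_plane_component, corrects_entry_id,
        )
        self._entries.append(entry)
        return entry

    def adjust(self, entry_id: int, delta) -> LedgerEntry:
        """Correct an entry by appending a delta; the original is untouched."""
        original = self._entries[entry_id - 1]
        return self.append(
            original.correlation_id, original.org_id, original.project_id,
            delta, original.control_plane_component,
            corrects_entry_id=original.entry_id,
        )

    def timeline(self, correlation_id: str) -> List[LedgerEntry]:
        """Entries for one correlation_id, in the order they were appended."""
        return [e for e in self._entries if e.correlation_id == correlation_id]

    def totals(self, correlation_id: str) -> Dict[str, Decimal]:
        """Net amounts split into control-plane and workload consumption."""
        sums = {"control_plane": Decimal(0), "workload": Decimal(0)}
        for e in self.timeline(correlation_id):
            key = "control_plane" if e.control_plane_component else "workload"
            sums[key] += e.amount
        return sums
```

A correction inherits the tenant, project, and correlation context of the entry it adjusts, so attribution stays deterministic after the fix.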
Examples
Slurm
- project-scoped Slurm control plane in dev
  - control-plane cost attributed to that project
  - job runtime attributed to same project
- tenant-scoped shared Slurm
  - tenant-owned controller and tenant-reserved capacity are charged to the tenant-shared runtime owner record
  - project-contributed worker capacity remains charged to the contributing source project
  - submitted jobs should remain attributable to the submitting project when the scheduler/runtime emits that signal
First Tenant-Shared Billing Rule
For the first productized tenant-shared scheduler flow:
1. controller/control-plane allocations are billed to the tenant-owned shared runtime owner record
2. worker allocations contributed through an attached project remain billed to that source project
3. worker contribution must therefore preserve:
   - source_project_id
   - attachment_id
   - allocation_id
4. any later cross-project redistribution is a reporting/policy layer, not a rewrite of raw usage attribution
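The rule above can be sketched as a small resolver that maps a raw allocation to the party billed for it. The Allocation type, the role labels controller and contributed_worker, and the billed_owner function are hypothetical names introduced for illustration; only source_project_id, attachment_id, and allocation_id come from the rule itself.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Allocation:
    """Hypothetical allocation record for a tenant-shared scheduler."""
    allocation_id: str
    role: str                              # "controller" or "contributed_worker"
    source_project_id: Optional[str] = None
    attachment_id: Optional[str] = None


def billed_owner(
    allocation: Allocation, shared_runtime_owner_id: str
) -> Tuple[str, str]:
    """Return (owner_kind, owner_id) for the raw usage attribution."""
    if allocation.role == "controller":
        # Controller/control-plane cost goes to the tenant-owned shared record.
        return ("tenant_shared_runtime_owner", shared_runtime_owner_id)
    if allocation.role == "contributed_worker":
        # Contributed workers must keep their source project and attachment.
        if not allocation.source_project_id or not allocation.attachment_id:
            raise ValueError(
                f"allocation {allocation.allocation_id} lost its "
                "source_project_id or attachment_id"
            )
        return ("project", allocation.source_project_id)
    raise ValueError(f"unknown allocation role: {allocation.role}")
```

Any cross-project redistribution would be computed from these raw results in a reporting layer, leaving the attribution returned here unchanged.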
Model Serving
- tenant-dedicated private model serving
  - serving instance uptime attributed to owning project
  - request usage attributed to same project unless tenant policy says otherwise
- platform-managed inference tier
  - request usage attributed to project
  - shared platform overhead allocation defined by service tier policy
Non-Negotiable Invariants
- App billing must not invent a separate balance model.
- App billing must not require direct DB writes from app runtimes outside public contracts/events.
- Shared-service cost allocation must be explicit and explainable.
- Project remains the primary usage attribution anchor even when runtime control planes are tenant- or platform-scoped.
Related Docs
- doc/architecture/App_Control_Plane_v1.md
- doc/architecture/App_Runtime_Operating_Modes_v1.md
- doc/architecture/Scheduler_as_Platform_App_v1.md
- doc/architecture/State_Machines.md
- doc/architecture/App_Tenant_Shared_Attachment_Model_v1.md