App Runtime Billing Model v1

Goal

Define the minimum billing contract for platform apps so GPUaaS can meter tenant-dedicated and future platform-managed runtimes without rewriting billing ownership later.

This is a baseline attribution model, not the final pricing engine.

Core Rules

  1. Billing ownership remains anchored to tenant and project context.
  2. App runtime billing must not bypass the immutable ledger model.
  3. Billing contracts must work for both tenant_dedicated and platform_managed operating modes.
  4. Control-plane footprint and workload consumption must be explainable separately when needed.
  5. Internal reference apps and third-party apps use the same attribution model.

Attribution Anchors

Every billable app-runtime record must be attributable by:

  1. org_id
  2. project_id
  3. app_instance_id
  4. app_slug
  5. operating_mode
  6. control_plane_scope
  7. runtime_backend
  8. correlation_id

These fields are required even when the underlying runtime exports richer scheduler-specific signals.
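As a minimal sketch of the anchor set, the record below carries the eight required fields and rejects any that are empty. The field names come from the list above; the class name `AttributionAnchors`, the validation behaviour, and the sample values are illustrative assumptions, not part of the contract.

```python
from dataclasses import dataclass, fields

# Operating modes named in this document.
OPERATING_MODES = {"tenant_dedicated", "platform_managed"}


@dataclass(frozen=True)
class AttributionAnchors:
    """Hypothetical container for the required attribution anchors."""

    org_id: str
    project_id: str
    app_instance_id: str
    app_slug: str
    operating_mode: str
    control_plane_scope: str  # e.g. "project"; other values are assumed
    runtime_backend: str      # e.g. "slurm"; value format is assumed
    correlation_id: str

    def __post_init__(self) -> None:
        # Every anchor is mandatory, so empty values are rejected up front.
        missing = [f.name for f in fields(self) if not getattr(self, f.name)]
        if missing:
            raise ValueError(f"missing attribution anchors: {missing}")
        if self.operating_mode not in OPERATING_MODES:
            raise ValueError(f"unknown operating_mode: {self.operating_mode}")


anchors = AttributionAnchors(
    org_id="org-1",
    project_id="proj-1",
    app_instance_id="inst-1",
    app_slug="slurm",
    operating_mode="tenant_dedicated",
    control_plane_scope="project",
    runtime_backend="slurm",
    correlation_id="corr-1",
)
```

Scheduler-specific signals can travel alongside such a record, but they would not replace any of these fields.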

Billing Shapes by Operating Mode

1. tenant_dedicated

Likely billable components:

  1. dedicated control-plane footprint
  2. tenant-bounded worker or compute capacity
  3. runtime-specific usage signals such as jobs, pods, or serving uptime

Attribution behavior:

  1. control-plane overhead may be billed to the owning project or distributed within the tenant by policy
  2. worker usage remains project-attributed whenever the workload originated from a project-owned app instance
  3. project-scoped control planes are the clean default for dev/test/stage/prod style environments

2. platform_managed

Likely billable components:

  1. shared managed-service consumption
  2. per-request, per-job, or per-runtime usage
  3. service-tier or quota-based overhead allocation

Attribution behavior:

  1. shared-service runtime cost must still resolve to project and tenant usage records
  2. platform overhead allocation rules must be policy-driven, not embedded in app-specific code
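Both operating modes allow overhead to be allocated across projects by policy. One possible allocation rule is a proportional split, sketched below. The function name `distribute_overhead`, the weight inputs, and the two-decimal rounding are assumptions for illustration; the document does not prescribe any particular allocation formula.

```python
from decimal import Decimal, ROUND_DOWN
from typing import Dict


def distribute_overhead(total: Decimal, weights: Dict[str, Decimal]) -> Dict[str, Decimal]:
    """Split an overhead amount across projects in proportion to policy weights.

    Shares are rounded down to two decimals; the last project in sorted
    order absorbs the rounding remainder so the shares always sum to total.
    """
    weight_sum = sum(weights.values(), Decimal("0"))
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")

    cent = Decimal("0.01")
    project_ids = sorted(weights)
    shares: Dict[str, Decimal] = {}
    allocated = Decimal("0")
    for project_id in project_ids[:-1]:
        share = (total * weights[project_id] / weight_sum).quantize(cent, rounding=ROUND_DOWN)
        shares[project_id] = share
        allocated += share
    shares[project_ids[-1]] = total - allocated
    return shares


shares = distribute_overhead(
    Decimal("10.00"),
    {"proj-a": Decimal("1"), "proj-b": Decimal("1"), "proj-c": Decimal("1")},
)
```

Keeping the weights as an input, rather than hard-coding them per app, is what keeps the rule policy-driven.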

Default Baseline Direction

Initial baseline should assume:

  1. tenant_dedicated
  2. control_plane_scope = project
  3. project-owned app instances are the billing anchor for both runtime usage and any directly attached control-plane cost

This preserves clean environment-level attribution and avoids hidden cross-project subsidy in the initial model.

Tenant-scoped shared control planes are still supported, but cost-sharing must be explicit and policy-driven. See doc/architecture/App_Tenant_Shared_Attachment_Model_v1.md.
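The baseline can be read as a set of defaults applied to an app instance. The sketch below assumes a plain dictionary representation; the helper `with_baseline` and the `runtime_billed_to` and `control_plane_billed_to` keys are hypothetical names introduced only for this example.

```python
# Default values taken from the baseline described in this section.
BASELINE_DEFAULTS = {
    "operating_mode": "tenant_dedicated",
    "control_plane_scope": "project",
}


def with_baseline(app_instance: dict) -> dict:
    """Return a copy of app_instance with baseline defaults for unset keys."""
    resolved = {**BASELINE_DEFAULTS, **app_instance}
    # Runtime usage is always anchored to the owning project.
    resolved["runtime_billed_to"] = resolved["project_id"]
    if resolved["control_plane_scope"] == "project":
        # Directly attached control-plane cost follows the same project.
        resolved["control_plane_billed_to"] = resolved["project_id"]
    else:
        # Shared scopes need an explicit cost-sharing policy, so no
        # implicit target is chosen here.
        resolved["control_plane_billed_to"] = None
    return resolved


dev_instance = with_baseline({"project_id": "proj-dev", "app_slug": "slurm"})
shared_instance = with_baseline({"project_id": "proj-a", "control_plane_scope": "tenant"})
```

Leaving the control-plane target unset for shared scopes is one way to avoid an accidental cross-project subsidy.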

Usage Record Direction

App runtime billing should eventually emit usage records that can be reconciled into ledger entries with:

  1. usage_source = app_runtime
  2. usage_unit appropriate to the runtime backend
  3. app_instance_id
  4. optional control_plane_component = true|false

Examples:

  1. Slurm job runtime attributed to a project-owned app instance
  2. model-serving uptime plus request volume
  3. Ray cluster head/control overhead plus worker execution time
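To make the record shape concrete, the following sketch builds two records for the Ray example: one for head/control overhead and one for worker execution time. The function `build_usage_record`, the `quantity` field, and the unit names `head_node_seconds` and `worker_seconds` are assumptions; only `usage_source`, `usage_unit`, `app_instance_id`, and `control_plane_component` come from the list above.

```python
from typing import Optional


def build_usage_record(
    *,
    org_id: str,
    project_id: str,
    app_instance_id: str,
    correlation_id: str,
    usage_unit: str,
    quantity: int,
    control_plane_component: Optional[bool] = None,
) -> dict:
    """Assemble a usage record in the shape described above (sketch only)."""
    record = {
        "usage_source": "app_runtime",
        "usage_unit": usage_unit,
        "quantity": quantity,  # assumed field; the document does not name it
        "org_id": org_id,
        "project_id": project_id,
        "app_instance_id": app_instance_id,
        "correlation_id": correlation_id,
    }
    # The control-plane flag is optional, so it is only set when known.
    if control_plane_component is not None:
        record["control_plane_component"] = control_plane_component
    return record


common = dict(
    org_id="org-1",
    project_id="proj-1",
    app_instance_id="ray-1",
    correlation_id="corr-42",
)
head_record = build_usage_record(
    usage_unit="head_node_seconds", quantity=3600, control_plane_component=True, **common
)
worker_record = build_usage_record(
    usage_unit="worker_seconds", quantity=7200, control_plane_component=False, **common
)
```

Emitting two records that share a correlation_id keeps control-plane overhead and workload consumption separable during reconciliation.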

Separation of Concerns

Core platform is responsible for:

  1. usage record schema and ledger integration
  2. tenant/project ownership enforcement
  3. auditability and reconciliation
  4. policy-driven thresholds and entitlement limits

App operators are responsible for:

  1. mapping runtime-native signals into the billing contract
  2. identifying which runtime signals are billable
  3. preserving correlation and project context in those signals

Policy Direction

Future policy overlays should be able to constrain:

  1. whether control-plane overhead is billable
  2. whether tenant-scoped shared-control costs may be distributed across projects
  3. which runtime usage units are enabled for a given app
  4. per-project cost ceilings or quotas for app instances
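One way to picture such an overlay is a small immutable policy object with one knob per constraint. Everything named here (`AppBillingPolicy`, its fields, `evaluate_usage`, and the reason strings) is hypothetical; the document only states which aspects a policy should be able to constrain.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class AppBillingPolicy:
    """Hypothetical policy overlay with one field per constraint above."""

    control_plane_billable: bool = True
    # Would be consumed by an allocation layer, not by evaluate_usage.
    allow_shared_control_distribution: bool = False
    enabled_usage_units: FrozenSet[str] = frozenset()
    project_cost_ceiling: Optional[Decimal] = None


def evaluate_usage(
    policy: AppBillingPolicy,
    *,
    usage_unit: str,
    control_plane_component: bool,
    projected_project_cost: Decimal,
) -> Tuple[bool, str]:
    """Return (billable, reason) for a single usage record under the policy."""
    if usage_unit not in policy.enabled_usage_units:
        return False, "usage_unit_not_enabled"
    if control_plane_component and not policy.control_plane_billable:
        return False, "control_plane_not_billable"
    ceiling = policy.project_cost_ceiling
    if ceiling is not None and projected_project_cost > ceiling:
        return False, "project_cost_ceiling_exceeded"
    return True, "ok"


policy = AppBillingPolicy(
    control_plane_billable=False,
    enabled_usage_units=frozenset({"job_seconds"}),
    project_cost_ceiling=Decimal("100.00"),
)
decision = evaluate_usage(
    policy,
    usage_unit="job_seconds",
    control_plane_component=False,
    projected_project_cost=Decimal("40.00"),
)
```

Because the policy is data rather than app-specific code, the same evaluation path could serve internal reference apps and third-party apps alike.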

Reconciliation Requirements

Billing correctness for platform apps must support:

  1. timeline reconstruction by correlation_id
  2. separation of control-plane and workload consumption where applicable
  3. deterministic attribution to tenant and project
  4. audit-safe corrections through new ledger entries only
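These requirements can be illustrated with an in-memory append-only ledger. This is a sketch under assumptions, not the platform's ledger: `LedgerEntry`, `AppendOnlyLedger`, the `corrects_entry_id` link, and the delta-based correction are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LedgerEntry:
    entry_id: str
    correlation_id: str
    org_id: str
    project_id: str
    amount: Decimal
    control_plane_component: bool
    corrects_entry_id: Optional[str] = None


class AppendOnlyLedger:
    """Entries are only ever appended; existing entries are never changed."""

    def __init__(self) -> None:
        self._entries: List[LedgerEntry] = []

    def append(self, correlation_id: str, org_id: str, project_id: str,
               amount: Decimal, control_plane_component: bool = False,
               corrects_entry_id: Optional[str] = None) -> LedgerEntry:
        entry = LedgerEntry(f"entry-{len(self._entries) + 1}", correlation_id,
                            org_id, project_id, amount,
                            control_plane_component, corrects_entry_id)
        self._entries.append(entry)
        return entry

    def correct(self, original: LedgerEntry, corrected_amount: Decimal) -> LedgerEntry:
        # A correction is a new entry carrying the delta, linked to the original.
        return self.append(original.correlation_id, original.org_id,
                           original.project_id,
                           corrected_amount - original.amount,
                           original.control_plane_component,
                           corrects_entry_id=original.entry_id)

    def timeline(self, correlation_id: str) -> List[LedgerEntry]:
        # Append order doubles as the timeline order in this sketch.
        return [e for e in self._entries if e.correlation_id == correlation_id]

    def totals(self, correlation_id: str) -> Dict[str, Decimal]:
        totals = {"control_plane": Decimal("0"), "workload": Decimal("0")}
        for entry in self.timeline(correlation_id):
            key = "control_plane" if entry.control_plane_component else "workload"
            totals[key] += entry.amount
        return totals


ledger = AppendOnlyLedger()
job = ledger.append("corr-1", "org-1", "proj-1", Decimal("10.00"))
ledger.append("corr-1", "org-1", "proj-1", Decimal("2.00"), control_plane_component=True)
fix = ledger.correct(job, Decimal("7.50"))
```

After the correction, the original entry is unchanged and the timeline for `corr-1` holds three entries whose workload total reflects the corrected amount.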

Examples

Slurm

  1. project-scoped Slurm control plane in dev
     - control-plane cost attributed to that project
     - job runtime attributed to same project
  2. tenant-scoped shared Slurm
     - tenant-owned controller and tenant-reserved capacity are charged to the tenant-shared runtime owner record
     - project-contributed worker capacity remains charged to the contributing source project
     - submitted jobs should remain attributable to the submitting project when the scheduler/runtime emits that signal

First Tenant-Shared Billing Rule

For the first productized tenant-shared scheduler flow:

  1. controller/control-plane allocations are billed to the tenant-owned shared runtime owner record
  2. worker allocations contributed through an attached project remain billed to that source project
  3. worker contribution must therefore preserve:
     - source_project_id
     - attachment_id
     - allocation_id
  4. any later cross-project redistribution is a reporting/policy layer, not a rewrite of raw usage attribution
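The rule above can be sketched as a small resolver that decides who is billed for an allocation. The `role` field, the `resolve_billed_owner` function, and the shape of its return value are assumptions for illustration; the three preserved identifiers come from the rule itself.

```python
REQUIRED_WORKER_FIELDS = ("source_project_id", "attachment_id", "allocation_id")


def resolve_billed_owner(allocation: dict, shared_runtime_owner_id: str) -> dict:
    """Resolve raw billing attribution for a tenant-shared scheduler allocation."""
    role = allocation.get("role")  # hypothetical discriminator field
    if role == "controller":
        # Controller/control-plane allocations go to the tenant-owned
        # shared runtime owner record.
        return {"billed_to": shared_runtime_owner_id, "owner_kind": "tenant_shared_runtime"}
    if role == "worker":
        missing = [name for name in REQUIRED_WORKER_FIELDS if not allocation.get(name)]
        if missing:
            raise ValueError(f"worker allocation is missing: {missing}")
        # Worker allocations stay with the contributing source project and
        # keep the identifiers needed for later reporting.
        return {
            "billed_to": allocation["source_project_id"],
            "owner_kind": "project",
            "attachment_id": allocation["attachment_id"],
            "allocation_id": allocation["allocation_id"],
        }
    raise ValueError(f"unknown allocation role: {role!r}")


controller_owner = resolve_billed_owner({"role": "controller"}, "shared-owner-1")
worker_owner = resolve_billed_owner(
    {
        "role": "worker",
        "source_project_id": "proj-a",
        "attachment_id": "att-1",
        "allocation_id": "alloc-9",
    },
    "shared-owner-1",
)
```

Any cross-project redistribution would then operate on these resolved records in a reporting layer, leaving the raw attribution untouched.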

Model Serving

  1. tenant-dedicated private model serving
     - serving instance uptime attributed to owning project
     - request usage attributed to same project unless tenant policy says otherwise
  2. platform-managed inference tier
     - request usage attributed to project
     - shared platform overhead allocation defined by service tier policy

Non-Negotiable Invariants

  1. App billing must not invent a separate balance model.
  2. App billing must not require direct DB writes from app runtimes outside public contracts/events.
  3. Shared-service cost allocation must be explicit and explainable.
  4. Project remains the primary usage attribution anchor even when runtime control planes are tenant- or platform-scoped.

References

  1. doc/architecture/App_Control_Plane_v1.md
  2. doc/architecture/App_Runtime_Operating_Modes_v1.md
  3. doc/architecture/Scheduler_as_Platform_App_v1.md
  4. doc/architecture/State_Machines.md
  5. doc/architecture/App_Tenant_Shared_Attachment_Model_v1.md