Scheduler as Platform App v1
Goal
Define a sustainable baseline for delivering scheduler capabilities (starting with Slurm) as platform apps, without coupling core allocation APIs to one scheduler implementation.
This document is the implementation baseline for:
1. Internal reference apps (for example Slurm).
2. Future first-party apps.
3. Third-party/team-owned apps built on the same control plane contracts.
Concrete first-adapter companion:
- doc/architecture/Slurm_App_Runtime_Adapter_v1.md
- doc/architecture/Clustered_App_Model_v1.md
- doc/architecture/App_Platform_Primitive_Boundary_v1.md
Decision Summary
- Schedulers are modeled as apps in App Catalog and instantiated per project/tenant policy.
- Core platform keeps scheduler-agnostic primitives only (identity, policy, audit, events, tenancy boundaries).
- Scheduler-specific logic (Slurm/K8s/Ray internals) lives in app operator runtime, not in core allocation handlers.
- Authorization remains permission-key based; role labels may evolve without handler rewrites.
- Initial production operating mode is tenant_dedicated; shared managed scheduler offerings are explicitly a later mode.
For product language, the scheduler family should expose:
1. project-scoped mode
2. tenant-owned shared mode
3. platform-managed shared mode (later)
Current mapping:
1. project-scoped mode -> tenant_dedicated + project
2. tenant-owned shared mode -> tenant_dedicated + tenant (target)
3. platform-managed shared mode -> platform_managed + platform
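The mapping above can be written down as a small lookup table. This is a minimal sketch, not the platform's implementation: the product-mode keys (`project_scoped`, `tenant_owned_shared`, `platform_managed_shared`), the `ModeMapping` type, and the `status` field are hypothetical names introduced for illustration. Only the `operating_mode` and `control_plane_scope` values come from this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeMapping:
    operating_mode: str       # value from the mapping above
    control_plane_scope: str  # value from the mapping above
    status: str               # hypothetical: "current", "target", or "later"


# Keys are hypothetical product-mode identifiers (assumption).
PRODUCT_MODES = {
    "project_scoped": ModeMapping("tenant_dedicated", "project", "current"),
    "tenant_owned_shared": ModeMapping("tenant_dedicated", "tenant", "target"),
    "platform_managed_shared": ModeMapping("platform_managed", "platform", "later"),
}


def resolve_mode(product_mode: str) -> ModeMapping:
    """Translate a product-facing mode into operating mode and scope."""
    try:
        return PRODUCT_MODES[product_mode]
    except KeyError:
        raise ValueError(f"unknown product mode: {product_mode!r}") from None
```

Keeping the translation in one table means product language can change without touching the values the control plane stores.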
Important limitation:
- the current app-instance contract is still project-owned
- so tenant-owned shared mode is a product target that still needs an explicit attachment/ownership model
- see doc/architecture/App_Tenant_Shared_Attachment_Model_v1.md
Scope
In scope:
1. Responsibility boundary.
2. IAM/policy requirements.
3. Artifact/source model (platform-shared + tenant-scoped).
4. Events and observability requirements.
5. Slurm pilot acceptance criteria and gap capture.
6. Operating-mode expectations for scheduler backends.

Out of scope:
1. Slurm internals (controller tuning, partition strategy).
2. MaaS/hardware provisioning implementation details.
3. UI implementation details.
Core vs App Responsibilities
| Area | Core platform (must provide) | Scheduler app/operator (must provide) |
|---|---|---|
| Identity | User auth, service account auth, project context enforcement | None (consume core identities only) |
| IAM/Authz | Permission evaluation, role bindings, policy overlays, audit logs | Declare required actions and call only allowed endpoints |
| API contracts | Stable control-plane endpoints, canonical error envelope | Adapter/operator APIs behind app runtime boundary |
| Lifecycle | App instance lifecycle (requested -> running/failed) | Scheduler deployment/upgrade/rollback mechanics |
| Events | Typed domain events + correlation propagation | Consume/emit app lifecycle events and runtime status |
| Tenancy | Project/tenant ownership and boundary checks | Never bypass project boundary; include context in all operations |
| Billing hooks | Usage attribution primitives by tenant/project | Scheduler usage metrics mapping (jobs/queues -> billable units) |
Rule: if functionality requires scheduler-specific branching inside core handlers, treat it as a platform defect and move it behind the app/operator boundary.
Required IAM Model
Use action keys, not role-name checks, in handlers.
Baseline action families
- scheduler.catalog.read
- scheduler.instance.read
- scheduler.instance.create
- scheduler.instance.update
- scheduler.instance.delete
- scheduler.queue.submit
- scheduler.queue.read
- scheduler.queue.cancel
- scheduler.node.read
- scheduler.node.operate (drain/cordon/uncordon/label)

Scope rules
- Tenant/project resources must enforce project ownership on every mutation.
- Service accounts are same-project only.
- Platform break-glass is allowed only on explicit admin endpoints and always audited.
- Role display labels (project_member, project_admin, etc.) are UI concerns; permission keys are the enforcement contract.
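A handler-side check that follows these rules might look like the sketch below. It is illustrative only: `Principal`, `is_allowed`, and the shape of `grants` (a per-project set of action keys, assumed to be resolved from role bindings elsewhere) are hypothetical, and break-glass admin endpoints are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class Principal:
    subject: str
    kind: str  # "user" or "service_account" (hypothetical values)
    home_project_id: Optional[str] = None  # set for service accounts
    # project_id -> granted action keys; assumed to be resolved upstream
    grants: Dict[str, FrozenSet[str]] = field(default_factory=dict)
    role_label: str = ""  # display only; never consulted for enforcement


def is_allowed(principal: Principal, action: str, resource_project_id: str) -> bool:
    """Decide on action keys and project ownership, never on role labels."""
    # Service accounts are same-project only, regardless of grants.
    if (principal.kind == "service_account"
            and principal.home_project_id != resource_project_id):
        return False
    return action in principal.grants.get(resource_project_id, frozenset())
```

Because `role_label` is never read by `is_allowed`, renaming or restructuring roles changes only how grants are resolved, not the handler.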
Artifact and Registry Model
Both source tiers are first-class:
1. Platform-shared registries/artifact sources (blessed global sources).
2. Tenant-scoped allowlisted sources (private enterprise registries/buckets).

Policy behavior:
1. Global hard-deny is non-overridable.
2. Tenant/project overlays can only narrow access; they can never broaden it past a global deny.
3. Scheduler app deployment must resolve artifact sources through policy evaluation, not hardcoded host lists.

Direction:
1. Keep API neutral for source type (OCI and non-OCI blob/object sources).
2. Credential delivery remains short-lived and task-scoped.
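The policy behavior above can be sketched as a single evaluation function. This is a simplified illustration under stated assumptions: sources are compared as exact strings, the function and parameter names are hypothetical, and a real evaluator would likely match on registry host or bucket prefix instead.

```python
from typing import FrozenSet, Optional


def is_source_allowed(
    source: str,
    global_deny: FrozenSet[str],
    platform_shared: FrozenSet[str],
    tenant_allowlist: FrozenSet[str],
    overlay_allow: Optional[FrozenSet[str]] = None,
) -> bool:
    """Evaluate one artifact source against layered policy."""
    # 1. Global hard-deny wins over everything else.
    if source in global_deny:
        return False
    # 2. Both source tiers are first-class candidates.
    allowed = platform_shared | tenant_allowlist
    # 3. An overlay is applied by intersection, so it can only narrow.
    if overlay_allow is not None:
        allowed = allowed & overlay_allow
    return source in allowed
```

Applying the overlay as a set intersection is what makes "narrow, never broaden" hold by construction: an overlay entry that is not already allowed has no effect.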
API Contract Direction
Scheduler app integration should use existing app-control-plane contracts:
1. Catalog/version publication for scheduler app entries.
2. Project entitlement enable/disable with policy overlays.
3. App instance create/read/delete for scheduler control-plane instances.
Required effective instance metadata:
1. operating_mode
2. control_plane_scope
3. runtime_backend
Allocation API remains scheduler-agnostic:
1. allocations.scheduler_type selects adapter path.
2. Scheduler references/metadata are stored as integration metadata.
3. Core allocation handlers do not embed Slurm-specific request semantics.
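One way to keep the allocation handler scheduler-agnostic is an adapter registry keyed by `scheduler_type`. The sketch below is an assumption about shape, not the actual contract: `SchedulerAdapter`, `SlurmAdapter`, `ADAPTERS`, `handle_allocation`, and the `scheduler_ref` and `integration_metadata` field names are all hypothetical.

```python
from typing import Any, Dict, Protocol


class SchedulerAdapter(Protocol):
    """Hypothetical adapter interface that the core handler depends on."""

    def submit(self, allocation: Dict[str, Any]) -> Dict[str, Any]: ...


class SlurmAdapter:
    def submit(self, allocation: Dict[str, Any]) -> Dict[str, Any]:
        # Scheduler-specific translation lives here, behind the boundary.
        return {"scheduler_ref": f"slurm-{allocation['id']}"}


# Registry keyed by allocations.scheduler_type (contents are illustrative).
ADAPTERS: Dict[str, SchedulerAdapter] = {"slurm": SlurmAdapter()}


def handle_allocation(allocation: Dict[str, Any]) -> Dict[str, Any]:
    """Core handler: selects an adapter, contains no scheduler branches."""
    scheduler_type = allocation["scheduler_type"]
    adapter = ADAPTERS.get(scheduler_type)
    if adapter is None:
        raise ValueError(f"unsupported scheduler_type: {scheduler_type!r}")
    # Whatever the adapter returns is stored opaquely as integration metadata.
    return {**allocation, "integration_metadata": adapter.submit(allocation)}
```

Adding a second scheduler then means registering another adapter; `handle_allocation` itself does not change.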
Clustered scheduler/operator apps must also follow the generic clustered-app model:
1. topology is app-level and tenant/project-admin controlled
2. physical node selection remains platform-owned
3. logical roles and mutable member lifecycle must not leak internal host-role assumptions into the public API
Event and Observability Contract
Minimum required event flow:
1. apps.instance.requested
2. apps.instance.running
3. apps.instance.failed
4. apps.instance.deleting
5. apps.instance.deleted
Every scheduler app operation must include:
1. correlation_id
2. org_id
3. project_id
4. app_slug
5. app_instance_id (where applicable)
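The event types and context fields above could be enforced at emit time with a small builder. This is a minimal sketch: the envelope layout, the `type` and `occurred_at` field names, and `build_event` itself are assumptions. Only the event names and the context field names come from this document.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional

LIFECYCLE_EVENTS = (
    "apps.instance.requested",
    "apps.instance.running",
    "apps.instance.failed",
    "apps.instance.deleting",
    "apps.instance.deleted",
)
REQUIRED_CONTEXT = ("correlation_id", "org_id", "project_id", "app_slug")


def build_event(
    event_type: str,
    context: Dict[str, str],
    app_instance_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a lifecycle event, rejecting incomplete context."""
    if event_type not in LIFECYCLE_EVENTS:
        raise ValueError(f"unknown event type: {event_type!r}")
    missing = [key for key in REQUIRED_CONTEXT if not context.get(key)]
    if missing:
        raise ValueError(f"missing context fields: {missing}")
    event: Dict[str, Any] = {
        "type": event_type,
        # Hypothetical timestamp field, ISO 8601 in UTC.
        "occurred_at": datetime.now(timezone.utc).isoformat(),
    }
    event.update({key: context[key] for key in REQUIRED_CONTEXT})
    if app_instance_id is not None:  # included only where applicable
        event["app_instance_id"] = app_instance_id
    return event
```

Failing fast on missing context keeps events without a `correlation_id` from reaching the bus.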
Triage path:
1. API/UI error envelope -> correlation_id
2. Loki lookup by correlation_id
3. Tempo trace lookup by trace_id
4. Event timeline reconstruction from apps.instance.*
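The last triage step, timeline reconstruction, can be sketched over events that have already been fetched. The Loki and Tempo lookups are out of scope here. The `type` and `occurred_at` field names are hypothetical, and timestamps are assumed to be ISO 8601 UTC strings in one consistent format so that string order matches time order.

```python
from typing import Any, Dict, Iterable, List


def reconstruct_timeline(
    events: Iterable[Dict[str, Any]], correlation_id: str
) -> List[Dict[str, Any]]:
    """Return apps.instance.* events for one correlation_id, oldest first."""
    matching = [
        event
        for event in events
        if event.get("correlation_id") == correlation_id
        and str(event.get("type", "")).startswith("apps.instance.")
    ]
    # Assumes uniformly formatted ISO 8601 UTC timestamps (see lead-in).
    return sorted(matching, key=lambda event: event["occurred_at"])
```

A timeline that ends in `apps.instance.requested`, with no later `running` or `failed` event, points at the operator runtime rather than the API layer.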
Slurm Pilot (Reference App)
Use Slurm as the first reference app to validate baseline completeness.
Initial operating-mode target:
1. tenant_dedicated
2. control_plane_scope = project | tenant depending on org policy and environment boundaries
3. project-owned app instances may attach to a project-scoped control plane for dev/test/stage/prod isolation or a tenant-scoped control plane for shared tenant schedulers
Pilot phases
- Register Slurm in app catalog + publish version.
- Enable entitlement for test project(s).
- Create Slurm app instance via app instance API.
- Validate scheduler queue operations through permissioned endpoints.
- Run upgrade/rollback and delete flows.
Lab baseline:
1. reference control-stack assets are deployed on dev-lab-1
2. worker-side join materials are deployed on dev-gpu-1
3. see doc/operations/Slurm_Reference_Lab_Stack.md
Required acceptance criteria
- No Slurm-specific branches in core allocation handlers.
- All privileged actions produce audit logs with correlation_id.
- Service account operators can manage only same-project scheduler instances.
- Policy overlays correctly restrict regions/SKUs/artifact sources.
- Full incident path is traceable across logs, traces, and events.
Gap log template (capture during pilot)
- Missing primitive in core.
- Leaky scheduler-specific coupling in core.
- Missing policy key or permission action.
- Missing event for operational triage.
- Missing billing attribution hook.
Baseline for Any Future App Team
An app is ready for onboarding only if all are true:
1. Uses app catalog + entitlement + app instance contracts (no hidden DB coupling).
2. Uses service accounts for operator automation.
3. Passes tenant/project boundary checks under negative tests.
4. Emits required lifecycle events with correlation context.
5. Supports policy-governed artifact sources.
6. Defines upgrade, rollback, and delete behavior.
7. Provides a support runbook with correlation-first triage steps.
Non-Negotiable Invariants
- Internal and external apps use the same contracts.
- No authz bypass for internal reference apps.
- No runtime hard dependency on one scheduler vendor in control-plane API semantics.
- No direct DB writes by app operators outside public contracts/events.
Related Docs
- doc/architecture/App_Control_Plane_v1.md
- doc/architecture/Clustered_App_Model_v1.md
- doc/architecture/Service_Account_Model.md
- doc/architecture/Role_and_Policy_Lifecycle_Model.md
- doc/architecture/Allocation_Node_Placement_v1.md
- doc/product/GPUaaS_vs_Armada_Bridge_Gap_Matrix.md
- doc/architecture/App_Runtime_Operating_Modes_v1.md
- doc/architecture/App_Tenant_Shared_Attachment_Model_v1.md