Platform IAM Model v1
Purpose
Define the platform IAM model in terms of:
- what the platform already implements,
- what should be treated as the canonical long-term model,
- what must still be built or modified to avoid hardening the wrong assumptions.
This document is intentionally not a Keycloak design doc. In GPUaaS, Keycloak is an authentication and federation component, not the authoritative product IAM model.
Core Design Rule
Model IAM in three dimensions:
- resource hierarchy
- subject model
- scoped role bindings over capability bundles
Do not start from a large catalog of named roles. Default roles are product bundles built on top of capability families.
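As a rough illustration of the three dimensions, a role binding can be thought of as a (subject, role bundle, scope) triple. The sketch below is not the platform's actual schema; the class and field names (`Scope`, `RoleBinding`, `resource_id`, and so on) are assumptions chosen for readability.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ScopeLevel(Enum):
    PLATFORM = "platform"
    TENANT = "tenant"
    PROJECT = "project"


class SubjectType(Enum):
    HUMAN = "human"
    SERVICE_ACCOUNT = "service_account"
    GROUP = "group"


@dataclass(frozen=True)
class Scope:
    """A point in the resource hierarchy where authority is bound."""
    level: ScopeLevel
    resource_id: Optional[str] = None  # hypothetical: tenant or project id

    def __post_init__(self):
        # Platform scope has no resource id; tenant/project scopes need one.
        if (self.level is ScopeLevel.PLATFORM) != (self.resource_id is None):
            raise ValueError("resource_id is required for tenant/project scope only")


@dataclass(frozen=True)
class RoleBinding:
    """Attaches a subject to a named capability bundle at one scope."""
    subject_type: SubjectType
    subject_id: str
    role: str  # name of a role bundle, e.g. "project_operator"
    scope: Scope


binding = RoleBinding(
    subject_type=SubjectType.SERVICE_ACCOUNT,
    subject_id="sa-1",
    role="project_operator",
    scope=Scope(ScopeLevel.PROJECT, "proj-1"),
)
```

Keeping the role as a plain name here reflects the rule above: the role is a label for a bundle, and the capabilities behind it can change without changing the binding shape.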
Boundary: What Keycloak Does vs What Platform IAM Owns
Keycloak in current GPUaaS
Keycloak is currently responsible for:
- OIDC login and auth-code exchange
- refresh-token exchange
- logout / token revocation
- JWT issuance for browser/API sessions
- JWKS publication for token validation
- identity federation entry point for OIDC/SAML-style flows
Keycloak is not the authoritative source for:
- tenant/project membership
- tenant/project/platform scoped role bindings
- service-account ownership and platform authorization
- scoped audit visibility
- project/tenant governance semantics
The platform database is the product IAM authority.
Canonical Objects
1. Principals
Canonical product actors.
Principal types:
- human
- service_account
- group
Future:
- external group references
- workload identities if they become first-class beyond service accounts
2. External Identity Bindings
Authentication anchors attached to principals.
Examples:
- OIDC issuer + subject
- local password credential
- tenant federation provider binding
These are authn bindings, not authorization truth.
3. Memberships
Memberships place a principal into tenant/project scope.
Current conceptual model:
- tenant membership
- project membership
Memberships are the current authorization root for tenant/project access.
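The split between authentication bindings and membership-based authorization can be sketched as follows. This is a minimal illustration under stated assumptions: the in-memory dictionaries stand in for platform database tables, and `VerifiedToken`, `IDENTITY_BINDINGS`, and the example issuer URL are hypothetical names, not existing platform code. The point is only that any roles carried in the token are not consulted.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class VerifiedToken:
    """Claims from a JWT whose signature was already checked against JWKS."""
    issuer: str
    subject: str
    token_roles: Tuple[str, ...] = ()  # deliberately ignored for authorization


# Hypothetical in-memory stand-ins for platform database tables.
IDENTITY_BINDINGS = {
    ("https://idp.example.test/realms/gpuaas", "kc-user-1"): "principal-1",
}
TENANT_MEMBERSHIPS = {("principal-1", "tenant-a")}


def resolve_principal(token: VerifiedToken) -> Optional[str]:
    """Map an (issuer, subject) authn anchor to a platform principal id."""
    return IDENTITY_BINDINGS.get((token.issuer, token.subject))


def is_tenant_member(token: VerifiedToken, tenant_id: str) -> bool:
    """Authorization comes from platform memberships, not from token claims."""
    principal_id = resolve_principal(token)
    if principal_id is None:
        return False
    return (principal_id, tenant_id) in TENANT_MEMBERSHIPS
```

A token that carries an `admin` claim but belongs to a principal with no membership in a tenant is still refused for that tenant, which is the behaviour the boundary above calls for.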
4. Role Bindings
Role bindings attach a subject to a role bundle at a scope.
Scope hierarchy:
- platform
- tenant
- project
5. Role Bundles / Capability Sets
Roles should be treated as named bundles over capabilities, not the base model.
Examples of capability families:
- iam.*
- billing.*
- ops.*
- project.*
- resource.*
- audit.*
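One way to read "roles as bundles over capabilities" is that a role name maps to a set of capability patterns, and a check matches a concrete capability against those patterns. The sketch below uses Python's standard `fnmatch.fnmatchcase` for the wildcard match. The bundle contents and the concrete capability names (such as `billing.invoice.read`) are assumptions for illustration; the source only names the families.

```python
from fnmatch import fnmatchcase

# Hypothetical bundle contents; only the family prefixes come from the model.
ROLE_BUNDLES = {
    "tenant_billing_admin": {"billing.*"},
    "tenant_iam_admin": {"iam.*"},
    "tenant_viewer": {"iam.read", "billing.read", "project.read"},
}


def role_allows(role: str, capability: str) -> bool:
    """True if any pattern in the role's bundle matches the capability."""
    patterns = ROLE_BUNDLES.get(role, set())
    return any(fnmatchcase(capability, pattern) for pattern in patterns)
```

With this shape, adding a new default role means adding one entry to the bundle table rather than extending the permission grammar.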
6. Invitations
Invitation flows should be first-class IAM objects, not implicit user creation side effects.
Examples:
- tenant invite
- project invite
- tenant-admin invite
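A first-class invitation could look roughly like the sketch below: an explicit object with its own state, whose acceptance yields the binding to create instead of creating a user as a side effect. All field names, the status values, and the `accept` helper are assumptions, not an existing platform API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Invitation:
    email: str
    scope_level: str  # "tenant" or "project"
    scope_id: str
    role: str         # role bundle granted on acceptance
    expires_at: datetime
    status: str = "pending"  # pending -> accepted | expired


def accept(invite: Invitation, principal_id: str, now: datetime):
    """Accept a pending invitation and return the role binding to persist."""
    if invite.status != "pending":
        raise ValueError("invitation is not pending")
    if now >= invite.expires_at:
        invite.status = "expired"
        raise ValueError("invitation has expired")
    invite.status = "accepted"
    return (principal_id, invite.role, invite.scope_level, invite.scope_id)


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
invite = Invitation(
    email="user@example.test",
    scope_level="tenant",
    scope_id="tenant-a",
    role="tenant_viewer",
    expires_at=now + timedelta(days=7),
)
```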
7. Integration References
External tenant-owned systems should appear as integrations, not as platform-owned user stores.
Examples:
- tenant Kubernetes cluster integration
- tenant database integration
- tenant external IdP configuration
The platform may store integration metadata and delegated credential references, but it should not mirror the external system's full user/role model.
Resource Hierarchy
Canonical hierarchy:
- platform
- tenant
- project
This hierarchy defines where authority is bound.
Important rule:
Hierarchy does not imply universal content visibility.
Example:
- a tenant_admin may be allowed to create/delete projects
- that does not automatically mean they can inspect all data/content inside every child project
Management rights and content visibility must remain separable.
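The separation can be sketched as an evaluation in which a tenant-scoped binding does apply to child projects, but only for the capabilities its bundle actually contains. The bundle contents, the capability names, and the `PROJECT_TENANT` lookup below are hypothetical; the sketch assumes content visibility is modelled as its own capability that `tenant_admin` does not hold by default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    principal_id: str
    role: str
    scope_level: str  # "tenant" or "project" in this sketch
    scope_id: str


# Hypothetical bundles: tenant_admin carries management capabilities only.
BUNDLES = {
    "tenant_admin": {"project.create", "project.delete"},
    "project_member": {"resource.content.read"},
}

# Hypothetical project -> parent tenant lookup.
PROJECT_TENANT = {"proj-1": "tenant-a", "proj-2": "tenant-a"}


def allowed(bindings, principal_id, capability, project_id):
    """True if a binding on the project or its parent tenant grants the capability."""
    tenant_id = PROJECT_TENANT[project_id]
    for b in bindings:
        if b.principal_id != principal_id:
            continue
        covers_project = (b.scope_level, b.scope_id) in {
            ("project", project_id),
            ("tenant", tenant_id),
        }
        if covers_project and capability in BUNDLES.get(b.role, set()):
            return True
    return False


bindings = [Binding("p1", "tenant_admin", "tenant", "tenant-a")]
```

Here the tenant admin can delete `proj-1` but cannot read its content until a project-scoped binding that includes a content capability is added, and that binding does not leak to `proj-2`.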
Subject Model
Subjects are the principals or group-like identities that receive bindings.
Initial subject types:
- user
- service account
- group
The platform should not assume a single global human username namespace as the main identity boundary.
Safer model:
- immutable principal identity
- tenant membership as the real product access boundary
- project membership nested under tenant/project scope
Default Role Families
These are default product bundles, not the full permission grammar.
Platform scope
- platform_admin
- platform_ops
- platform_viewer
- platform_iam_admin
- platform_billing_admin
Tenant scope
- tenant_owner
- tenant_admin
- tenant_ops
- tenant_viewer
- tenant_iam_admin
- tenant_billing_admin
Project scope
- project_owner
- project_admin
- project_operator
- project_member
- project_viewer
Important rule:
These defaults should be built from capability bundles and kept small. The platform should not expose an AWS-style explosion of role labels as the primary mental model.
Capability Separation Rules
The model must support these separations:
- read vs mutate
- management rights vs content visibility
- IAM authority vs billing authority vs ops authority
- tenant-wide governance vs project-local authoring
Examples:
- platform_ops may investigate incidents and use admin read surfaces without having full IAM mutation rights.
- tenant_admin may manage tenant users and projects without automatically seeing all project content.
- tenant_billing_admin may view or manage billing without holding general tenant IAM authority.
- project_admin may manage project members and service accounts without tenant-level user governance.
External System Identity Boundary
For tenant-owned infra systems such as Kubernetes or databases:
- if the external system has its own SSO or IAM model, that remains the tenant's responsibility
- GPUaaS may store an integration reference or delegated credential/configuration if needed
- GPUaaS should not try to become the canonical IAM model for tenant-owned external systems
So:
- platform IAM owns platform principals, memberships, and platform-managed identities
- tenant-owned infra IAM stays external
What Exists Today
Already implemented or partially implemented
- users
  - stores product users
  - includes oidc_issuer and oidc_subject
  - still has transitional role and org_id fields
- tenant_memberships
  - tenant-scoped membership baseline exists
- project_memberships
  - project-scoped membership baseline exists
- tenant_identity_providers
  - tenant OIDC/SAML provider config exists in schema
- tenant_federation_domain_bindings
  - tenant federation domain binding exists in schema
- auth_federation_states
  - provider/org-bound auth flow state exists
- role_definitions, role_definition_versions, platform_role_bindings, tenant_role_bindings, project_role_bindings
  - the platform already has the skeleton for a richer scoped role-binding model
- service_accounts
  - project-scoped service-account model exists today
  - tenant-owned shared runtimes still need a separate delegated machine-identity model
- scoped access-credential model
  - useful as a related platform primitive, but separate from IAM role design
Existing documented direction
Relevant docs already move in this direction:
- Role_and_Policy_Lifecycle_Model.md
- User_Onboarding_Model.md
- ADR-008-tenant-project-ownership-baseline.md
- ADR-010-tenant-federation-sso-model.md
Current Gaps / Mismatches
1. Global username uniqueness is still too strong
Current schema:
- users.username text not null unique
- partial MVP constraint on tenant_memberships(user_id) also still enforces single-tenant active membership
This is too restrictive for the intended tenant-scoped identity model.
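One possible direction, shown only as a sketch, is to scope the human-readable name to the tenant membership instead of to the global user row. The example below uses an in-memory SQLite database as a stand-in for the platform database. The `principals` table and the `handle` column are hypothetical, and the real schema and migration path may look quite different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE principals (
        id TEXT PRIMARY KEY              -- immutable principal identity
    );
    CREATE TABLE tenant_memberships (
        principal_id TEXT NOT NULL REFERENCES principals(id),
        tenant_id    TEXT NOT NULL,
        handle       TEXT NOT NULL,      -- hypothetical per-tenant display name
        PRIMARY KEY (principal_id, tenant_id),
        UNIQUE (tenant_id, handle)       -- unique within a tenant, not globally
    );
    """
)
conn.executemany("INSERT INTO principals VALUES (?)", [("p1",), ("p2",)])
conn.commit()


def add_membership(principal_id, tenant_id, handle):
    """Insert a membership; return False if a uniqueness rule rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO tenant_memberships VALUES (?, ?, ?)",
                (principal_id, tenant_id, handle),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Under these constraints the same handle can exist in two tenants and one principal can hold memberships in several tenants, while a duplicate handle inside a single tenant is still rejected.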
2. users.role is still transitional and too coarse
Current role field only supports:
- user
- admin
This is insufficient for:
- platform read-only admin visibility
- ops-only investigation
- tenant IAM admin
- tenant billing admin
- project admin/operator separation
3. Role-binding model exists but is not authoritative
The schema supports richer role bindings, but most runtime behavior still depends on:
- membership tables
- coarse users.role
- endpoint-specific assumptions
4. No first-class invitation model yet
IAM needs invitation and delegated onboarding as explicit objects/workflows.
5. No first-class group model yet
Groups are implied by future need but not implemented as platform IAM primitives.
6. No formal external identity binding object beyond current OIDC fields
Current oidc_issuer / oidc_subject fields work, but richer multi-provider identity binding will need a clearer model.
7. Audit visibility is not yet fully scope-aware
Scoped audit is required for:
- platform admin
- tenant admin
- project admin
- future cross-project sharing/grant visibility
What Must Be Built or Modified
Phase 1: clarify authority and remove bad assumptions
- document that platform DB, not Keycloak, is IAM authority
- make role-binding/capability model the target authority in docs
- keep Keycloak as auth/federation component only
Phase 2: fix identity scope assumptions
- remove or relax single-tenant active-membership constraint when multi-tenant user support is enabled
- revisit global username uniqueness
- separate principal identity from tenant membership more clearly in read/write paths
Phase 3: make scoped role bindings real
- move runtime authorization toward role bindings + capability evaluation
- reduce users.role to compatibility/read-model only
- introduce default role bundles for:
  - platform
  - tenant
  - project
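Reducing users.role to a read model could mean that the legacy value is derived from role bindings and never written or consulted as a source of truth. The sketch below illustrates that direction only. Which platform-scope bundles map to the legacy `admin` value is an assumption, as are the names `LEGACY_ADMIN_ROLES` and `legacy_role`.

```python
from typing import Iterable

# Assumption: only platform_admin maps onto the legacy "admin" value.
LEGACY_ADMIN_ROLES = frozenset({"platform_admin"})


def legacy_role(platform_roles: Iterable[str]) -> str:
    """Derive the coarse legacy users.role value from platform-scope bindings."""
    return "admin" if LEGACY_ADMIN_ROLES.intersection(platform_roles) else "user"
```

Older endpoints that still read the coarse field would then see a value that always agrees with the bindings, while new authorization checks evaluate the bindings directly.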
Phase 4: add missing IAM primitives
- invitations
- groups
- richer external identity bindings
- scoped audit presentation
- cross-project sharing/grants
Design Constraints To Preserve
- do not make Keycloak the canonical product user store
- do not require global human-readable username uniqueness as the long-term tenant model
- do not conflate admin page visibility with mutation authority
- do not assume parent-scope admin implies child-scope content visibility
- do not turn tenant-owned external infra IAM into platform-owned IAM
Recommended Next Docs
This document should be followed by:
- cross-project access/sharing model
- scoped audit model
- IAM API/resource contract slices
- UX IA alignment for platform/tenant/project admin modes
- delegated shared-runtime operator authz model for tenant-owned shared app runtimes