Platform IAM Model v1¶

Purpose¶

Define the platform IAM model in terms of:

what the platform already implements,
what should be treated as the canonical long-term model,
what must still be built or modified to avoid hardening the wrong assumptions.

This document is intentionally not a Keycloak design doc. In GPUaaS, Keycloak is an authentication and federation component, not the authoritative product IAM model.

Core Design Rule¶

Model IAM in three dimensions:

resource hierarchy
subject model
scoped role bindings over capability bundles

Do not start from a large catalog of named roles. Default roles are product bundles built on top of capability families.

Boundary: What Keycloak Does vs What Platform IAM Owns¶

Keycloak in current GPUaaS¶

Keycloak is currently responsible for:

OIDC login and auth-code exchange
refresh-token exchange
logout / token revocation
JWT issuance for browser/API sessions
JWKS publication for token validation
identity federation entry point for OIDC/SAML-style flows

Keycloak is not the authoritative source for:

tenant/project membership
tenant/project/platform scoped role bindings
service-account ownership and platform authorization
scoped audit visibility
project/tenant governance semantics

The platform database is the product IAM authority.
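
A minimal sketch of that boundary, under stated assumptions: the token has already been validated against the published JWKS, and the dictionaries below are hypothetical in-memory stand-ins for platform database tables. The names `VerifiedClaims`, `resolve_principal`, and `can_access_tenant` are illustrative and are not taken from the GPUaaS codebase. The token only identifies the caller; membership comes from platform data.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class VerifiedClaims:
    """Claims from a JWT whose signature was already checked against the JWKS."""
    iss: str
    sub: str
    # Roles may appear in the token; this sketch deliberately ignores them.
    realm_roles: Tuple[str, ...] = ()


# Hypothetical stand-ins for platform database tables.
PRINCIPAL_BY_IDENTITY = {
    ("https://idp.example/realms/gpuaas", "kc-123"): "principal-1",
}
TENANT_MEMBERSHIPS = {("principal-1", "tenant-a")}


def resolve_principal(claims: VerifiedClaims) -> Optional[str]:
    """Map the authenticated (issuer, subject) pair to a platform principal id."""
    return PRINCIPAL_BY_IDENTITY.get((claims.iss, claims.sub))


def can_access_tenant(claims: VerifiedClaims, tenant_id: str) -> bool:
    """Decide tenant access from platform memberships, never from token roles."""
    principal_id = resolve_principal(claims)
    if principal_id is None:
        return False
    return (principal_id, tenant_id) in TENANT_MEMBERSHIPS
```

In this sketch a token carrying an `admin` realm role still gets no access to a tenant where the platform records no membership.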

Canonical Objects¶

1. Principals¶

Canonical product actors.

Principal types:

human
service_account
group

Future:

external group references
workload identities if they become first-class beyond service accounts

2. External Identity Bindings¶

Authentication anchors attached to principals.

Examples:

OIDC issuer + subject
local password credential
tenant federation provider binding

These are authn bindings, not authorization truth.

3. Memberships¶

Memberships place a principal into tenant/project scope.

Current conceptual model:

tenant membership
project membership

Memberships are the current authorization root for tenant/project access.

4. Role Bindings¶

Role bindings attach a subject to a role bundle at a scope.

Scope hierarchy:

platform
tenant
project
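
As an illustration only, a role binding can be pictured as a small record of subject, role bundle, and scope. The field names and the `bindings_in_effect` helper below are assumptions made for this sketch, not the platform's actual schema or API.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class RoleBinding:
    subject_id: str
    role: str
    scope_type: str                 # "platform", "tenant", or "project"
    scope_id: Optional[str] = None  # None for platform scope


def bindings_in_effect(
    bindings: Iterable[RoleBinding],
    subject_id: str,
    tenant_id: str,
    project_id: str,
) -> List[RoleBinding]:
    """Return the subject's bindings at the project, its parent tenant, or the platform."""
    wanted = {("platform", None), ("tenant", tenant_id), ("project", project_id)}
    return [
        b for b in bindings
        if b.subject_id == subject_id and (b.scope_type, b.scope_id) in wanted
    ]
```

Collecting a binding this way grants nothing by itself; what the subject may do still depends on the capabilities inside each bound role bundle.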

5. Role Bundles / Capability Sets¶

Roles should be treated as named bundles over capabilities, not the base model.

Examples of capability families:

iam.*
billing.*
ops.*
project.*
resource.*
audit.*
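
One way to picture a role as a bundle over capability families is a set of wildcard patterns matched against concrete capability names. The bundle contents and the capability names such as `billing.invoice.read` are hypothetical; only the family prefixes come from the list above.

```python
from fnmatch import fnmatchcase

# Hypothetical bundles: each role is a named set of capability patterns.
ROLE_BUNDLES = {
    "tenant_billing_admin": {"billing.*"},
    "tenant_iam_admin": {"iam.*"},
    "tenant_viewer": {"project.read", "billing.read"},
}


def role_allows(role: str, capability: str) -> bool:
    """True if any pattern in the role's bundle matches the capability name."""
    patterns = ROLE_BUNDLES.get(role, ())
    # fnmatchcase treats "*" as "any characters", so "billing.*" also
    # matches nested names like "billing.invoice.read".
    return any(fnmatchcase(capability, pattern) for pattern in patterns)
```

With this shape, adding a new default role means composing existing patterns rather than inventing a new permission grammar.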

6. Invitations¶

Invitation flows should be first-class IAM objects, not implicit user creation side effects.

Examples:

tenant invite
project invite
tenant-admin invite

7. Integration References¶

External tenant-owned systems should appear as integrations, not as platform-owned user stores.

Examples:

tenant Kubernetes cluster integration
tenant database integration
tenant external IdP configuration

The platform may store integration metadata and delegated credential references, but it should not mirror the external system's full user/role model.

Resource Hierarchy¶

Canonical hierarchy:

platform
tenant
project

This hierarchy defines where authority is bound.

Important rule:

Hierarchy does not imply universal content visibility.

Example:

A tenant_admin may be allowed to create and delete projects. That does not automatically mean they can inspect all data or content inside every child project.

Management rights and content visibility must remain separable.
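
The separation can be sketched as an evaluation rule in which a tenant-scope binding reaches child projects only for management capabilities. Everything named here (the capability strings, `ROLE_CAPS`, `allowed`) is hypothetical, and this is one possible rule, not the platform's evaluator.

```python
# Hypothetical capability names; only these are treated as management rights.
MANAGEMENT = {"project.create", "project.delete", "project.members.manage"}

ROLE_CAPS = {
    "tenant_admin": {"project.create", "project.delete"},
    "project_member": {"project.content.read", "project.content.write"},
}


def allowed(bindings, capability, tenant_id, project_id):
    """bindings: iterable of (role, scope_type, scope_id) held by one subject."""
    for role, scope_type, scope_id in bindings:
        if capability not in ROLE_CAPS.get(role, set()):
            continue
        # A project-scope binding applies fully inside that project.
        if scope_type == "project" and scope_id == project_id:
            return True
        # A tenant-scope binding flows down only for management capabilities.
        if scope_type == "tenant" and scope_id == tenant_id and capability in MANAGEMENT:
            return True
    return False
```

Under this rule, content access inside a project always requires a binding at that project, regardless of what the subject holds at tenant scope.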

Subject Model¶

Subjects are the principals or group-like identities that receive bindings.

Initial subject types:

user
service account
group

The platform should not assume a single global human username namespace as the main identity boundary.

Safer model:

immutable principal identity
tenant membership as the real product access boundary
project membership nested under tenant scope

Default Role Families¶

These are default product bundles, not the full permission grammar.

Platform scope¶

platform_admin
platform_ops
platform_viewer
platform_iam_admin
platform_billing_admin

Tenant scope¶

tenant_owner
tenant_admin
tenant_ops
tenant_viewer
tenant_iam_admin
tenant_billing_admin

Project scope¶

project_owner
project_admin
project_operator
project_member
project_viewer

Important rule:

These defaults should be built from capability bundles and kept small. The platform should not expose an AWS-style explosion of role labels as the primary mental model.

Capability Separation Rules¶

The model must support these separations:

read vs mutate
management rights vs content visibility
IAM authority vs billing authority vs ops authority
tenant-wide governance vs project-local authoring

Examples:

platform_ops may investigate incidents and use admin read surfaces without having full IAM mutation rights.
tenant_admin may manage tenant users and projects without automatically seeing all project content.
tenant_billing_admin may view or manage billing without holding general tenant IAM authority.
project_admin may manage project members and service accounts without tenant-level user governance.
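
These separations can be checked mechanically if capabilities are written as a family plus a verb. The sketch below assumes hypothetical `<family>.read` and `<family>.write` names and made-up bundle contents; it shows the kind of invariant a test suite could assert, not the real bundles.

```python
# Hypothetical bundles written as "<family>.<verb>" with verbs "read" and "write".
BUNDLES = {
    "platform_ops": {"ops.read", "ops.write", "iam.read", "audit.read"},
    "tenant_billing_admin": {"billing.read", "billing.write"},
    "project_viewer": {"project.read", "resource.read"},
}


def families(role: str, verb: str) -> set:
    """Capability families in which the role holds the given verb."""
    return {
        cap.rsplit(".", 1)[0]
        for cap in BUNDLES[role]
        if cap.endswith("." + verb)
    }
```

For example, `families("platform_ops", "write")` contains only `ops` here, which matches the intent that ops staff can read IAM surfaces without mutating them.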

External System Identity Boundary¶

For tenant-owned infra systems such as Kubernetes or databases:

if the external system has its own SSO or IAM model, that remains the tenant's responsibility
GPUaaS may store an integration reference or delegated credential/configuration if needed
GPUaaS should not try to become the canonical IAM model for tenant-owned external systems

So:

platform IAM owns platform principals, memberships, and platform-managed identities
tenant-owned infra IAM stays external

What Exists Today¶

Already implemented or partially implemented¶

users
  stores product users
  includes oidc_issuer and oidc_subject
  still has transitional role and org_id fields
tenant_memberships
  tenant-scoped membership baseline exists
project_memberships
  project-scoped membership baseline exists
tenant_identity_providers
  tenant OIDC/SAML provider config exists in schema
tenant_federation_domain_bindings
  tenant federation domain binding exists in schema
auth_federation_states
  provider/org-bound auth flow state exists
role_definitions
role_definition_versions
platform_role_bindings
tenant_role_bindings
project_role_bindings
  the platform already has the skeleton for a richer scoped role-binding model
service_accounts
  project-scoped service-account model exists today
  tenant-owned shared runtimes still need a separate delegated machine-identity model
scoped access-credential model
  useful as a related platform primitive, but separate from IAM role design

Existing documented direction¶

Relevant docs already move in this direction:

Current Gaps / Mismatches¶

1. Global username uniqueness is still too strong¶

Current schema:

users.username text not null unique
a partial MVP constraint on tenant_memberships(user_id) still enforces a single active tenant membership per user

This is too restrictive for the intended tenant-scoped identity model.
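
To make the effect concrete, the sketch below recreates both constraints in an in-memory SQLite database. SQLite is used only to keep the example runnable. The `status` column, the `WHERE status = 'active'` predicate, and the index name are assumptions about how the partial constraint is shaped, and the tables are reduced to the columns needed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
CREATE TABLE tenant_memberships (
    user_id INTEGER NOT NULL,
    tenant_id TEXT NOT NULL,
    status TEXT NOT NULL
);
-- Assumed shape of the partial MVP constraint: one active membership per user.
CREATE UNIQUE INDEX one_active_tenant
    ON tenant_memberships(user_id) WHERE status = 'active';
""")


def try_exec(sql, params):
    """Run one statement; return False if a uniqueness constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_exec("INSERT INTO users (id, username) VALUES (?, ?)", (1, "alice")),
    # A different person named "alice" in another tenant is rejected globally.
    try_exec("INSERT INTO users (id, username) VALUES (?, ?)", (2, "alice")),
    try_exec("INSERT INTO tenant_memberships VALUES (?, ?, ?)", (1, "tenant-a", "active")),
    # The same user cannot hold a second active tenant membership.
    try_exec("INSERT INTO tenant_memberships VALUES (?, ?, ?)", (1, "tenant-b", "active")),
]
```

Both rejected inserts are cases the tenant-scoped identity model is expected to allow.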

2. `users.role` is still transitional and too coarse¶

Current role field only supports:

user
admin

This is insufficient for:

platform read-only admin visibility
ops-only investigation
tenant IAM admin
tenant billing admin
project admin/operator separation

3. Role-binding model exists but is not authoritative¶

The schema supports richer role bindings, but most runtime behavior still depends on:

membership tables
coarse users.role
endpoint-specific assumptions

4. No first-class invitation model yet¶

IAM needs invitation and delegated onboarding as explicit objects/workflows.

5. No first-class group model yet¶

Groups are anticipated by future requirements but are not yet implemented as platform IAM primitives.

6. No formal external identity binding object beyond current OIDC fields¶

Current oidc_issuer / oidc_subject fields work, but richer multi-provider identity binding will need a clearer model.

7. Audit visibility is not yet fully scope-aware¶

Scoped audit is required for:

platform admin
tenant admin
project admin
future cross-project sharing/grant visibility
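
A possible shape for scope-aware audit reads, offered as a sketch only. `AuditEvent`, its fields, and `visible_events` are hypothetical. The rule that a tenant-scope viewer also sees events from child projects is an assumption and a policy choice the scoped audit model would need to settle.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class AuditEvent:
    action: str
    tenant_id: Optional[str]   # None for platform-level events
    project_id: Optional[str]  # None for tenant- or platform-level events


def visible_events(
    events: Iterable[AuditEvent],
    scope_type: str,
    scope_id: Optional[str] = None,
) -> List[AuditEvent]:
    """Filter audit events down to what a viewer at the given scope may list."""
    if scope_type == "platform":
        return list(events)
    if scope_type == "tenant":
        # Assumption: tenant viewers see tenant events and child-project events.
        return [e for e in events if e.tenant_id == scope_id]
    if scope_type == "project":
        return [e for e in events if e.project_id == scope_id]
    raise ValueError(f"unknown scope type: {scope_type}")
```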

What Must Be Built or Modified¶

Phase 1: clarify authority and remove bad assumptions¶

document that platform DB, not Keycloak, is IAM authority
make role-binding/capability model the target authority in docs
keep Keycloak as auth/federation component only

Phase 2: fix identity scope assumptions¶

remove or relax single-tenant active-membership constraint when multi-tenant user support is enabled
revisit global username uniqueness
separate principal identity from tenant membership more clearly in read/write paths

Phase 3: make scoped role bindings real¶

move runtime authorization toward role bindings + capability evaluation
reduce users.role to compatibility/read-model only
introduce default role bundles for:
platform
tenant
project
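
One way the transition could work is to let role bindings decide whenever a subject has any, and to consult the coarse users.role value only for subjects that have not been migrated. The capability names, the bundle contents, and the fallback policy below are all assumptions made for this sketch.

```python
# Hypothetical capability sets for two platform-scope bundles.
ROLE_CAPS = {
    "platform_admin": {"iam.read", "iam.write", "ops.read"},
    "platform_viewer": {"iam.read", "ops.read"},
}

# Compatibility mapping for the transitional users.role values.
LEGACY_CAPS = {
    "admin": ROLE_CAPS["platform_admin"],
    "user": set(),
}


def authorize(capability, bound_roles, legacy_role):
    """Evaluate role bindings first; fall back to users.role only without bindings."""
    if bound_roles:
        # Bindings are authoritative once present, even if users.role says "admin".
        return any(capability in ROLE_CAPS.get(role, set()) for role in bound_roles)
    return capability in LEGACY_CAPS.get(legacy_role, set())
```

This also shows how a read-only platform viewer becomes expressible: a subject bound to `platform_viewer` can read IAM surfaces but cannot mutate them, which the two-value role field cannot represent.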

Phase 4: add missing IAM primitives¶

invitations
groups
richer external identity bindings
scoped audit presentation
cross-project sharing/grants

Design Constraints To Preserve¶

do not make Keycloak the canonical product user store
do not require global human-readable username uniqueness as the long-term tenant model
do not conflate admin page visibility with mutation authority
do not assume parent-scope admin implies child-scope content visibility
do not turn tenant-owned external infra IAM into platform-owned IAM

Recommended Next Docs¶

This document should be followed by:

cross-project access/sharing model
scoped audit model
IAM API/resource contract slices
UX IA alignment for platform/tenant/project admin modes
delegated shared-runtime operator authz model for tenant-owned shared app runtimes
