
Storage IAM User Flows v1

Date: 2026-04-27

Purpose

Explain how GPUaaS storage access works from the user, workload, CLI, and provider perspectives. This document answers the recurring questions:

  • Does every GPUaaS user become a WEKA user?
  • Does every project use one shared service account?
  • How does a direct S3 client get access?
  • How are shared datasets and artifacts used across projects?
  • Who enforces access: GPUaaS or WEKA?

Short answer:

GPUaaS decides access intent.
GPUaaS compiles provider policy.
WEKA issues/enforces scoped provider credentials.

Human users stay in GPUaaS IAM. WEKA identities are provider-derived enforcement principals, usually service accounts or STS sessions.

Core Vocabulary

GPUaaS user: Human identity authenticated by GPUaaS/OIDC.
GPUaaS service account: Project-scoped machine identity used by workloads, apps, and automation.
WEKA principal: Provider-side S3 user, service account, or STS session used for enforcement.
Storage grant: GPUaaS-owned record of truth stating that a principal may access a bucket/prefix with specific actions.
Policy compiler: GPUaaS component that converts storage grants into WEKA IAM/bucket/session policy.
STS credentials: Temporary S3 credentials whose provider session carries or references a scoped policy.
Owning project: Project that owns quota, billing, lifecycle, deletion authority, and the default policy for a bucket.
Shared-with project: Project that has explicit read/write access to another project's bucket or prefix.

Authority Boundary

GPUaaS IAM
  - users
  - project membership
  - service accounts
  - storage grants
  - audit and policy intent

WEKA IAM / S3
  - provider principal/session
  - attached or session policy
  - S3 request enforcement
  - credential expiration

WEKA does not need to know GPUaaS users. It only needs a scoped provider principal or STS session with an enforceable policy.

Storage Hierarchy

Storage is project-owned by default:

tenant
  project
    bucket / namespace
      shared/
      users/<user-id>/
      workloads/<workload-id>/
      datasets/<dataset-id>/
      checkpoints/<workload-id>/
      artifacts/

Example:

Research/users/subash/
Research/users/priya/
Research/datasets/imagenet/
Research/artifacts/llama-3-70b/

A user has separate project-scoped personal areas:

Research/users/subash/
Sandbox/users/subash/

There is no global users/subash/projects/* ownership hierarchy.

Scenario 1: User Opens Storage In The UI

  1. User logs into GPUaaS.
  2. User selects tenant, project, and region.
  3. UI calls v3 storage read models.
  4. GPUaaS returns:
     • buckets owned by the active project
     • buckets shared with the active project
     • owner project
     • grants/audiences summary
     • attached workloads
     • provider capability hints
  5. The UI does not receive WEKA credentials.

Expected UI labels:

  • Owned by Research
  • Shared from Training
  • Read-only dataset
  • Writable checkpoint output
  • Provider: WEKA

Scenario 2: User Creates A Bucket In The UI

  1. User opens Storage → New bucket.
  2. User selects:
     • project
     • purpose: workspace, dataset, checkpoint, artifact, generic
     • quota
     • lifecycle
     • access defaults
  3. GPUaaS checks the user's project role.
  4. GPUaaS creates a project-owned bucket/namespace.
  5. Provider adapter creates the WEKA-side bucket/filesystem/prefix as needed.
  6. GPUaaS records bucket metadata and the provider reference.
  7. GPUaaS writes audit: storage.bucket.create

No human WEKA user is created.
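The create flow above can be sketched in code. Everything here is an illustrative assumption, not the actual GPUaaS API: the function signature, role names, and the FakeWekaAdapter stand-in for the provider adapter.

```python
# Sketch of the bucket-create flow; names and signatures are assumptions.
class FakeWekaAdapter:
    """Stand-in for the provider adapter that creates the WEKA-side object."""
    def create_bucket(self, name):
        return f"weka://fs/{name}"  # hypothetical provider reference

def create_bucket(user, project, role, spec, adapter, store, audit):
    # GPUaaS checks the user's project role before touching the provider.
    if role not in ("project-admin", "project-owner"):
        raise PermissionError("insufficient project role")
    bucket = {"project": project, "name": spec["name"], "purpose": spec["purpose"]}
    # Provider adapter creates the bucket/filesystem/prefix; no human WEKA user.
    bucket["provider_ref"] = adapter.create_bucket(spec["name"])
    store.append(bucket)  # GPUaaS-owned metadata record
    audit.append({"event": "storage.bucket.create",
                  "actor": user, "bucket": spec["name"]})
    return bucket

store, audit = [], []
b = create_bucket("subash", "Research", "project-admin",
                  {"name": "research", "purpose": "workspace"},
                  FakeWekaAdapter(), store, audit)
```

Note that the only provider-side artifact is the bucket reference; the human user never becomes a WEKA identity.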

Scenario 3: User Shares A Dataset With Another Project

Example:

Training project owns: training:imagenet
Inference project needs read access.

Flow:

  1. Training project admin opens the bucket detail view.
  2. Admin adds a grant:
     • target: Inference project
     • prefix: datasets/imagenet/
     • permissions: read/list
     • expiration: optional
  3. GPUaaS writes a storage grant.
  4. GPUaaS compiles WEKA policy updates for future provider credentials.
  5. GPUaaS writes audit: storage.grant.create

The shared project does not own the bucket. It only has access according to the grant.
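A grant record and the policy statement it might compile to can be sketched as follows. The record shape, field names, and compile_grant helper are assumptions for illustration, not the real GPUaaS schema; the statement format follows the standard S3 policy shape.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class StorageGrant:
    # Hypothetical grant record; field names are assumptions.
    owner_project: str
    target_project: str
    bucket: str
    prefix: str
    permissions: Tuple[str, ...]
    expires_at: Optional[str] = None
    active: bool = True

# Illustrative mapping from GPUaaS permission names to S3 actions.
ACTION_MAP = {
    "read": ["s3:GetObject"],
    "list": ["s3:ListBucket"],
    "write": ["s3:PutObject", "s3:DeleteObject"],
}

def compile_grant(grant):
    """Compile one grant into an S3-style statement for future credentials."""
    actions = sorted({a for p in grant.permissions for a in ACTION_MAP[p]})
    return {
        "Effect": "Allow",
        "Action": actions,
        "Resource": [
            f"arn:aws:s3:::{grant.bucket}",
            f"arn:aws:s3:::{grant.bucket}/{grant.prefix}*",
        ],
    }

stmt = compile_grant(StorageGrant("Training", "Inference", "training",
                                  "datasets/imagenet/", ("read", "list")))
```

Because the grant carries read/list only, the compiled statement never includes write actions, which is what keeps the shared project from mutating the dataset.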

Scenario 4: User Uses A Shared Bucket In A Launch Wizard

Example:

Inference project launches vLLM using model artifacts shared from Training.

Flow:

  1. User opens the app/compute launch wizard in the Inference project.
  2. The storage picker shows:
     • Inference-owned writable buckets
     • Training-shared read-only artifacts/datasets
  3. User selects training:artifacts/llama-3-70b/ as read-only input.
  4. GPUaaS checks that:
     • the user can launch in Inference
     • Inference has a read grant to the Training prefix
  5. GPUaaS creates or selects a workload service account.
  6. GPUaaS compiles WEKA policy:
     • read/list on the shared Training artifact prefix
     • write only to the Inference output/checkpoint prefix, if requested
  7. Provider credentials are delivered to the workload, not to the human user.
  8. Audit records both the user-initiated launch and the storage credential issuance.

Scenario 5: Workload Uses Storage

Workloads should not run with human credentials.

Flow:

  1. User launches a workload.
  2. GPUaaS creates a workload-bound service account, for example sa_jupyter_wl_123.
  3. GPUaaS attaches explicit storage grants to that runtime:
     • Research/users/subash/wl_123/* read/write
     • Research/datasets/imagenet/* read-only
  4. GPUaaS asks WEKA for provider credentials or creates a provider service account with that policy.
  5. The node-agent or app controller receives only scoped mount/S3 instructions.
  6. On workload release, GPUaaS revokes or disables provider access.

Important rule:

project-scoped service account = may exist in project
storage access = explicit bucket/prefix grants only

A project service account does not automatically see every user's data.
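The rule can be sketched as an authorization check that consults explicit grants only. The grant table shape and is_allowed helper are illustrative assumptions; the point is that project membership never appears in the decision.

```python
# Hypothetical grant table: principal -> (bucket, prefix, allowed actions).
# Project membership alone confers no data access.
GRANTS = {
    "sa_jupyter_wl_123": [
        ("research", "users/subash/wl_123/", {"read", "write"}),
        ("research", "datasets/imagenet/", {"read"}),
    ],
}

def is_allowed(principal, bucket, key, action):
    """Allow only if an explicit grant covers the bucket, prefix, and action."""
    for g_bucket, g_prefix, g_actions in GRANTS.get(principal, []):
        if bucket == g_bucket and key.startswith(g_prefix) and action in g_actions:
            return True
    return False

# Writable only inside the workload's own prefix:
ok = is_allowed("sa_jupyter_wl_123", "research",
                "users/subash/wl_123/ckpt.pt", "write")
# Another user's prefix is invisible despite same project:
denied = is_allowed("sa_jupyter_wl_123", "research",
                    "users/priya/notes.txt", "read")
```

In this sketch ok is True and denied is False: the service account exists in the Research project but sees only what its grants name.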

Scenario 6: User Uses GPUaaS CLI For Storage

Example command shape:

gpuaas storage ls --project Research research:users/subash/
gpuaas storage cp ./data.csv research:users/subash/data.csv

Flow:

  1. CLI uses GPUaaS auth token.
  2. CLI calls GPUaaS storage APIs.
  3. GPUaaS checks user IAM and storage grants.
  4. GPUaaS performs the operation through the storage service or returns a controlled transfer path.
  5. User never handles WEKA credentials directly.

This is the safest UX for common users because GPUaaS mediates the operation.

Scenario 7: User Uses Any S3 Client

Users may need aws s3, boto3, PyTorch data loaders, or other S3-compatible clients.

Example request:

gpuaas storage credentials issue \
  --project Research \
  --bucket research \
  --prefix users/subash/ \
  --mode read-write \
  --ttl 1h

Flow:

  1. User authenticates to GPUaaS.
  2. User requests temporary S3 credentials for a bucket/prefix.
  3. GPUaaS checks:
     • project membership
     • storage grants
     • requested mode
     • TTL policy
  4. GPUaaS compiles a scoped WEKA policy.
  5. GPUaaS calls the WEKA STS or provider API.
  6. WEKA issues temporary credentials whose session carries or references the policy.
  7. GPUaaS returns:
     • endpoint
     • access key
     • secret key
     • session token when applicable
     • expiration
     • allowed bucket/prefix summary
  8. User configures any S3 client with those credentials.
  9. WEKA enforces the scoped policy on every S3 request.

The user does not pass policy to the S3 client. The credentials are already bound to the provider session/policy.
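Step 4 above, compiling the scoped policy that accompanies the STS request, can be sketched like this. The helper and its mode names are assumptions; the JSON shape is the standard S3/STS session-policy format.

```python
import json

def compile_session_policy(bucket, prefix, mode):
    """Build the scoped session policy sent with the WEKA STS request.

    Illustrative sketch, not the real compiler: read-only always gets
    get/list, and read-write adds put/delete within the same prefix."""
    actions = ["s3:GetObject", "s3:ListBucket"]
    if mode == "read-write":
        actions += ["s3:PutObject", "s3:DeleteObject"]
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": actions,
            "Resource": [
                f"arn:aws:s3:::{bucket}",
                f"arn:aws:s3:::{bucket}/{prefix}*",
            ],
        }],
    })

rw = json.loads(compile_session_policy("research", "users/subash/", "read-write"))
ro = json.loads(compile_session_policy("research", "users/subash/", "read-only"))
```

The compiled policy travels with the STS call, never to the client: whatever S3 tool the user plugs the credentials into, WEKA evaluates this statement on each request.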

Scenario 8: User Uses WEKA/S3 CLI Directly

The preferred path is still through GPUaaS-issued temporary credentials.

Example:

export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_SESSION_TOKEN=...

aws --endpoint-url https://s3.example.internal \
  s3 ls s3://research/users/subash/

WEKA sees only the temporary provider session. GPUaaS knows why it was issued because it records:

credential_issuance_id
user_id
project_id
bucket
prefixes
permissions
expires_at
provider_session_id
policy_hash
correlation_id
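The issuance record above can be sketched as a single function. The field names mirror the list; the hashing scheme (SHA-256 over a canonical JSON serialization of the compiled policy) is an assumption, chosen so later drift checks can compare provider state against what was issued.

```python
import hashlib
import json

def issuance_record(user_id, project_id, bucket, prefixes, permissions,
                    expires_at, provider_session_id, correlation_id, policy):
    """Audit record GPUaaS keeps for one credential issuance (sketch)."""
    # Canonical serialization makes the hash deterministic across key order.
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return {
        "credential_issuance_id": f"iss_{correlation_id}",  # assumed id scheme
        "user_id": user_id,
        "project_id": project_id,
        "bucket": bucket,
        "prefixes": prefixes,
        "permissions": permissions,
        "expires_at": expires_at,
        "provider_session_id": provider_session_id,
        "policy_hash": hashlib.sha256(canonical.encode()).hexdigest(),
        "correlation_id": correlation_id,
    }

rec = issuance_record("u_subash", "p_research", "research",
                      ["users/subash/"], ["read", "write"],
                      "2026-04-27T13:00:00Z", "sess_42", "corr_7",
                      {"Version": "2012-10-17", "Statement": []})
```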

Scenario 9: Project Automation Uses Storage

Project automation should use GPUaaS service accounts, not human users.

Flow:

  1. Project admin creates a service account, for example sa_training_pipeline.
  2. Admin grants:
     • read on datasets/imagenet/
     • write on checkpoints/pipeline/
  3. Automation exchanges service-account credentials for a GPUaaS token.
  4. Automation requests temporary provider credentials or uses GPUaaS storage APIs.
  5. GPUaaS logs the actor as the service account.

Scenario 10: Grant Revocation

Example:

Training revokes Inference read access to training:imagenet.

Flow:

  1. Admin revokes the storage grant.
  2. GPUaaS marks the grant inactive.
  3. GPUaaS prevents future credential issuance for that grant.
  4. GPUaaS disables or updates provider principals where practical.
  5. Existing STS credentials expire at TTL unless WEKA supports provider-side session invalidation.
  6. GPUaaS writes audit:
     • storage.grant.revoke
     • storage.credential.revoke, when provider access is actively revoked

Policy recommendation:

  • keep user direct-access STS TTL short
  • use workload-bound credentials that can be revoked on release
  • rotate provider credentials on suspicious activity
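The revocation flow reduces to a small state change plus an issuance gate. This is a minimal sketch with assumed record shapes; the real flow would also call the provider adapter to disable principals where practical.

```python
# Sketch of grant revocation; grant/audit shapes are assumptions.
def revoke_grant(grant, audit):
    """Mark the grant inactive and record the audit event."""
    grant["active"] = False
    audit.append({"event": "storage.grant.revoke", "grant_id": grant["id"]})

def can_issue_credentials(grant):
    """Issuance is refused once the backing grant is inactive; already-issued
    STS sessions still run to their TTL unless the provider can invalidate."""
    return grant["active"]

grant = {"id": "g_imagenet_inference", "active": True}
audit = []
revoke_grant(grant, audit)
```

This is why short TTLs matter: the gate stops new credentials immediately, but the TTL bounds how long old ones survive.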

Scenario 11: User Leaves Project

Flow:

  1. Tenant/project admin removes user from project.
  2. GPUaaS blocks future storage read/write/list credential issuance in that project.
  3. Any user-owned project prefixes remain project data unless lifecycle policy says otherwise.
  4. Existing direct STS credentials expire by TTL.
  5. Workload credentials are unaffected unless the workload was owned by that user and policy requires release/reassignment.

Scenario 12: Workload Is Released

Flow:

  1. Allocation/app release starts.
  2. GPUaaS detaches storage mounts.
  3. GPUaaS disables/deletes workload-bound provider principal or credential.
  4. Output/checkpoint data remains according to bucket lifecycle policy.
  5. GPUaaS writes audit:
     • storage.mount.detach
     • storage.credential.revoke

Scenario 13: Provider Drift Is Detected

Drift examples:

  • WEKA service account exists but GPUaaS grant is deleted.
  • WEKA policy hash differs from GPUaaS compiled policy.
  • Bucket exists in WEKA but not in GPUaaS.
  • GPUaaS bucket exists but provider object is missing.

Flow:

  1. The storage reconciler scans provider state.
  2. The reconciler compares provider principals and policies with GPUaaS grant truth.
  3. Drift is surfaced in v3 Storage as:
     • permission_drift
     • failed_mount
     • provider_missing
  4. Operators can reconcile from GPUaaS.
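The comparison step can be sketched with the policy-hash idea: hash both sides' compiled policies and diff. The state shapes (principal name mapped to compiled policy) are assumptions for illustration.

```python
import hashlib
import json

def policy_hash(policy):
    """Deterministic digest of a compiled policy (assumed hashing scheme)."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def detect_drift(gpuaas_state, provider_state):
    """Compare GPUaaS grant truth with provider principals/policies."""
    findings = []
    for principal, policy in gpuaas_state.items():
        if principal not in provider_state:
            findings.append((principal, "provider_missing"))
        elif policy_hash(policy) != policy_hash(provider_state[principal]):
            findings.append((principal, "permission_drift"))
    return findings

gpuaas = {
    "sa_jupyter_wl_123": {"Action": ["s3:GetObject"]},
    "sa_training_pipeline": {"Action": ["s3:PutObject"]},
}
provider = {
    # Provider side gained an extra action -> permission_drift.
    "sa_jupyter_wl_123": {"Action": ["s3:GetObject", "s3:PutObject"]},
    # sa_training_pipeline absent on the provider -> provider_missing.
}
findings = detect_drift(gpuaas, provider)
```

The reverse case, a WEKA principal with no backing GPUaaS grant, would be surfaced by iterating provider_state the same way.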

Scenario 14: Platform Operator Access

Platform operators may need provider-admin operations, but those credentials are not user/runtime credentials.

Flow:

  1. Platform stores WEKA admin/API credential in platform custody.
  2. Storage adapter uses it only for approved provider-admin workflows.
  3. Provider-admin secret is never returned to frontend, workload, or user CLI.
  4. Every provider-admin action is audited.

Policy Propagation Summary

For direct S3 clients:

GPUaaS user intent
  -> GPUaaS policy compiler
  -> WEKA STS request with scoped policy
  -> temporary credentials
  -> S3 client uses credentials
  -> WEKA enforces policy

For workloads:

GPUaaS launch intent
  -> workload service account
  -> GPUaaS policy compiler
  -> WEKA service account or STS credentials
  -> node-agent/app receives scoped access
  -> WEKA enforces policy

What We Must Not Do

  • Do not create long-lived WEKA users for every GPUaaS user by default.
  • Do not use one shared project-wide provider credential for all workloads.
  • Do not give a project service account implicit access to all project data.
  • Do not expose provider admin credentials to UI, CLI, workloads, or app code.
  • Do not make WEKA the source of truth for GPUaaS IAM.
  • Do not issue direct S3 credentials without an audit record and expiration.
Related Documents

  • doc/architecture/Storage_Sharing_and_IAM_Model_v1.md
  • doc/architecture/Storage_Provider_Capability_Model_v1.md
  • doc/architecture/Service_Account_Model.md
  • doc/architecture/Platform_Access_Credential_Model_v1.md
  • doc/product/V3_Mock_To_Production_Data_Parity_v1.md