# Storage IAM User Flows v1

Date: 2026-04-27
## Purpose
Explain how GPUaaS storage access works from the user, workload, CLI, and provider perspectives. This document answers the recurring questions:
- Does every GPUaaS user become a WEKA user?
- Does every project use one shared service account?
- How does a direct S3 client get access?
- How are shared datasets and artifacts used across projects?
- Who enforces access: GPUaaS or WEKA?
Short answer:
GPUaaS decides access intent.
GPUaaS compiles provider policy.
WEKA issues/enforces scoped provider credentials.
Human users stay in GPUaaS IAM. WEKA identities are provider-derived enforcement principals, usually service accounts or STS sessions.
## Core Vocabulary
| Term | Meaning |
|---|---|
| GPUaaS user | Human identity authenticated by GPUaaS/OIDC. |
| GPUaaS service account | Project-scoped machine identity used by workloads, apps, and automation. |
| WEKA principal | Provider-side S3 user, service account, or STS session used for enforcement. |
| Storage grant | GPUaaS-owned truth that says a principal may access a bucket/prefix with specific actions. |
| Policy compiler | GPUaaS component that converts storage grants into WEKA IAM/bucket/session policy. |
| STS credentials | Temporary S3 credentials whose provider session carries or references a scoped policy. |
| Owning project | Project that owns quota, billing, lifecycle, deletion authority, and default policy for a bucket. |
| Shared-with project | Project that has explicit read/write access to someone else's bucket or prefix. |
## Authority Boundary

GPUaaS IAM:
- users
- project membership
- service accounts
- storage grants
- audit and policy intent

WEKA IAM / S3:
- provider principal/session
- attached or session policy
- S3 request enforcement
- credential expiration
WEKA does not need to know GPUaaS users. It only needs a scoped provider principal or STS session with an enforceable policy.
## Storage Hierarchy
Storage is project-owned by default:
tenant
  project
    bucket / namespace
      shared/
      users/<user-id>/
      workloads/<workload-id>/
      datasets/<dataset-id>/
      checkpoints/<workload-id>/
      artifacts/
Example:
Research/users/subash/
Research/users/priya/
Research/datasets/imagenet/
Research/artifacts/llama-3-70b/
A user has separate project-scoped personal areas: one users/<user-id>/ prefix per project the user belongs to.
There is no global users/subash/projects/* ownership hierarchy.
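The prefix layout above can be sketched as a few small helpers. The function names here are illustrative assumptions, not part of any GPUaaS API; they only show that a user's personal area is always scoped to a project.

```python
# Illustrative sketch of the project-scoped prefix layout described above.
# These helper names are hypothetical, not part of any GPUaaS API.

def user_prefix(project: str, user_id: str) -> str:
    """Personal area: scoped to one project, never global."""
    return f"{project}/users/{user_id}/"

def dataset_prefix(project: str, dataset_id: str) -> str:
    return f"{project}/datasets/{dataset_id}/"

def checkpoint_prefix(project: str, workload_id: str) -> str:
    return f"{project}/checkpoints/{workload_id}/"

# A user has one personal area per project they belong to:
assert user_prefix("Research", "subash") == "Research/users/subash/"
assert user_prefix("Training", "subash") == "Training/users/subash/"
```

Because the project comes first in every prefix, there is no single path that aggregates one user's data across projects.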
## Scenario 1: User Opens Storage In The UI
- User logs into GPUaaS.
- User selects tenant, project, and region.
- UI calls v3 storage read models.
- GPUaaS returns:
  - buckets owned by the active project
  - buckets shared with the active project
  - owner project
  - grants/audiences summary
  - attached workloads
  - provider capability hints
- UI does not receive WEKA credentials.
Expected UI labels:
- Owned by Research
- Shared from Training
- Read-only dataset
- Writable checkpoint output
- Provider: WEKA
## Scenario 2: User Creates A Bucket In The UI
- User opens Storage → New bucket.
- User selects:
  - project
  - purpose: workspace, dataset, checkpoint, artifact, generic
  - quota
  - lifecycle
  - access defaults
- GPUaaS checks the user's project role.
- GPUaaS creates a project-owned bucket/namespace.
- Provider adapter creates the WEKA-side bucket/filesystem/prefix as needed.
- GPUaaS records storage bucket metadata and provider reference.
- GPUaaS writes audit:
storage.bucket.create
No human WEKA user is created.
## Scenario 3: User Shares A Dataset With Another Project
Example: a Training project admin shares datasets/imagenet/ with the Inference project.
Flow:
- Training project admin opens bucket detail.
- Admin adds grant:
  - target: Inference project
  - prefix: datasets/imagenet/
  - permissions: read/list
  - expiration: optional
- GPUaaS writes a storage grant.
- GPUaaS compiles WEKA policy updates for future provider credentials.
- GPUaaS writes audit:
storage.grant.create
The shared project does not own the bucket. It only has access according to the grant.
## Scenario 4: User Uses A Shared Bucket In A Launch Wizard
Example: a user in the Inference project launches a workload that reads training:artifacts/llama-3-70b/.
Flow:
- User opens app/compute launch wizard in the Inference project.
- Storage picker shows:
  - Inference-owned writable buckets
  - Training-shared read-only artifacts/datasets
- User selects training:artifacts/llama-3-70b/ as read-only input.
- GPUaaS checks:
  - user can launch in Inference
  - Inference has a read grant to the Training prefix
- GPUaaS creates/selects a workload service account.
- GPUaaS compiles WEKA policy:
  - read/list on shared Training artifact prefix
  - write only to Inference output/checkpoint prefix if requested
- Provider credentials are delivered to the workload, not the human user.
- Audit records both the user-caused launch and storage credential issuance.
## Scenario 5: Workload Uses Storage
Workloads should not run with human credentials.
Flow:
- User launches workload.
- GPUaaS creates a workload-bound service account, for example:
  sa_jupyter_wl_123
- GPUaaS attaches explicit storage grants to that runtime:
  - Research/users/subash/wl_123/* : read/write
  - Research/datasets/imagenet/* : read-only
- GPUaaS asks WEKA for provider credentials or creates a provider service account with that policy.
- Node-agent or app controller receives only scoped mount/S3 instructions.
- On workload release, GPUaaS revokes or disables provider access.
Important rule:
project-scoped service account = may exist in project
storage access = explicit bucket/prefix grants only
A project service account does not automatically see every user's data.
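The "explicit grants only" rule can be sketched as a lookup over a grant list. The data shapes below are assumptions for illustration; the real GPUaaS grant model may differ.

```python
# Sketch of the "explicit grants only" rule from Scenario 5.
# Grant shapes here are illustrative assumptions, not the real model.

GRANTS = {
    "sa_jupyter_wl_123": [
        ("Research/users/subash/wl_123/", {"read", "write"}),
        ("Research/datasets/imagenet/", {"read"}),
    ],
}

def may_access(principal: str, key: str, action: str) -> bool:
    """A principal may act on a key only under an explicitly granted prefix."""
    for prefix, actions in GRANTS.get(principal, []):
        if key.startswith(prefix) and action in actions:
            return True
    return False  # existing in the project alone grants nothing

# The workload SA reads the shared dataset but not another user's area:
assert may_access("sa_jupyter_wl_123", "Research/datasets/imagenet/train/0.tar", "read")
assert not may_access("sa_jupyter_wl_123", "Research/users/priya/notes.txt", "read")
```

The default-deny return at the end is the important rule: an empty grant list means no storage access, regardless of project membership.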
## Scenario 6: User Uses GPUaaS CLI For Storage
Example command shape:
gpuaas storage ls --project Research research:users/subash/
gpuaas storage cp ./data.csv research:users/subash/data.csv
Flow:
- CLI uses GPUaaS auth token.
- CLI calls GPUaaS storage APIs.
- GPUaaS checks user IAM and storage grants.
- GPUaaS performs the operation through the storage service or returns a controlled transfer path.
- User never handles WEKA credentials directly.
This is the safest UX for common users because GPUaaS mediates the operation.
## Scenario 7: User Uses Any S3 Client
Users may need aws s3, boto3, PyTorch data loaders, or other S3-compatible
clients.
Example request:
gpuaas storage credentials issue \
--project Research \
--bucket research \
--prefix users/subash/ \
--mode read-write \
--ttl 1h
Flow:
- User authenticates to GPUaaS.
- User requests temporary S3 credentials for bucket/prefix.
- GPUaaS checks:
  - project membership
  - storage grants
  - requested mode
  - TTL policy
- GPUaaS compiles a scoped WEKA policy.
- GPUaaS calls WEKA STS or provider API.
- WEKA issues temporary credentials whose session carries or references the policy.
- GPUaaS returns:
  - endpoint
  - access key
  - secret key
  - session token when applicable
  - expiration
  - allowed bucket/prefix summary
- User configures any S3 client with those credentials.
- WEKA enforces the scoped policy on every S3 request.
The user does not pass policy to the S3 client. The credentials are already bound to the provider session/policy.
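The checks GPUaaS runs before calling WEKA STS can be sketched as below. The field names, TTL cap, and grant shape are illustrative assumptions, not the real API.

```python
# Sketch of the checks GPUaaS might run before calling WEKA STS.
# Field names, the TTL cap, and grant shapes are illustrative assumptions.

MAX_TTL_SECONDS = 3600  # assumed platform cap on direct-access credentials

def validate_credential_request(req: dict, grants: list) -> None:
    """Reject a request unless an explicit grant covers its prefix and mode."""
    if req["ttl_seconds"] > MAX_TTL_SECONDS:
        raise ValueError("TTL exceeds platform policy")
    needed = {"read"} if req["mode"] == "read-only" else {"read", "write"}
    for prefix, actions in grants:
        if req["prefix"].startswith(prefix) and needed <= actions:
            return  # covered by an explicit grant
    raise PermissionError("no storage grant covers this prefix/mode")

grants = [("users/subash/", {"read", "write"})]
validate_credential_request(
    {"prefix": "users/subash/", "mode": "read-write", "ttl_seconds": 3600},
    grants,
)
```

Only after this validation passes would GPUaaS compile the scoped policy and request temporary credentials from the provider.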
## Scenario 8: User Uses WEKA/S3 CLI Directly
The preferred path is still through GPUaaS-issued temporary credentials.
Example:
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_SESSION_TOKEN=...
aws --endpoint-url https://s3.example.internal \
s3 ls s3://research/users/subash/
WEKA sees only the temporary provider session. GPUaaS knows why it was issued because it records:
- credential_issuance_id
- user_id
- project_id
- bucket
- prefixes
- permissions
- expires_at
- provider_session_id
- policy_hash
- correlation_id
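An issuance record with these fields can be sketched as follows; the canonical-JSON hashing shown here is one plausible way to compute policy_hash, not necessarily how GPUaaS does it.

```python
import hashlib
import json

# Sketch of the issuance record GPUaaS keeps so a provider session can be
# traced back to a user. Values and the hashing scheme are assumptions.

def issuance_record(user_id: str, project_id: str, bucket: str,
                    prefixes: list, permissions: list,
                    compiled_policy: dict) -> dict:
    """policy_hash lets the reconciler compare GPUaaS truth to provider state."""
    # sort_keys makes the serialization canonical, so equal policies hash equal
    policy_bytes = json.dumps(compiled_policy, sort_keys=True).encode()
    return {
        "user_id": user_id,
        "project_id": project_id,
        "bucket": bucket,
        "prefixes": prefixes,
        "permissions": permissions,
        "policy_hash": hashlib.sha256(policy_bytes).hexdigest(),
    }

rec = issuance_record("subash", "research", "research",
                      ["users/subash/"], ["read", "write"],
                      {"Statement": []})
assert len(rec["policy_hash"]) == 64  # hex SHA-256
```

A stable policy_hash is what lets Scenario 13's reconciler detect that a provider-side policy no longer matches what GPUaaS compiled.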
## Scenario 9: Project Automation Uses Storage
Project automation should use GPUaaS service accounts, not human users.
Flow:
- Project admin creates service account:
  sa_training_pipeline
- Admin grants:
  - read on datasets/imagenet/
  - write on checkpoints/pipeline/
- Automation exchanges service-account credentials for a GPUaaS token.
- Automation requests temporary provider credentials or uses GPUaaS storage APIs.
- GPUaaS logs actor as service account.
## Scenario 10: Grant Revocation
Example: a Training admin revokes the Inference project's read grant on datasets/imagenet/.
Flow:
- Admin revokes the storage grant.
- GPUaaS marks grant inactive.
- GPUaaS prevents future credential issuance for that grant.
- GPUaaS disables or updates provider principals where practical.
- Existing STS credentials expire at TTL unless WEKA supports provider-side session invalidation.
- GPUaaS writes audit:
  - storage.grant.revoke
  - storage.credential.revoke when provider access is actively revoked
Policy recommendation:
- keep user direct-access STS TTL short
- use workload-bound credentials that can be revoked on release
- rotate provider credentials on suspicious activity
## Scenario 11: User Leaves Project
Flow:
- Tenant/project admin removes user from project.
- GPUaaS blocks future storage read/write/list credential issuance in that project.
- Any user-owned project prefixes remain project data unless lifecycle policy says otherwise.
- Existing direct STS credentials expire by TTL.
- Workload credentials are unaffected unless the workload was owned by that user and policy requires release/reassignment.
## Scenario 12: Workload Is Released
Flow:
- Allocation/app release starts.
- GPUaaS detaches storage mounts.
- GPUaaS disables/deletes workload-bound provider principal or credential.
- Output/checkpoint data remains according to bucket lifecycle policy.
- GPUaaS writes audit:
  - storage.mount.detach
  - storage.credential.revoke
## Scenario 13: Provider Drift Is Detected
Drift examples:
- WEKA service account exists but GPUaaS grant is deleted.
- WEKA policy hash differs from GPUaaS compiled policy.
- Bucket exists in WEKA but not in GPUaaS.
- GPUaaS bucket exists but provider object is missing.
Flow:
- Storage reconciler scans provider state.
- Reconciler compares provider principals/policies with GPUaaS grant truth.
- Drift is surfaced in v3 Storage:
  - permission_drift
  - failed_mount
  - provider_missing
- Operator can reconcile from GPUaaS.
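The reconciler's comparison step can be sketched as a pass over both state maps. The hash-per-bucket shape and the finding names are illustrative assumptions.

```python
# Sketch of the reconciler's comparison step from Scenario 13.
# State shapes (bucket -> policy hash) and finding names are assumptions.

def classify_drift(gpuaas: dict, provider: dict) -> list:
    """Compare GPUaaS grant truth with observed provider state."""
    findings = []
    for bucket, expected_hash in gpuaas.items():
        if bucket not in provider:
            findings.append((bucket, "provider_missing"))
        elif provider[bucket] != expected_hash:
            findings.append((bucket, "permission_drift"))
    for bucket in provider:
        if bucket not in gpuaas:
            # provider object exists without GPUaaS truth backing it
            findings.append((bucket, "unmanaged_provider_object"))
    return findings

# A hash mismatch surfaces as permission drift:
assert classify_drift({"research": "abc"}, {"research": "def"}) == [
    ("research", "permission_drift")
]
```

GPUaaS truth always drives the comparison; provider state is only evidence to reconcile against it.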
## Scenario 14: Platform Operator Access
Platform operators may need provider-admin operations, but those credentials are not user/runtime credentials.
Flow:
- Platform stores WEKA admin/API credential in platform custody.
- Storage adapter uses it only for approved provider-admin workflows.
- Provider-admin secret is never returned to frontend, workload, or user CLI.
- Every provider-admin action is audited.
## Policy Propagation Summary
For direct S3 clients:
GPUaaS user intent
-> GPUaaS policy compiler
-> WEKA STS request with scoped policy
-> temporary credentials
-> S3 client uses credentials
-> WEKA enforces policy
For workloads:
GPUaaS launch intent
-> workload service account
-> GPUaaS policy compiler
-> WEKA service account or STS credentials
-> node-agent/app receives scoped access
-> WEKA enforces policy
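The policy compiler step in both chains can be sketched as turning one storage grant into an S3-style session policy. The statement shape below mirrors AWS IAM policy JSON; WEKA's exact dialect may differ, so treat this as an assumption.

```python
import json

# Sketch of the policy compiler step: one GPUaaS storage grant becomes an
# S3-style session policy. The JSON shape follows the AWS IAM policy
# language; the provider's accepted dialect is an assumption here.

def compile_session_policy(bucket: str, prefix: str, mode: str) -> dict:
    actions = ["s3:GetObject"]
    if mode != "read-only":
        actions.append("s3:PutObject")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # listing is allowed only within the granted prefix
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
            {   # object access is allowed only under the granted prefix
                "Effect": "Allow",
                "Action": actions,
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}*"],
            },
        ],
    }

policy = compile_session_policy("research", "users/subash/", "read-write")
print(json.dumps(policy, indent=2))
```

Note the two-statement split: ListBucket applies to the bucket ARN with a prefix condition, while object actions apply to object ARNs under the prefix. Collapsing these into one statement is a common policy-authoring mistake.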
## What We Must Not Do
- Do not create long-lived WEKA users for every GPUaaS user by default.
- Do not use one shared project-wide provider credential for all workloads.
- Do not give a project service account implicit access to all project data.
- Do not expose provider admin credentials to UI, CLI, workloads, or app code.
- Do not make WEKA the source of truth for GPUaaS IAM.
- Do not issue direct S3 credentials without an audit record and expiration.
## Related Documents

- doc/architecture/Storage_Sharing_and_IAM_Model_v1.md
- doc/architecture/Storage_Provider_Capability_Model_v1.md
- doc/architecture/Service_Account_Model.md
- doc/architecture/Platform_Access_Credential_Model_v1.md
- doc/product/V3_Mock_To_Production_Data_Parity_v1.md