Personas & journeys
Decided
Source: doc/product/UX_Journeys.md v0.4 · 332 lines · target-state baseline (prototype-informed but not prototype-bound)
Personas
| Persona | Primary tasks | Surfaces used |
|---|---|---|
| End User | Provision and operate GPUs | /workloads, /infra/*, /apps/*, /billing, /access/* |
| Admin | Monitor and manage users/nodes | /admin/*, /ops/* |
| Billing Operator | Inspect usage and payment outcomes | /admin/payment-sessions, /admin/audit-logs, /admin/users |
Journey: Authentication
flowchart TB
A[User opens /login] --> B{Account type?}
B -- Work --> C[Continue with SSO<br/>redirect to provider]
B -- Personal --> D[Sign in / sign up]
C --> E[Returns with token<br/>session bootstrapped]
D --> E
E --> F{First-run?<br/>balance==0 and no allocs?}
F -- yes --> G[Route to /billing<br/>'Add funds to get started']
F -- no --> H[Load profile, nodes, allocations]
Contract mapping:
- `GET /api/v1/auth/oidc/authorize`
- `POST /api/v1/auth/oidc/exchange`
- `POST /api/v1/auth/token/refresh`
- `POST /api/v1/auth/logout`
- `POST /api/v1/auth/personal/login`
- `POST /api/v1/auth/personal/signup`
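
The first-run branch in the diagram can be sketched as a small routing function. This is a minimal illustration, not the shipped logic: `SessionSnapshot`, its field names, the cents unit, and the `/workloads` landing route for returning users are all assumptions (the diagram only states that profile, nodes, and allocations are loaded).

```python
from dataclasses import dataclass


@dataclass
class SessionSnapshot:
    """Hypothetical view of what the client knows right after login."""
    balance_cents: int      # assumed unit; the source only says "balance"
    allocation_count: int   # number of allocations the user has


def is_first_run(s: SessionSnapshot) -> bool:
    # Mirrors the diagram's condition: balance == 0 and no allocations.
    return s.balance_cents == 0 and s.allocation_count == 0


def post_login_route(s: SessionSnapshot) -> str:
    # First-run users are sent to /billing ("Add funds to get started").
    # The landing route for everyone else is an assumption.
    return "/billing" if is_first_run(s) else "/workloads"
```

A user with zero balance but existing allocations is not treated as first-run, because both conditions must hold.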
Journey: Launch Workload
flowchart LR
A["/infra/catalog or top-bar Launch"] --> B[Hardware<br/>SKU + GPU count]
B --> C[Image]
C --> D[Network]
D --> E[Access<br/>SSH keys]
E --> F[Storage attachments]
F --> G[Review with price]
G --> H[Single confirm →<br/>POST /allocations]
H --> I["/tasks/{id} provisioning view"]
I --> J["/workloads"]
UX requirements (from UX_Journeys.md §4):
- Clear precheck failures (sold out, insufficient balance, node offline).
- Sold-out SKU cards render as `Unavailable` (disabled provision action).
- Launch shows estimated hourly cost before confirmation.
- Launch is a full-page wizard, not a modal.
- Slice GPU counts validated against same-node topology before submit.
- Pricing mode and idle-suspend controls may be visible but disabled until backend supports them.
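
The precheck and cost-estimate requirements above might look like the following sketch. The `LaunchRequest` fields, the failure code strings, and the rule that "insufficient balance" means less than one hour of estimated cost are assumptions for illustration; the source does not define them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LaunchRequest:
    """Hypothetical inputs gathered by the launch wizard."""
    sku_available: int        # free units of the chosen SKU
    gpu_count: int            # GPUs requested
    hourly_price_cents: int   # per-GPU hourly price (assumed unit)
    balance_cents: int        # current account balance
    node_online: bool         # whether the target node is reachable


def estimated_hourly_cost_cents(req: LaunchRequest) -> int:
    # Shown on the Review step before the single confirm.
    return req.gpu_count * req.hourly_price_cents


def precheck(req: LaunchRequest) -> List[str]:
    """Return failure codes; an empty list means the launch may proceed."""
    failures = []
    if req.sku_available < req.gpu_count:
        failures.append("sold_out")
    # Assumed threshold: the balance must cover at least one hour.
    if req.balance_cents < estimated_hourly_cost_cents(req):
        failures.append("insufficient_balance")
    if not req.node_online:
        failures.append("node_offline")
    return failures
```

Returning every failure at once, rather than stopping at the first, lets the wizard show all blocking reasons on one screen.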
Journey: Operate Active Allocation
flowchart LR
A[/workloads/] --> B{Action}
B -- Metrics --> M[In-page overlay using<br/>platform metrics URL]
B -- Console --> T[Terminal modal:<br/>mint terminal-token →<br/>WS connect]
B -- Key --> K[Manage SSH keys<br/>copy SSH command]
B -- Release --> R[Destructive<br/>confirmation modal]
UX requirements:
- Active vs released states visually distinct.
- Release confirmation explicit and cancellable.
- Failed allocation shows machine-readable `failure_reason`.
- `release_failed` shows `release_failed_reason`, confirms billing stopped, exposes retry.
- Metrics URL is platform-level frontend config, not per-node `metrics_url` API field.
- SSH command is copy-ready from allocation connection details.
- No persistent server-side private-key download endpoint.
- Returning from detail to list restores prior list filters, search, pagination.
- If an action changes allocation state, the destination state must be discoverable immediately.
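
One way to meet the "restore prior list filters, search, pagination" requirement is to keep the list state in the URL query string, so returning from the detail view reproduces it. The sketch below assumes hypothetical parameter names `q` and `page` and string-valued filters; nothing in the source prescribes this mechanism.

```python
from typing import Dict, Tuple
from urllib.parse import parse_qs, urlencode


def encode_list_state(filters: Dict[str, str], search: str, page: int) -> str:
    """Serialize list state into a query string (parameter names are assumed)."""
    params = dict(sorted(filters.items()))  # stable ordering for shareable URLs
    if search:
        params["q"] = search
    params["page"] = str(page)
    return urlencode(params)


def decode_list_state(query: str) -> Tuple[Dict[str, str], str, int]:
    """Recover (filters, search, page); missing values fall back to defaults."""
    parsed = {k: v[0] for k, v in parse_qs(query).items()}
    page = int(parsed.pop("page", "1"))
    search = parsed.pop("q", "")
    return parsed, search, page
```

For example, `encode_list_state({"state": "active"}, "a100", 3)` yields `state=active&q=a100&page=3`, and decoding that string returns the same filters, search term, and page.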
Journey: Billing & Payments
flowchart LR
A[/billing/] --> B[Balance + projected runway]
B --> C[Add funds]
C --> D[Stripe checkout]
D --> E[Webhook → ledger credit]
E --> F[Balance updated]
A --> G[Usage timeline]
A --> H[Refund history]
H --> I[Refund request]
I --> J{within window?}
J -- yes --> K[Provider refund]
J -- no --> L[Internal credit]
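
The two computed elements of this journey, projected runway and the refund branch, can be sketched as follows. The function names, the cents unit, and the return labels are assumptions, and the refund window length is left as a parameter because the source does not state it.

```python
from datetime import datetime, timedelta
from typing import Optional


def projected_runway_hours(balance_cents: int, hourly_burn_cents: int) -> Optional[float]:
    """Hours until the balance reaches zero at the current burn rate.

    Returns None when nothing is running, since runway is undefined then.
    """
    if hourly_burn_cents <= 0:
        return None
    return balance_cents / hourly_burn_cents


def refund_route(paid_at: datetime, requested_at: datetime, window: timedelta) -> str:
    # Within the window the refund goes back through the payment provider;
    # otherwise it is issued as internal credit, as in the diagram.
    if requested_at - paid_at <= window:
        return "provider_refund"
    return "internal_credit"
```

With a balance of 1000 cents and a burn of 250 cents per hour, the projected runway is 4.0 hours.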
Journey: Admin — Force release
flowchart LR
A[/admin/allocations/] --> B[Filter by state]
B --> C{Found release_failed?}
C -- yes --> D[Open allocation detail]
D --> E[Force release<br/>requires reason]
E --> F[Audit row written +<br/>provisioning.force_release_requested emitted]
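
The last two steps (reason required, then audit row plus event) can be illustrated with an in-memory stand-in. `ForceReleaseRecorder` and its payload fields are hypothetical; only the event name `provisioning.force_release_requested` comes from the diagram.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ForceReleaseRecorder:
    """In-memory stand-in for the audit log and event bus (both hypothetical)."""
    audit_rows: List[dict] = field(default_factory=list)
    events: List[Tuple[str, dict]] = field(default_factory=list)

    def force_release(self, allocation_id: str, admin_id: str, reason: str) -> None:
        # The journey requires a reason; reject blank input before any side effect.
        if not reason.strip():
            raise ValueError("force release requires a reason")
        payload = {
            "allocation_id": allocation_id,
            "admin_id": admin_id,
            "reason": reason.strip(),
        }
        # Write the audit row first, then emit the event named in the diagram.
        self.audit_rows.append(payload)
        self.events.append(("provisioning.force_release_requested", payload))
```

Validating the reason before writing anything means a rejected request leaves neither an audit row nor an event behind.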
Journey: Admin — Onboard slice node
flowchart LR
A["/admin/nodes"] --> B[Add node]
B --> C[Node enrolls<br/>via bootstrap script]
C --> D["POST /admin/nodes/{id}/slice-topology/discovery"]
D --> E[Review candidate slot map]
E --> F{Map satisfies invariants?<br/>per-slot VF, NVMe ownership, NUMA}
F -- no --> G[Fix host config<br/>re-run discovery]
F -- yes --> H[Approve into node_resource_slots]
H --> I[Node active for slice scheduling]
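
The invariant gate before approval might be checked along these lines. The source names the invariants only as "per-slot VF, NVMe ownership, NUMA", so the concrete rules here (one unique VF per slot, one exclusively owned NVMe device per slot, GPU and VF on the same NUMA node) are an interpretation, and the `Slot` fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Slot:
    """Hypothetical row of a candidate slot map."""
    slot_id: str
    vf: str             # network virtual function assigned to the slot
    nvme: str           # NVMe device assigned to the slot
    gpu_numa_node: int  # NUMA node of the slot's GPU
    vf_numa_node: int   # NUMA node of the VF's parent device


def invariant_violations(slots: List[Slot]) -> List[str]:
    """Return human-readable violations; an empty list means the map can be approved."""
    problems: List[str] = []
    vf_owner: Dict[str, str] = {}
    nvme_owner: Dict[str, str] = {}
    for s in slots:
        # Assumed rule: each VF belongs to exactly one slot.
        if s.vf in vf_owner:
            problems.append(f"{s.slot_id}: VF {s.vf} already assigned to {vf_owner[s.vf]}")
        else:
            vf_owner[s.vf] = s.slot_id
        # Assumed rule: each NVMe device is owned by exactly one slot.
        if s.nvme in nvme_owner:
            problems.append(f"{s.slot_id}: NVMe {s.nvme} already owned by {nvme_owner[s.nvme]}")
        else:
            nvme_owner[s.nvme] = s.slot_id
        # Assumed rule: GPU and VF sit on the same NUMA node.
        if s.gpu_numa_node != s.vf_numa_node:
            problems.append(f"{s.slot_id}: GPU and VF are on different NUMA nodes")
    return problems
```

A non-empty result corresponds to the "no" branch: fix the host config and re-run discovery.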
Accessibility baseline
From UX_Implementation_Spec.md:
- Keyboard navigation, focus trap, ARIA labels.
- Contrast checks (WCAG AA).
- Explicit handling for `401`, `403`, `404`, `409`, `429` states per surface.
- Every async state has loading / empty / error / success / restricted / rate_limited renderings.
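
The six renderings can be modelled as an enum with one function that picks a rendering from a response. The mapping below is an assumption for illustration: the source lists the status codes and the renderings but does not say which code maps to which, and a real surface may treat `401` as a re-authentication redirect instead.

```python
from enum import Enum
from typing import Optional


class RenderState(Enum):
    LOADING = "loading"
    EMPTY = "empty"
    ERROR = "error"
    SUCCESS = "success"
    RESTRICTED = "restricted"
    RATE_LIMITED = "rate_limited"


def render_state(status: Optional[int], item_count: int = 0) -> RenderState:
    """Pick a rendering from an HTTP status; None means the request is in flight."""
    if status is None:
        return RenderState.LOADING
    if status == 429:
        return RenderState.RATE_LIMITED
    if status in (401, 403):
        return RenderState.RESTRICTED  # assumed grouping of auth failures
    if 200 <= status < 300:
        return RenderState.SUCCESS if item_count > 0 else RenderState.EMPTY
    # 404, 409 and anything else fall through to the generic error rendering.
    return RenderState.ERROR
```

Keeping the mapping in one function makes it easier to verify that every surface handles every listed status.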