UX Implementation Spec v0.2 (Coding Baseline)¶
Purpose: - Translate product UX journeys into implementation-ready UI specifications. - Enforce API-first UX where every interaction maps to OpenAPI/AsyncAPI contracts.
Inputs:
- doc/product/PRD.md
- doc/product/UX_Journeys.md
- doc/product/UX_Redesign_Implementation_Plan.md
- doc/product/Admin_Ops_and_User_IA_v1.md
- doc/product/Brand_Guidelines.md
- doc/product/ux-mocks/*
- doc/api/openapi.draft.yaml
- doc/api/asyncapi.draft.yaml
- doc/governance/UX_Contract_Gate.md
Normative UX references for the current redesign:
- doc/product/ux-mocks/product-redesign-v3/index.html
- doc/product/ux-mocks/product-redesign-v3/README.md
- doc/product/UX_Redesign_Implementation_Plan.md
- doc/product/V3_Admin_Workbench_Model_v1.md
Interpretation rule:
- The route model, page-family taxonomy, shared component vocabulary, and
interaction standards in the redesign plan are normative for new pages and
major page refactors.
- The mock HTML is a layout and visual reference, not the behavioral source of
truth.
- Where this spec and the redesign plan disagree, the redesign plan wins.
- For /v3-prod/platform/* family landings, admin workbenches, focused
details, operation drawers, and acknowledgement/suppression behavior,
doc/product/V3_Admin_Workbench_Model_v1.md is the governing page-family
contract.
Mock-to-production copy boundary: - Descriptions in production pages serve the user's task, not the reviewer's understanding of the mock. - Remove permanent copy that restates the title or explains design rationale. - Put non-obvious context behind an info/help affordance. - Put first-time guidance in empty states, decision prose in confirmations, and longer procedures in runbooks or docs. - Permanent section subtitles are a warning sign that the page is still talking to design reviewers instead of acting like an operator tool.
Page-growth rule:
- For any major new page, major page refactor, new top-level/local navigation
area, or multi-step workflow, do not start broad implementation until the
following are explicit:
- role or persona using the surface;
- scope of control (self, project, tenant, node, fleet,
platform, or public);
- primary user intent;
- page family;
- shell/local-navigation placement;
- supporting mock/spec updates where the current design is not already
settled.
- This rule exists to prevent repeated redesign loops as the product grows. If a
surface does not fit cleanly into the current IA, update the IA and mock
package first instead of implementing another one-off page.
1. Route and Screen Inventory¶
User-facing¶
/auth/login(tabbed sign-in entry: Work account + Personal account)/(Control Room)/workloads(operational workload list)/workloads/:workload_id(workload detail + actions)/infra/catalog(SKU/catalog browse)/infra/allocations(raw allocations workbench)/infra/allocations/:allocation_id(allocation detail / redirect surface)/launch(full-page launch wizard)/tasks/:task_id(async task/provisioning detail)/billing(balance, usage table, CSV export)/infra/storage(list/upload/download/mkdir/rename/delete)/access/projects(project directory and lifecycle)/access/projects/:project_id(project detail and membership/access context)/access/team/access/ssh-keys/access/api-keys/access/policies/developer/docs(developer docs center)/developer/downloads(downloads and release artifacts)/developer/api-docs/swagger(embedded API reference)/developer/api-docs/redoc(embedded API reference)
Admin-facing¶
/admin/overview/ops/attention/ops/telemetry/admin/users/admin/users/:user_id/admin/nodes/admin/skus/admin/os-images/admin/quotas/admin/maas/admin/allocations/admin/audit-logs/admin/payments/sessions
Admin IA follow-up:
- doc/product/Admin_Ops_and_User_IA_v1.md
2. Contract Traceability (Critical Actions)¶
Auth¶
- Work authorize start:
GET /api/v1/auth/oidc/authorize(optional advisorytenant_hint/identity_hint) - Work exchange:
POST /api/v1/auth/oidc/exchange - Personal sign-in:
POST /api/v1/auth/personal/login - Personal sign-up:
POST /api/v1/auth/personal/signup - Session renew:
POST /api/v1/auth/token/refresh - Logout:
POST /api/v1/auth/logout
User SSH Keys¶
- List keys:
GET /api/v1/ssh-keys - Add key:
POST /api/v1/ssh-keys - Remove key:
DELETE /api/v1/ssh-keys/{key_id}
Allocations¶
- Create:
POST /api/v1/allocations - List:
GET /api/v1/allocations - Detail:
GET /api/v1/allocations/{allocation_id} - Release:
POST /api/v1/allocations/{allocation_id}/release - Terminal token:
POST /api/v1/allocations/{allocation_id}/terminal-token - Terminal realtime:
WS /ws/terminal/{allocation_id}
Tasks / Async surfaces¶
- Task detail: contract surface required for
/tasks/:task_id - Allocation/event timeline pagination must use backend cursor-based pagination where available; UI must not synthesize fake client-only pages for server-side event streams
Projects¶
- List:
GET /api/v1/projects - Create:
POST /api/v1/projects - Update:
PATCH /api/v1/projects/{project_id} - Delete:
DELETE /api/v1/projects/{project_id} - Set default:
POST /api/v1/projects/{project_id}/set-default
Billing + Payments¶
- Balance:
GET /api/v1/billing/balance - Usage:
GET /api/v1/billing/usagewith optionalallocation_id,from,to,sort - Usage CSV:
GET /api/v1/billing/usage/csvwith optionalallocation_id,from,to,sort - Checkout session:
POST /api/v1/payments/checkout-session - Customer portal:
POST /api/v1/payments/customer-portal-session
Storage¶
GET /api/v1/storage/listGET /api/v1/storage/downloadPUT /api/v1/storage/uploadPOST /api/v1/storage/mkdirPOST /api/v1/storage/renameDELETE /api/v1/storage/delete
Admin¶
- Overview:
GET /api/v1/admin/overview - Ops (pre-coding contract required):
GET /api/v1/admin/ops/overview - Runbook catalog (pre-coding contract required):
GET /api/v1/admin/runbooks,GET /api/v1/admin/runbooks/{runbook_id} - Users list/detail/create/balance/refunds
- Nodes list/create/probe/delete
- Allocations list/force-release
- Audit logs list/export
- Payments sessions:
GET /api/v1/admin/payments/sessions
Admin role boundary:
- /admin/* in current MVP is platform-admin scope (users.role=admin).
- Tenant-admin features must use tenant membership roles (tenant_memberships.role) and tenant-scoped routes/surfaces.
3. State Matrix (Required for Every Screen)¶
loading: skeleton/spinner with timeout fallback message.empty: explicit empty state copy + next action CTA.error: structured error display usingErrorResponse.code/message; include correlation id if present.success: action-confirmation toast/banner.restricted: role/tenant denial state for403.rate_limited:429withRetry-Aftercountdown and retry action.
4. Shared Interaction Contract¶
This section is normative for all list, detail, task, and navigation surfaces. Do not introduce page-specific behavior where a shared rule exists here.
4.1 List and Workbench pages¶
Applies to:
- /workloads
- /infra/allocations
- /admin/allocations
- /admin/users
- /admin/nodes
- /admin/audit-logs
- billing/workload cost tables and any future evidence/config tables
Rules:
- Every list page uses a shared WorkbenchPage pattern:
- PageHeader
- summary cards when relevant
- search/filter toolbar
- segmented status filter
- data table or mobile card fallback
- Workbench controls remain visible even when the current result set is empty.
An empty result is rendered inside the content region, not as a full-page
replacement that hides filters or view controls.
- Sort belongs on column headers, not in ad hoc page-specific menus.
- Filter controls belong in the toolbar above the table, not embedded in
headers.
- Primary row action is visible in-row as a button, usually Open.
- Secondary row actions live in a shared MoreActions menu.
- Dangerous actions in MoreActions are visually separated and require
confirmation.
- Pagination is server-driven where the backend supports cursors. The UI must
preserve sort/filter/search state when paging.
- Empty, loading, restricted, and error states use shared system components.
State-result rules:
- no active items must not prevent the user from switching to All,
Provisioning, Released, Failed, or other relevant statuses.
- Workbench pages that group resources by status (for example workloads) must
still provide a stable way to inspect non-active items and history.
- Mutations that change a resource status must move the row predictably to the
destination status grouping/filter. They must not appear to vanish without the
UI explaining which status/view now contains the row.
4.2 Data table behavior¶
Rules: - Header labels are uppercase, compact, and muted. - Sortable columns show one active sort at a time with direction indicator. - Default sort is newest-first for time-based operational lists unless a screen explicitly documents otherwise. - Status filters use a segmented control with stable labels. Do not rename the same status on different pages. - Search input text and filter selections persist in the URL where practical so deep links and back-navigation restore state. - Dropdown menus must render in a portal and must not be clipped by table scroll containers. - Mobile layout must collapse to cards while preserving the same actions and statuses. - Long identifiers may be visually truncated in the table, but every truncated value must have at least one full-detail path: copy action, tooltip, expanded cell, or detail drilldown.
Evidence/investigation rules:
- Audit and evidence tables are scan-oriented, but scanning alone is not
sufficient. Each row must support a detail view via row click, Open, side
panel, drawer, or detail page.
- The evidence detail view must surface full actor, target, correlation ID,
timestamps, result, and raw metadata payload where allowed.
- Investigation pages should separate compact table columns from full-fidelity
event detail instead of trying to fit all data into the row itself.
4.3 Detail pages¶
Rules:
- All resource detail pages use the shared DetailPage pattern:
- PageHeader
- facts row
- tabs
- two-column body where appropriate
- Resource ID and human-readable name must be copyable.
- Tabs must be ordered consistently for the same resource class.
- The "back" action returns to the previous list state when navigated from a
list, including filters and pagination state.
4.4 Task and async pages¶
Rules:
- Long-running operations use /tasks/:task_id as the first-class status view.
- Transitional resources may link to their active task, but task progress
should not be hidden inside detail pages.
- Timelines are newest-first for event history and use server-backed pagination
for long event streams.
- Progress copy must be explicit about safe close/reopen behavior.
4.5 Navigation and state restoration¶
Rules:
- List-to-detail-to-list navigation must restore the prior list view state.
- Primary navigation labels, ordering, and grouping follow the page-family model
in UX_Redesign_Implementation_Plan.md.
- Launch is a full-page flow without the sidebar and should resume at the last
valid step when re-opened in the same session if draft state exists.
Project context rules:
- Workspace/project selection in the top context bar is not sufficient as the
only project UX.
- Users need a first-class project management surface to create, rename, review,
set default, and delete projects where policy allows.
- Project-switcher actions should deep-link into /access/projects rather than
becoming an overloaded management UI inside the dropdown.
Developer/docs rules: - Developer docs must be a first-class routed product surface, not a placeholder landing page. - API reference pages should render inside the product shell or a shell-aligned embedded frame by default, not as a separate browser-window flow. - External doc destinations may still exist, but they are secondary and should be clearly marked as leaving the product experience. - Downloads, guides, API reference, examples, and app-developer onboarding should feel like one connected developer area.
4.6 Admin experience¶
Rules:
- Admin pages must follow workflow-family grouping, not a flat entity dump.
- Admin landing pages start with summary and action-required context before raw
tables.
- Lifecycle, IAM, config, evidence, and finance pages may each have different
density, but they must still follow the shared table/detail/task contracts.
- The admin-specific workflow model in Admin_Ops_and_User_IA_v1.md is
normative for admin route refactors.
4.7 App shell extensibility¶
Rules: - App-specific UI extends the shared product shell rather than replacing it. - App flows inherit the same page header, toolbar, workbench, detail, task, and navigation contracts as core product pages unless a documented extension point allows a controlled variance. - App-related developer surfaces should connect cleanly to docs, downloads, API references, service accounts, access credentials, and project context. - Placeholder developer/app routes are product gaps and should be tracked as missing product surfaces, not left as acceptable steady-state behavior. - App workload pages must follow the same state/filter/result behavior as core allocation pages. A failed or stopped app should move into a predictable status bucket, not disappear because the current page only renders active groups with no status-switch affordance.
5. Async UX Rules¶
Provisioning lifecycle UI¶
- After create allocation, UI immediately shows
requestedstate row/card. - Poll
GET /api/v1/allocations/{allocation_id}until terminal state or user leaves page. - Show explicit transitions:
requested -> provisioning -> active. - If the backend exposes a task identifier, route the user to
/tasks/:task_idwhile also surfacing the in-flight workload/allocation in workbench views. - On
failed, renderfailure_reasonprominently with retry path. - Card state spec:
requested: spinner + text "Request accepted, waiting for worker".provisioning: spinner + text "Configuring node and SSH access".active: green status chip + full action row enabled.failed: red status chip +failure_reason+ retry CTA.- Long-running hint: if allocation remains
requestedorprovisioningfor 90 seconds, show non-blocking "Taking longer than expected" notice with support hint. - Navigation behavior: provisioning cards must persist across route changes and page reload via server state polling.
Release lifecycle UI¶
- Release action returns accepted async state.
- UI updates to
releasingimmediately and continues polling. - Final states:
releasedorfailed(with reason and admin/support hint). release_failedUX is mandatory:- Card status chip
release_failedwith clear message fromrelease_failed_reason. - Show explicit billing note: "Billing has stopped for this allocation."
- Show retry action (
POST /api/v1/allocations/{allocation_id}/release) and admin escalation guidance.
Terminal UX¶
- UI requests terminal token immediately before opening WS.
- Token is single-use and short-lived; failed connect requires remint.
- UI handles
TerminalControlframes (session_ready,session_error,session_closed) in addition to terminal stream output. - Disconnect states: retry (remint token) and close.
- Allocation card action row layout is fixed:
Metrics/Console/Key/Releaseicon buttons in one row. - Active allocation views must show copy-ready SSH command:
ssh <username_on_node>@<host> -p <port> -i <downloaded_key_path>- Key action should route to user key management (public keys) and SSH command help.
Notification UX¶
- In-app notification payload requires:
id,occurred_at,severity,type,title,message. - Optional
action_urlmaps to deep links (billing page, allocation detail, admin view). - Notification shell requirements:
- top-nav bell icon with unread badge.
- panel lists latest notifications with severity color and timestamp.
- clicking an item follows
action_urlwhen present. - notifications persist in panel until dismissed/expired by retention policy.
Billing + Payments UX¶
- Low balance must be persistent, not toast-only:
- top-level banner on all authenticated pages when below threshold.
- billing page balance card changes state (
ok/warning/depleted). - Stripe return handling must be explicit:
- billing route (or
/billing/return) handles return query params and renders success/cancel/failure status. - on return, refresh balance and usage queries.
- Provision modal must show cost estimate before confirmation:
<node_count> x <gpus_per_node> x <price_per_gpu_per_hour> = <estimated_hourly_total>.
Admin Dashboard UX¶
- Admin overview auto-refresh every 5 seconds by default.
- Show "last updated" relative time and expose pause/resume auto-refresh toggle.
- Admin ops must expose contextual runbook links for degraded states using stable
runbook_idmappings. - Runbook panel design and metadata flow follow
doc/operations/Ops_Runbook_Architecture.md.
6. Accessibility Baseline¶
- Keyboard-only navigation across all interactive controls.
- Modal focus trap and escape behavior.
aria-labelfor icon-only actions.- Error messages announced via accessible live region.
- Contrast meets WCAG AA for body text and actionable controls.
7. Responsive Baseline¶
- Breakpoints: mobile, tablet, desktop.
- Data tables provide mobile-safe condensed/card fallback.
- Critical actions remain reachable without hover interactions.
- Theme behavior follows the redesign token system and product direction; avoid page-specific theme overrides that break cross-page consistency.
8. UX Completion Checklist (Pre-coding Gate)¶
- Screen inventory finalized with ownership.
- Low-fidelity screen mocks completed under
doc/product/ux-mocks/. - Contract traceability complete for all critical actions.
- State matrix implemented in UI component spec.
- Async lifecycle UX documented and reviewed.
- Accessibility baseline accepted.
- Responsive baseline accepted.
9. Shared UX Foundation Packages (Build Before Feature Screens)¶
API and Session Layer¶
packages/web/src/lib/api/client.ts: typed contract client wrapper, auth header handling, correlation-id forwarding.packages/web/src/lib/api/errors.ts: map backendErrorResponse-> UX-safe messages/actions.packages/web/src/lib/session/*: user/session/role state and protected route guard helpers.packages/web/src/lib/api/rateLimit.ts: shared429handling that readsRetry-After, exposes countdown state, and standard retry helpers.
Query and Data Fetching Layer¶
packages/web/src/lib/query/keys.ts: stable query key conventions.packages/web/src/lib/query/defaults.ts: retry/stale/caching defaults by data class.packages/web/src/lib/query/pagination.ts: cursor/page_size helpers.
Shared UI System Layer¶
packages/web/src/components/system/LoadingState.tsxpackages/web/src/components/system/EmptyState.tsxpackages/web/src/components/system/ErrorState.tsxpackages/web/src/components/system/RestrictedState.tsxpackages/web/src/components/system/RateLimitedState.tsxpackages/web/src/components/system/ConfirmActionModal.tsxpackages/web/src/components/system/AlertModal.tsx(severity variants:info/success/warning/error)packages/web/src/components/system/DeletionModal.tsx(destructive confirm with optional reason input)packages/web/src/components/system/CursorTable.tsxpackages/web/src/components/system/NotificationBell.tsxpackages/web/src/components/system/NotificationPanel.tsxpackages/web/src/components/product/PageHeader.tsxpackages/web/src/components/product/WorkbenchPage.tsxpackages/web/src/components/product/DetailPage.tsxpackages/web/src/components/product/DataTable.tsxpackages/web/src/components/product/MoreActions.tsxpackages/web/src/components/product/StatusChip.tsxpackages/web/src/components/product/Timeline.tsx
Accessibility Utilities¶
packages/web/src/components/a11y/FocusTrap.tsxpackages/web/src/components/a11y/LiveRegion.tsxpackages/web/src/components/a11y/useKeyboardShortcut.ts
Styling and Tokens¶
packages/web/src/styles/tokens.css(or equivalent token source)packages/web/src/styles/theme.css
10. Vertical Slice UX Delivery Order (With API)¶
- Auth + Profile slice
- UX: tabbed sign-in (
Work accountvsPersonal account), redirect/exchange/logout, protected layout, expired-session UX. - API: auth endpoints +
GET /users/me. -
Include first-run onboarding redirect when balance is zero and no allocations exist.
-
Control Room + Catalog + Workloads read slice
- UX: dashboard, SKU discovery, operational workload listing/detail.
- API:
GET /skus,GET /nodes, allocation read endpoints. -
Include explicit sold-out SKU card state (
Unavailable, disabled provision CTA). -
Launch + Tasks + Provision/Release + Terminal slice
- UX: stepped launch wizard, provisioning task progression, release progression, terminal connect/reconnect states.
-
API: allocation create/release + terminal token + terminal WS.
-
Billing + Payments slice
- UX: balance/usage, CSV export, checkout/portal redirect outcomes.
-
API: billing + payments endpoints.
-
Admin slice
- UX: users/nodes/allocations/audit admin pages with filters/actions.
-
API: admin endpoints.
-
Storage slice
- UX: file explorer CRUD and safety/confirmation patterns.
- API: storage endpoints.
Delivery rule: - For multi-service screens, implement panel-level degraded states and progressive enablement instead of blocking the whole page.