UX Implementation Spec v0.2 (Coding Baseline)

Purpose:

  • Translate product UX journeys into implementation-ready UI specifications.
  • Enforce API-first UX where every interaction maps to OpenAPI/AsyncAPI contracts.

Inputs:

  • doc/product/PRD.md
  • doc/product/UX_Journeys.md
  • doc/product/UX_Redesign_Implementation_Plan.md
  • doc/product/Admin_Ops_and_User_IA_v1.md
  • doc/product/Brand_Guidelines.md
  • doc/product/ux-mocks/*
  • doc/api/openapi.draft.yaml
  • doc/api/asyncapi.draft.yaml
  • doc/governance/UX_Contract_Gate.md

Normative UX references for the current redesign:

  • doc/product/ux-mocks/product-redesign-v3/index.html
  • doc/product/ux-mocks/product-redesign-v3/README.md
  • doc/product/UX_Redesign_Implementation_Plan.md
  • doc/product/V3_Admin_Workbench_Model_v1.md

Interpretation rule:

  • The route model, page-family taxonomy, shared component vocabulary, and interaction standards in the redesign plan are normative for new pages and major page refactors.
  • The mock HTML is a layout and visual reference, not the behavioral source of truth.
  • Where this spec and the redesign plan disagree, the redesign plan wins.
  • For /v3-prod/platform/* family landings, admin workbenches, focused details, operation drawers, and acknowledgement/suppression behavior, doc/product/V3_Admin_Workbench_Model_v1.md is the governing page-family contract.

Mock-to-production copy boundary:

  • Descriptions in production pages serve the user's task, not the reviewer's understanding of the mock.
  • Remove permanent copy that restates the title or explains design rationale.
  • Put non-obvious context behind an info/help affordance.
  • Put first-time guidance in empty states, decision prose in confirmations, and longer procedures in runbooks or docs.
  • Permanent section subtitles are a warning sign that the page is still talking to design reviewers instead of acting like an operator tool.

Page-growth rule:

  • For any major new page, major page refactor, new top-level/local navigation area, or multi-step workflow, do not start broad implementation until the following are explicit:
      • role or persona using the surface;
      • scope of control (self, project, tenant, node, fleet, platform, or public);
      • primary user intent;
      • page family;
      • shell/local-navigation placement;
      • supporting mock/spec updates where the current design is not already settled.
  • This rule exists to prevent repeated redesign loops as the product grows. If a surface does not fit cleanly into the current IA, update the IA and mock package first instead of implementing another one-off page.

1. Route and Screen Inventory

User-facing

  • /auth/login (tabbed sign-in entry: Work account + Personal account)
  • / (Control Room)
  • /workloads (operational workload list)
  • /workloads/:workload_id (workload detail + actions)
  • /infra/catalog (SKU/catalog browse)
  • /infra/allocations (raw allocations workbench)
  • /infra/allocations/:allocation_id (allocation detail / redirect surface)
  • /launch (full-page launch wizard)
  • /tasks/:task_id (async task/provisioning detail)
  • /billing (balance, usage table, CSV export)
  • /infra/storage (list/upload/download/mkdir/rename/delete)
  • /access/projects (project directory and lifecycle)
  • /access/projects/:project_id (project detail and membership/access context)
  • /access/team
  • /access/ssh-keys
  • /access/api-keys
  • /access/policies
  • /developer/docs (developer docs center)
  • /developer/downloads (downloads and release artifacts)
  • /developer/api-docs/swagger (embedded API reference)
  • /developer/api-docs/redoc (embedded API reference)

Admin-facing

  • /admin/overview
  • /ops/attention
  • /ops/telemetry
  • /admin/users
  • /admin/users/:user_id
  • /admin/nodes
  • /admin/skus
  • /admin/os-images
  • /admin/quotas
  • /admin/maas
  • /admin/allocations
  • /admin/audit-logs
  • /admin/payments/sessions

Admin IA follow-up: doc/product/Admin_Ops_and_User_IA_v1.md

2. Contract Traceability (Critical Actions)

Auth

  • Work authorize start: GET /api/v1/auth/oidc/authorize (optional advisory tenant_hint/identity_hint)
  • Work exchange: POST /api/v1/auth/oidc/exchange
  • Personal sign-in: POST /api/v1/auth/personal/login
  • Personal sign-up: POST /api/v1/auth/personal/signup
  • Session renew: POST /api/v1/auth/token/refresh
  • Logout: POST /api/v1/auth/logout

User SSH Keys

  • List keys: GET /api/v1/ssh-keys
  • Add key: POST /api/v1/ssh-keys
  • Remove key: DELETE /api/v1/ssh-keys/{key_id}

Allocations

  • Create: POST /api/v1/allocations
  • List: GET /api/v1/allocations
  • Detail: GET /api/v1/allocations/{allocation_id}
  • Release: POST /api/v1/allocations/{allocation_id}/release
  • Terminal token: POST /api/v1/allocations/{allocation_id}/terminal-token
  • Terminal realtime: WS /ws/terminal/{allocation_id}

Tasks / Async surfaces

  • Task detail: contract surface required for /tasks/:task_id
  • Allocation/event timeline pagination must use backend cursor-based pagination where available; UI must not synthesize fake client-only pages for server-side event streams

Projects

  • List: GET /api/v1/projects
  • Create: POST /api/v1/projects
  • Update: PATCH /api/v1/projects/{project_id}
  • Delete: DELETE /api/v1/projects/{project_id}
  • Set default: POST /api/v1/projects/{project_id}/set-default

Billing + Payments

  • Balance: GET /api/v1/billing/balance
  • Usage: GET /api/v1/billing/usage with optional allocation_id, from, to, sort
  • Usage CSV: GET /api/v1/billing/usage/csv with optional allocation_id, from, to, sort
  • Checkout session: POST /api/v1/payments/checkout-session
  • Customer portal: POST /api/v1/payments/customer-portal-session

Storage

  • GET /api/v1/storage/list
  • GET /api/v1/storage/download
  • PUT /api/v1/storage/upload
  • POST /api/v1/storage/mkdir
  • POST /api/v1/storage/rename
  • DELETE /api/v1/storage/delete

Admin

  • Overview: GET /api/v1/admin/overview
  • Ops (pre-coding contract required): GET /api/v1/admin/ops/overview
  • Runbook catalog (pre-coding contract required): GET /api/v1/admin/runbooks, GET /api/v1/admin/runbooks/{runbook_id}
  • Users list/detail/create/balance/refunds
  • Nodes list/create/probe/delete
  • Allocations list/force-release
  • Audit logs list/export
  • Payments sessions: GET /api/v1/admin/payments/sessions

Admin role boundary:

  • /admin/* in current MVP is platform-admin scope (users.role=admin).
  • Tenant-admin features must use tenant membership roles (tenant_memberships.role) and tenant-scoped routes/surfaces.
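
The role boundary above can be read as two separate checks that never substitute for each other. The sketch below is illustrative only: the `SessionUser` shape, the function names, and the tenant role value `"admin"` are assumptions. Only `users.role=admin` and `tenant_memberships.role` come from this spec.

```typescript
// Hypothetical session shapes. Only users.role and tenant_memberships.role
// are named in this spec; every other name here is an assumption.
interface TenantMembership {
  tenant_id: string;
  role: string;
}

interface SessionUser {
  role: string; // mirrors users.role
  tenant_memberships: TenantMembership[];
}

// Platform-admin scope: gates /admin/* in the current MVP.
function canAccessPlatformAdmin(user: SessionUser): boolean {
  return user.role === "admin";
}

// Tenant-admin scope: decided by the membership role, never by users.role.
// The "admin" membership value is an assumed role name.
function canAdministerTenant(user: SessionUser, tenantId: string): boolean {
  return user.tenant_memberships.some(
    (m) => m.tenant_id === tenantId && m.role === "admin",
  );
}
```

Keeping the two checks apart means a platform admin does not silently gain tenant-admin rights, and a tenant admin never reaches /admin/*.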

3. State Matrix (Required for Every Screen)

  • loading: skeleton/spinner with timeout fallback message.
  • empty: explicit empty state copy + next action CTA.
  • error: structured error display using ErrorResponse.code/message; include correlation id if present.
  • success: action-confirmation toast/banner.
  • restricted: role/tenant denial state for 403.
  • rate_limited: 429 with Retry-After countdown and retry action.
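
One way to keep every screen honest about the matrix above is to model the six states as a single discriminated union. This is a minimal sketch, not the required implementation: the `ScreenState` type, the `stateFromFailure` helper, and the `correlation_id` field name are assumptions. Only `ErrorResponse.code/message` and the state names come from this spec.

```typescript
// ErrorResponse.code/message are named in this spec; correlation_id is an
// assumed field name for the correlation id.
interface ErrorResponse {
  code: string;
  message: string;
  correlation_id?: string;
}

// One variant per row of the state matrix.
type ScreenState =
  | { kind: "loading" }
  | { kind: "empty" }
  | { kind: "error"; code: string; message: string; correlationId: string | undefined }
  | { kind: "success"; message: string }
  | { kind: "restricted" }
  | { kind: "rate_limited"; retryAfterSeconds: number };

// Map a failed HTTP response onto the matching state.
function stateFromFailure(
  status: number,
  body: ErrorResponse,
  retryAfterSeconds: number = 0,
): ScreenState {
  if (status === 403) return { kind: "restricted" };
  if (status === 429) return { kind: "rate_limited", retryAfterSeconds };
  return {
    kind: "error",
    code: body.code,
    message: body.message,
    correlationId: body.correlation_id,
  };
}
```

A screen that switches on `kind` gets a compile-time reminder whenever a state is left unhandled.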

4. Shared Interaction Contract

This section is normative for all list, detail, task, and navigation surfaces. Do not introduce page-specific behavior where a shared rule exists here.

4.1 List and Workbench pages

Applies to:

  • /workloads
  • /infra/allocations
  • /admin/allocations
  • /admin/users
  • /admin/nodes
  • /admin/audit-logs
  • billing/workload cost tables and any future evidence/config tables

Rules:

  • Every list page uses a shared WorkbenchPage pattern:
      • PageHeader
      • summary cards when relevant
      • search/filter toolbar
      • segmented status filter
      • data table or mobile card fallback
  • Workbench controls remain visible even when the current result set is empty. An empty result is rendered inside the content region, not as a full-page replacement that hides filters or view controls.
  • Sort belongs on column headers, not in ad hoc page-specific menus.
  • Filter controls belong in the toolbar above the table, not embedded in headers.
  • Primary row action is visible in-row as a button, usually Open.
  • Secondary row actions live in a shared MoreActions menu.
  • Dangerous actions in MoreActions are visually separated and require confirmation.
  • Pagination is server-driven where the backend supports cursors. The UI must preserve sort/filter/search state when paging.
  • Empty, loading, restricted, and error states use shared system components.

State-result rules:

  • An empty Active view ("no active items") must not prevent the user from switching to All, Provisioning, Released, Failed, or other relevant statuses.
  • Workbench pages that group resources by status (for example, workloads) must still provide a stable way to inspect non-active items and history.
  • Mutations that change a resource status must move the row predictably to the destination status grouping/filter. The row must not appear to vanish without the UI explaining which status/view now contains it.
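
The last state-result rule can be sketched as a small pure helper that decides whether a mutation needs a "row moved" notice. The filter keys, the `MoveNotice` shape, and the notice wording are assumptions made for illustration.

```typescript
// Status names follow the examples in this spec; the "all" filter key and
// the notice wording are assumptions.
type StatusFilter = "all" | "active" | "provisioning" | "released" | "failed";

interface MoveNotice {
  message: string;
  switchTo: StatusFilter; // filter that now contains the row
}

// Returns a notice when a mutation moves a row out of the current filter,
// or null when the row is still visible where the user is looking.
function noticeForStatusChange(
  resourceName: string,
  currentFilter: StatusFilter,
  newStatus: Exclude<StatusFilter, "all">,
): MoveNotice | null {
  if (currentFilter === "all" || currentFilter === newStatus) return null;
  return {
    message: `${resourceName} moved to ${newStatus}.`,
    switchTo: newStatus,
  };
}
```

For example, releasing a workload while the Active filter is selected would yield a notice whose `switchTo` value lets the toast offer a one-click jump to the Released view.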

4.2 Data table behavior

Rules:

  • Header labels are uppercase, compact, and muted.
  • Sortable columns show one active sort at a time with direction indicator.
  • Default sort is newest-first for time-based operational lists unless a screen explicitly documents otherwise.
  • Status filters use a segmented control with stable labels. Do not rename the same status on different pages.
  • Search input text and filter selections persist in the URL where practical so deep links and back-navigation restore state.
  • Dropdown menus must render in a portal and must not be clipped by table scroll containers.
  • Mobile layout must collapse to cards while preserving the same actions and statuses.
  • Long identifiers may be visually truncated in the table, but every truncated value must have at least one full-detail path: copy action, tooltip, expanded cell, or detail drilldown.

Evidence/investigation rules:

  • Audit and evidence tables are scan-oriented, but scanning alone is not sufficient. Each row must support a detail view via row click, Open, side panel, drawer, or detail page.
  • The evidence detail view must surface full actor, target, correlation ID, timestamps, result, and raw metadata payload where allowed.
  • Investigation pages should separate compact table columns from full-fidelity event detail instead of trying to fit all data into the row itself.
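
The URL-persistence rule in 4.2 (search text and filter selections survive deep links and back-navigation) could be implemented with a pair of encode/decode helpers. This is a sketch under stated assumptions: the query parameter names (`q`, `status`, `sort`), the `ListViewState` shape, and the default values are all hypothetical.

```typescript
interface ListViewState {
  search: string;
  status: string;
  sort: string;
}

// Assumed defaults: "all" status and newest-first sort on created_at.
const DEFAULT_LIST_STATE: ListViewState = {
  search: "",
  status: "all",
  sort: "-created_at",
};

// Serialize only non-default values so shared links stay short.
function encodeListState(state: ListViewState): string {
  const params = new URLSearchParams();
  if (state.search !== DEFAULT_LIST_STATE.search) params.set("q", state.search);
  if (state.status !== DEFAULT_LIST_STATE.status) params.set("status", state.status);
  if (state.sort !== DEFAULT_LIST_STATE.sort) params.set("sort", state.sort);
  return params.toString();
}

// Missing parameters fall back to the defaults, so a bare URL still works.
function decodeListState(query: string): ListViewState {
  const params = new URLSearchParams(query);
  return {
    search: params.get("q") ?? DEFAULT_LIST_STATE.search,
    status: params.get("status") ?? DEFAULT_LIST_STATE.status,
    sort: params.get("sort") ?? DEFAULT_LIST_STATE.sort,
  };
}
```

Because the list view is derived entirely from the query string, returning from a detail page restores the prior list state without any extra client-side store.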

4.3 Detail pages

Rules:

  • All resource detail pages use the shared DetailPage pattern:
      • PageHeader
      • facts row
      • tabs
      • two-column body where appropriate
  • Resource ID and human-readable name must be copyable.
  • Tabs must be ordered consistently for the same resource class.
  • The "back" action returns to the previous list state when navigated from a list, including filters and pagination state.

4.4 Task and async pages

Rules:

  • Long-running operations use /tasks/:task_id as the first-class status view.
  • Transitional resources may link to their active task, but task progress should not be hidden inside detail pages.
  • Timelines are newest-first for event history and use server-backed pagination for long event streams.
  • Progress copy must be explicit about safe close/reopen behavior.

4.5 Navigation and state restoration

Rules:

  • List-to-detail-to-list navigation must restore the prior list view state.
  • Primary navigation labels, ordering, and grouping follow the page-family model in UX_Redesign_Implementation_Plan.md.
  • Launch is a full-page flow without the sidebar and should resume at the last valid step when re-opened in the same session if draft state exists.

Project context rules:

  • Workspace/project selection in the top context bar is not sufficient as the only project UX.
  • Users need a first-class project management surface to create, rename, review, set default, and delete projects where policy allows.
  • Project-switcher actions should deep-link into /access/projects rather than becoming an overloaded management UI inside the dropdown.

Developer/docs rules:

  • Developer docs must be a first-class routed product surface, not a placeholder landing page.
  • API reference pages should render inside the product shell or a shell-aligned embedded frame by default, not as a separate browser-window flow.
  • External doc destinations may still exist, but they are secondary and should be clearly marked as leaving the product experience.
  • Downloads, guides, API reference, examples, and app-developer onboarding should feel like one connected developer area.

4.6 Admin experience

Rules:

  • Admin pages must follow workflow-family grouping, not a flat entity dump.
  • Admin landing pages start with summary and action-required context before raw tables.
  • Lifecycle, IAM, config, evidence, and finance pages may each have different density, but they must still follow the shared table/detail/task contracts.
  • The admin-specific workflow model in Admin_Ops_and_User_IA_v1.md is normative for admin route refactors.

4.7 App shell extensibility

Rules:

  • App-specific UI extends the shared product shell rather than replacing it.
  • App flows inherit the same page header, toolbar, workbench, detail, task, and navigation contracts as core product pages unless a documented extension point allows a controlled variance.
  • App-related developer surfaces should connect cleanly to docs, downloads, API references, service accounts, access credentials, and project context.
  • Placeholder developer/app routes are product gaps and should be tracked as missing product surfaces, not left as acceptable steady-state behavior.
  • App workload pages must follow the same state/filter/result behavior as core allocation pages. A failed or stopped app should move into a predictable status bucket, not disappear because the current page only renders active groups with no status-switch affordance.

5. Async UX Rules

Provisioning lifecycle UI

  • After create allocation, UI immediately shows requested state row/card.
  • Poll GET /api/v1/allocations/{allocation_id} until terminal state or user leaves page.
  • Show explicit transitions: requested -> provisioning -> active.
  • If the backend exposes a task identifier, route the user to /tasks/:task_id while also surfacing the in-flight workload/allocation in workbench views.
  • On failed, render failure_reason prominently with retry path.
  • Card state spec:
      • requested: spinner + text "Request accepted, waiting for worker".
      • provisioning: spinner + text "Configuring node and SSH access".
      • active: green status chip + full action row enabled.
      • failed: red status chip + failure_reason + retry CTA.
  • Long-running hint: if allocation remains requested or provisioning for 90 seconds, show non-blocking "Taking longer than expected" notice with support hint.
  • Navigation behavior: provisioning cards must persist across route changes and page reload via server state polling.
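
The card state spec and the 90-second hint above can be captured in one pure function that maps server state to what the card renders. The sketch below is a hedged illustration: the `CardView` shape and function names are hypothetical, and it assumes the 90 seconds are counted from allocation creation and that provisioning polling stops at `active` or `failed`.

```typescript
type ProvisioningStatus = "requested" | "provisioning" | "active" | "failed";

// Hypothetical view model for an allocation card.
interface CardView {
  chip: "green" | "red" | null;
  spinner: boolean;
  text: string;
  actionsEnabled: boolean;
  showRetry: boolean;
  showSlowNotice: boolean; // "Taking longer than expected"
}

const SLOW_THRESHOLD_SECONDS = 90;

function cardViewFor(
  status: ProvisioningStatus,
  secondsSinceCreate: number, // assumption: measured from allocation creation
  failureReason: string = "",
): CardView {
  const slow = secondsSinceCreate >= SLOW_THRESHOLD_SECONDS;
  switch (status) {
    case "requested":
      return { chip: null, spinner: true, text: "Request accepted, waiting for worker",
               actionsEnabled: false, showRetry: false, showSlowNotice: slow };
    case "provisioning":
      return { chip: null, spinner: true, text: "Configuring node and SSH access",
               actionsEnabled: false, showRetry: false, showSlowNotice: slow };
    case "active":
      return { chip: "green", spinner: false, text: "",
               actionsEnabled: true, showRetry: false, showSlowNotice: false };
    case "failed":
      return { chip: "red", spinner: false, text: failureReason,
               actionsEnabled: false, showRetry: true, showSlowNotice: false };
  }
}

// Assumption: the provisioning poll ends once the allocation is active or failed.
function shouldKeepPolling(status: ProvisioningStatus): boolean {
  return status === "requested" || status === "provisioning";
}
```

Deriving the card purely from polled server state is what lets provisioning cards survive route changes and page reloads.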

Release lifecycle UI

  • Release action returns accepted async state.
  • UI updates to releasing immediately and continues polling.
  • Final states: released or failed (with reason and admin/support hint).
  • release_failed UX is mandatory:
      • Card status chip release_failed with clear message from release_failed_reason.
      • Show explicit billing note: "Billing has stopped for this allocation."
      • Show retry action (POST /api/v1/allocations/{allocation_id}/release) and admin escalation guidance.

Terminal UX

  • UI requests terminal token immediately before opening WS.
  • Token is single-use and short-lived; failed connect requires remint.
  • UI handles TerminalControl frames (session_ready, session_error, session_closed) in addition to terminal stream output.
  • Disconnect states: retry (remint token) and close.
  • Allocation card action row layout is fixed: Metrics / Console / Key / Release icon buttons in one row.
  • Active allocation views must show copy-ready SSH command:
      • ssh <username_on_node>@<host> -p <port> -i <downloaded_key_path>
  • Key action should route to user key management (public keys) and SSH command help.
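
The TerminalControl handling and disconnect states above can be sketched as a small reducer. The frame names come from this spec, but the frame payload shape (a `type` tag plus a `message` on errors), the `TerminalUiState` type, and the helper names are assumptions.

```typescript
// Frame names are from this spec; the payload shape is assumed.
type TerminalControl =
  | { type: "session_ready" }
  | { type: "session_error"; message: string }
  | { type: "session_closed" };

type TerminalUiState =
  | { phase: "connecting" }
  | { phase: "connected" }
  | { phase: "disconnected"; reason: string };

function reduceTerminalState(
  state: TerminalUiState,
  frame: TerminalControl,
): TerminalUiState {
  // A disconnected session stays disconnected until the user retries,
  // because the single-use token behind it cannot be reused.
  if (state.phase === "disconnected") return state;
  switch (frame.type) {
    case "session_ready":
      return { phase: "connected" };
    case "session_error":
      return { phase: "disconnected", reason: frame.message };
    case "session_closed":
      return { phase: "disconnected", reason: "Session closed" };
  }
}

// Retry must mint a new token before reconnecting; close needs nothing.
function requiresTokenRemint(state: TerminalUiState): boolean {
  return state.phase === "disconnected";
}
```

In this sketch the retry action would call the terminal-token endpoint again and then start a new `connecting` state, rather than reopening the old socket.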

Notification UX

  • In-app notification payload requires: id, occurred_at, severity, type, title, message.
  • Optional action_url maps to deep links (billing page, allocation detail, admin view).
  • Notification shell requirements:
      • top-nav bell icon with unread badge.
      • panel lists latest notifications with severity color and timestamp.
      • clicking an item follows action_url when present.
      • notifications persist in panel until dismissed/expired by retention policy.
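
The payload requirements above lend themselves to a runtime type guard at the boundary where notifications enter the UI. This is a minimal sketch: the `AppNotification` name is hypothetical, and it assumes every field is a string (including `occurred_at` as a timestamp string), which this spec does not state.

```typescript
// Field names are from this spec; treating every field as a string is an
// assumption.
interface AppNotification {
  id: string;
  occurred_at: string;
  severity: string;
  type: string;
  title: string;
  message: string;
  action_url?: string;
}

const REQUIRED_NOTIFICATION_FIELDS = [
  "id", "occurred_at", "severity", "type", "title", "message",
] as const;

// Accepts a payload only when all required fields are present and
// action_url, if provided, is a string.
function isAppNotification(value: unknown): value is AppNotification {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const hasRequired = REQUIRED_NOTIFICATION_FIELDS.every(
    (field) => typeof record[field] === "string",
  );
  const actionUrl = record["action_url"];
  return hasRequired && (actionUrl === undefined || typeof actionUrl === "string");
}
```

Payloads that fail the guard can be dropped or logged instead of rendering a half-empty item in the panel.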

Billing + Payments UX

  • Low balance must be persistent, not toast-only:
      • top-level banner on all authenticated pages when below threshold.
      • billing page balance card changes state (ok/warning/depleted).
  • Stripe return handling must be explicit:
      • billing route (or /billing/return) handles return query params and renders success/cancel/failure status.
      • on return, refresh balance and usage queries.
  • Provision modal must show cost estimate before confirmation:
      • <node_count> x <gpus_per_node> x <price_per_gpu_per_hour> = <estimated_hourly_total>.
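
The cost-estimate formula and the balance card states above reduce to two small pure functions. The sketch below is illustrative: function names, the two-decimal formatting, and the "depleted at or below zero" cutoff are assumptions, and currency handling is deliberately left out.

```typescript
// <node_count> x <gpus_per_node> x <price_per_gpu_per_hour>
function estimateHourlyTotal(
  nodeCount: number,
  gpusPerNode: number,
  pricePerGpuPerHour: number,
): number {
  return nodeCount * gpusPerNode * pricePerGpuPerHour;
}

// Renders the estimate line for the provision modal (formatting is assumed).
function formatCostEstimate(
  nodeCount: number,
  gpusPerNode: number,
  pricePerGpuPerHour: number,
): string {
  const total = estimateHourlyTotal(nodeCount, gpusPerNode, pricePerGpuPerHour);
  return `${nodeCount} x ${gpusPerNode} x ${pricePerGpuPerHour.toFixed(2)} = ${total.toFixed(2)} per hour`;
}

type BalanceState = "ok" | "warning" | "depleted";

// Assumption: "depleted" means a balance at or below zero, and "warning"
// means a positive balance below the low-balance threshold.
function balanceStateFor(balance: number, lowThreshold: number): BalanceState {
  if (balance <= 0) return "depleted";
  if (balance < lowThreshold) return "warning";
  return "ok";
}
```

For example, 2 nodes with 8 GPUs each at 2.50 per GPU-hour gives an estimated hourly total of 40.00. A production version would likely compute in integer minor units to avoid floating-point rounding.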

Admin Dashboard UX

  • Admin overview auto-refreshes every 5 seconds by default.
  • Show "last updated" relative time and expose pause/resume auto-refresh toggle.
  • Admin ops must expose contextual runbook links for degraded states using stable runbook_id mappings.
  • Runbook panel design and metadata flow follow doc/operations/Ops_Runbook_Architecture.md.
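
The auto-refresh and pause/resume behavior above could be wrapped in a small controller that the admin overview owns. This is a sketch only: the `AutoRefresh` class and its method names are hypothetical, and the refresh callback stands in for whatever query refetch the page actually uses.

```typescript
// Hypothetical controller for the 5-second admin overview refresh.
class AutoRefresh {
  private timer: ReturnType<typeof setInterval> | null = null;
  private readonly refresh: () => void;
  private readonly intervalMs: number;

  constructor(refresh: () => void, intervalMs: number = 5000) {
    this.refresh = refresh;
    this.intervalMs = intervalMs;
  }

  // True while the periodic refresh is scheduled.
  get running(): boolean {
    return this.timer !== null;
  }

  // Starts (or restarts after a pause) the periodic refresh. Idempotent.
  resume(): void {
    if (this.timer !== null) return;
    this.timer = setInterval(this.refresh, this.intervalMs);
  }

  // Stops the periodic refresh without discarding the last loaded data.
  pause(): void {
    if (this.timer === null) return;
    clearInterval(this.timer);
    this.timer = null;
  }
}
```

The pause/resume toggle would bind to `running`, and the page should call `pause()` on unmount so the timer does not outlive the view.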

6. Accessibility Baseline

  • Keyboard-only navigation across all interactive controls.
  • Modal focus trap and escape behavior.
  • aria-label for icon-only actions.
  • Error messages announced via accessible live region.
  • Contrast meets WCAG AA for body text and actionable controls.

7. Responsive Baseline

  • Breakpoints: mobile, tablet, desktop.
  • Data tables provide mobile-safe condensed/card fallback.
  • Critical actions remain reachable without hover interactions.
  • Theme behavior follows the redesign token system and product direction; avoid page-specific theme overrides that break cross-page consistency.

8. UX Completion Checklist (Pre-coding Gate)

  • Screen inventory finalized with ownership.
  • Low-fidelity screen mocks completed under doc/product/ux-mocks/.
  • Contract traceability complete for all critical actions.
  • State matrix implemented in UI component spec.
  • Async lifecycle UX documented and reviewed.
  • Accessibility baseline accepted.
  • Responsive baseline accepted.

9. Shared UX Foundation Packages (Build Before Feature Screens)

API and Session Layer

  • packages/web/src/lib/api/client.ts: typed contract client wrapper, auth header handling, correlation-id forwarding.
  • packages/web/src/lib/api/errors.ts: map backend ErrorResponse -> UX-safe messages/actions.
  • packages/web/src/lib/session/*: user/session/role state and protected route guard helpers.
  • packages/web/src/lib/api/rateLimit.ts: shared 429 handling that reads Retry-After, exposes countdown state, and standard retry helpers.
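
As one possible starting point for rateLimit.ts, the helper below turns a `Retry-After` header into a countdown in seconds. HTTP allows the header to carry either a number of seconds or an HTTP date, so the sketch handles both. The function name and the null-on-unparseable behavior are assumptions.

```typescript
// Parses a Retry-After header value into whole seconds to wait.
// Returns null when the header is absent or cannot be interpreted.
function parseRetryAfter(
  header: string | null,
  nowMs: number = Date.now(),
): number | null {
  if (header === null) return null;
  const trimmed = header.trim();

  // Form 1: delay in seconds, e.g. "30".
  if (/^\d+$/.test(trimmed)) return Number(trimmed);

  // Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return null;

  // Dates in the past mean the caller may retry immediately.
  return Math.max(0, Math.ceil((dateMs - nowMs) / 1000));
}
```

The returned value can seed the countdown shown by the rate_limited state, with the retry action enabled once it reaches zero.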

Query and Data Fetching Layer

  • packages/web/src/lib/query/keys.ts: stable query key conventions.
  • packages/web/src/lib/query/defaults.ts: retry/stale/caching defaults by data class.
  • packages/web/src/lib/query/pagination.ts: cursor/page_size helpers.
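
For pagination.ts, a hedged sketch of the cursor/page_size helpers might look like the following. It encodes two rules from this spec: paging preserves sort/filter/search state, and the UI only follows cursors the server returns. The `ListQuery` and `Page` shapes and the `next_cursor` field name are assumptions, as is resetting the cursor when criteria change.

```typescript
interface ListQuery {
  search: string;
  status: string;
  sort: string;
  page_size: number;
  cursor: string | null; // null requests the first page
}

interface Page<T> {
  items: T[];
  next_cursor: string | null; // assumed response field name
}

// Builds the request for the next page, keeping every criterion unchanged.
// Returns null when the server reports no further pages.
function nextPageQuery<T>(current: ListQuery, page: Page<T>): ListQuery | null {
  if (page.next_cursor === null) return null;
  return { ...current, cursor: page.next_cursor };
}

// Applies a search/filter/sort change. The cursor is reset because a cursor
// issued for one set of criteria is assumed invalid for another.
function withCriteria(
  current: ListQuery,
  change: Partial<Pick<ListQuery, "search" | "status" | "sort">>,
): ListQuery {
  return { ...current, ...change, cursor: null };
}
```

Since the client never fabricates a cursor, there is no way for it to synthesize client-only pages over a server-side stream.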

Shared UI System Layer

  • packages/web/src/components/system/LoadingState.tsx
  • packages/web/src/components/system/EmptyState.tsx
  • packages/web/src/components/system/ErrorState.tsx
  • packages/web/src/components/system/RestrictedState.tsx
  • packages/web/src/components/system/RateLimitedState.tsx
  • packages/web/src/components/system/ConfirmActionModal.tsx
  • packages/web/src/components/system/AlertModal.tsx (severity variants: info/success/warning/error)
  • packages/web/src/components/system/DeletionModal.tsx (destructive confirm with optional reason input)
  • packages/web/src/components/system/CursorTable.tsx
  • packages/web/src/components/system/NotificationBell.tsx
  • packages/web/src/components/system/NotificationPanel.tsx
  • packages/web/src/components/product/PageHeader.tsx
  • packages/web/src/components/product/WorkbenchPage.tsx
  • packages/web/src/components/product/DetailPage.tsx
  • packages/web/src/components/product/DataTable.tsx
  • packages/web/src/components/product/MoreActions.tsx
  • packages/web/src/components/product/StatusChip.tsx
  • packages/web/src/components/product/Timeline.tsx

Accessibility Utilities

  • packages/web/src/components/a11y/FocusTrap.tsx
  • packages/web/src/components/a11y/LiveRegion.tsx
  • packages/web/src/components/a11y/useKeyboardShortcut.ts

Styling and Tokens

  • packages/web/src/styles/tokens.css (or equivalent token source)
  • packages/web/src/styles/theme.css

10. Vertical Slice UX Delivery Order (With API)

  1. Auth + Profile slice
      • UX: tabbed sign-in (Work account vs Personal account), redirect/exchange/logout, protected layout, expired-session UX.
      • API: auth endpoints + GET /users/me.
      • Include first-run onboarding redirect when balance is zero and no allocations exist.

  2. Control Room + Catalog + Workloads read slice
      • UX: dashboard, SKU discovery, operational workload listing/detail.
      • API: GET /skus, GET /nodes, allocation read endpoints.
      • Include explicit sold-out SKU card state (Unavailable, disabled provision CTA).

  3. Launch + Tasks + Provision/Release + Terminal slice
      • UX: stepped launch wizard, provisioning task progression, release progression, terminal connect/reconnect states.
      • API: allocation create/release + terminal token + terminal WS.

  4. Billing + Payments slice
      • UX: balance/usage, CSV export, checkout/portal redirect outcomes.
      • API: billing + payments endpoints.

  5. Admin slice
      • UX: users/nodes/allocations/audit admin pages with filters/actions.
      • API: admin endpoints.

  6. Storage slice
      • UX: file explorer CRUD and safety/confirmation patterns.
      • API: storage endpoints.

Delivery rule: For multi-service screens, implement panel-level degraded states and progressive enablement instead of blocking the whole page.