UX Design Principles v1

Purpose:
- Define the cross-cutting UX principles that all new user and admin surfaces must satisfy.
- Provide an evaluation lens that is simpler and more durable than route-by-route gap lists.
- Keep future screen work aligned with the actual product posture of a GPU cloud and platform operations console.

Inputs:
- doc/product/PRD.md
- doc/product/UX_Journeys.md
- doc/product/UX_Implementation_Spec.md
- doc/product/Admin_Ops_and_User_IA_v1.md

1. Decision-First UX

Before a user or operator commits to a costly, slow, or destructive action, the UI must show:
- what will happen
- what it will cost
- how long it is likely to take
- what risk is involved
- how recovery works if it fails

This applies to:
- provisioning GPUs
- releasing allocations
- decommissioning or reimaging nodes
- billing deposits and depletion warnings
- manual remediation and bootstrap flows

Decision-first UX means:
- do not ask users to commit before they understand the implication
- do not hide cost or depletion impact behind a later screen
- do not bury recovery semantics in timeline details or documentation

Required examples:
- provisioning confirmation should show hourly burn and balance-to-runtime context
- low-balance warnings should show projected time-to-depletion when available
- destructive infra actions should explain the next valid recovery path
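The hourly burn and balance-to-runtime context for a provisioning confirmation can be derived with simple arithmetic. The following is a minimal sketch, not the product's billing logic: the `ProvisionQuote` and `ConfirmationSummary` shapes and their field names are hypothetical, and it assumes a flat per-GPU hourly rate with no other active allocations drawing on the same balance.

```typescript
// Hypothetical input shape; field names are illustrative, not taken from the product API.
interface ProvisionQuote {
  gpuCount: number;
  hourlyRatePerGpu: number; // currency units per GPU-hour (assumed flat rate)
  balance: number; // current prepaid balance, same currency
}

// What the confirmation step would surface before the user commits.
interface ConfirmationSummary {
  hourlyBurn: number;
  projectedRuntimeHours: number | null; // null when no projection is possible
}

function summarizeProvisioning(q: ProvisionQuote): ConfirmationSummary {
  const hourlyBurn = q.gpuCount * q.hourlyRatePerGpu;
  // Without a positive burn rate there is nothing to project, so report null
  // rather than an infinite or misleading runtime.
  const projectedRuntimeHours =
    hourlyBurn > 0 ? Math.max(q.balance, 0) / hourlyBurn : null;
  return { hourlyBurn, projectedRuntimeHours };
}

// Example: 4 GPUs at 2.5 per GPU-hour against a balance of 120.
const summary = summarizeProvisioning({
  gpuCount: 4,
  hourlyRatePerGpu: 2.5,
  balance: 120,
});
// summary.hourlyBurn is 10 and summary.projectedRuntimeHours is 12
```

Returning `null` instead of a number when the runtime cannot be projected keeps the "when available" qualifier explicit in the type, so the UI has to decide what to show in that case.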

2. Operational Truth Over Generic UI

The UI must reflect real platform state, not optimistic shorthand.

Users and operators need to know:
- current truth
- current attempt
- current risk
- next valid action
- proof/source of truth

This means:
- distinguish current attempt from prior attempts
- distinguish accepted action from completed action
- distinguish websocket connected from terminal ready
- distinguish billing stopped from release complete
- distinguish UI convenience labels from backend workflow truth

Operational truth requires:
- explicit status semantics
- failure reasons that identify the owning layer
- current-state summaries that do not blur historical attempts into active state
- audit/troubleshooting affordances that preserve correlation identifiers and traceability

Anti-patterns:
- fake reversibility for infrastructure actions
- optimistic "connected" or "completed" states without proof
- status badges without next action or source-of-truth context
- hiding real failure states behind generic retry copy
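One way to keep these distinctions from collapsing into a single badge is to model them as separate states in the UI layer. The sketch below is illustrative only: the `ReleaseState` type, its phase names, and the `owningLayer` and `correlationId` fields are hypothetical and do not describe the backend's actual workflow states.

```typescript
// Hypothetical state model for releasing an allocation.
// Phase names are illustrative, not the backend's workflow states.
type ReleaseState =
  | { phase: "accepted"; attempt: number; correlationId: string }
  | { phase: "billing_stopped"; attempt: number; correlationId: string }
  | { phase: "completed"; attempt: number; correlationId: string }
  | {
      phase: "failed";
      attempt: number;
      correlationId: string;
      owningLayer: string; // which layer owns the failure
      reason: string;
    };

// Each phase gets its own copy, so "accepted" can never render as "complete".
function describeRelease(s: ReleaseState): string {
  switch (s.phase) {
    case "accepted":
      return `Release requested (attempt ${s.attempt}); not yet complete`;
    case "billing_stopped":
      return `Billing stopped (attempt ${s.attempt}); release still in progress`;
    case "completed":
      return `Release complete (attempt ${s.attempt})`;
    case "failed":
      return `Release failed in ${s.owningLayer} (attempt ${s.attempt}): ${s.reason}`;
  }
}
```

Because the union is exhaustive, adding a new phase forces every consumer of `describeRelease`-style functions to decide how to present it instead of falling back to a generic label.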

3. One Workflow Per Intent

Primary product flows should optimize for user or operator intent, not for backend entity boundaries.

Examples of intent:
- get my first GPU
- provision more capacity
- connect to my active allocation
- recover a broken node
- investigate a platform incident
- onboard a new MAAS machine

The user should not need to mentally stitch together multiple entity pages just to complete one core intent.

This does not require every intent to have its own route. Intent can be expressed by:
- contextual guidance on existing pages
- action-focused panels or steppers
- intent-specific summaries on top of entity detail
- deep links that preserve the current task

Preferred approach:
- make existing routes smarter about current context before creating parallel surfaces
- use dedicated intent-specific pages only when the flow cannot be expressed cleanly inside the current information architecture

User intent and admin/operator intent must be explicitly separated in navigation and page framing.

A platform admin may act in two distinct modes:
- using the product as a user
- operating the platform as an administrator

Those modes:
- have different priorities
- have different risk profiles
- should not depend on the same navigation assumptions

Mode-aware navigation can be expressed by:
- explicit Admin sections with clear framing
- visible admin context or mode indicators
- admin pages that optimize for triage and action routing, not end-user task completion

This principle exists to prevent:
- sidebar trees that mix unrelated user and operator workflows
- admin dashboards that are informative but not actionable
- context-switch confusion when the same person uses both roles

5. Cost and Time Anxiety Must Be Reduced

GPU cloud UX must reduce uncertainty about:
- cost
- remaining runtime
- time to completion
- time to forced release or disruption

This is more important than generic polish.

The product should not rely on:
- hidden burn rates
- generic low-balance banners
- opaque provisioning waits
- unexplained delays in infrastructure workflows

Instead, the UI should prefer:
- cost estimates before action
- projected runtime from current balance when available
- "taking longer than expected" messaging with recovery guidance
- visible state transitions for long-running workflows
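The "taking longer than expected" case can be made an explicit state rather than an indefinite spinner. This is a rough sketch under stated assumptions: `WaitContext`, `WaitMessage`, and `waitMessage` are hypothetical names, and the expected duration is assumed to come from the backend or from historical data, neither of which this document specifies.

```typescript
// Hypothetical inputs for a long-running workflow such as provisioning.
interface WaitContext {
  elapsedSeconds: number;
  expectedSeconds: number | null; // null when no estimate is available
}

// Three distinct messages the UI can render; none of them is a bare spinner.
type WaitMessage =
  | { kind: "no_estimate" }
  | { kind: "in_progress"; remainingSeconds: number }
  | { kind: "longer_than_expected"; overrunSeconds: number };

function waitMessage(w: WaitContext): WaitMessage {
  if (w.expectedSeconds === null) {
    // Be honest that no estimate exists instead of inventing one.
    return { kind: "no_estimate" };
  }
  if (w.elapsedSeconds <= w.expectedSeconds) {
    return {
      kind: "in_progress",
      remainingSeconds: w.expectedSeconds - w.elapsedSeconds,
    };
  }
  // Past the estimate: the UI should pair this state with recovery guidance.
  return {
    kind: "longer_than_expected",
    overrunSeconds: w.elapsedSeconds - w.expectedSeconds,
  };
}

// Example: 7 minutes elapsed against a 5-minute estimate.
const late = waitMessage({ elapsedSeconds: 420, expectedSeconds: 300 });
```

Keeping the overrun as data, rather than only as copy, lets the same state drive both the message and any recovery affordance shown alongside it.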

For this product, "paywall" is not the core problem. Lack of context before payment is the problem.

6. Admin Ops Is a Router, Not a Destination

Admin ops surfaces should route operators to the next action.

Admin pages should:
- prioritize degraded or blocked states
- rank signals by severity
- deep-link to filtered drilldown views
- preserve recent incident context

They should not behave as:
- static metric walls
- disconnected summaries with no next step
- dashboards that require separate mental correlation just to begin investigation

Required operator outcomes:
- see what matters first
- understand why it matters
- reach the owning detail surface in one click
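Ranking by severity and deep-linking to a filtered drilldown can be sketched as two small functions. Everything named here is an assumption made for illustration: the `Severity` levels, the `OpsSignal` shape, and the `/admin/nodes` route are hypothetical and not taken from the product.

```typescript
// Hypothetical severity levels, most urgent first.
type Severity = "critical" | "degraded" | "info";

// Hypothetical signal shape for an admin triage surface.
interface OpsSignal {
  id: string;
  severity: Severity;
  summary: string; // why it matters
  detailPath: string; // owning detail surface (illustrative route)
  filters: Record<string, string>; // filters to pre-apply on the drilldown
}

const severityRank: Record<Severity, number> = {
  critical: 0,
  degraded: 1,
  info: 2,
};

// Returns a new array sorted most-severe first; the input is not mutated.
function rankSignals(signals: OpsSignal[]): OpsSignal[] {
  return signals
    .slice()
    .sort((a, b) => severityRank[a.severity] - severityRank[b.severity]);
}

// Builds a one-click link to the owning detail view with filters applied.
function drilldownLink(s: OpsSignal): string {
  const query = Object.keys(s.filters)
    .map((k) => `${encodeURIComponent(k)}=${encodeURIComponent(s.filters[k] ?? "")}`)
    .join("&");
  return query ? `${s.detailPath}?${query}` : s.detailPath;
}

// Example: a degraded node pool signal links straight to the filtered node list.
const link = drilldownLink({
  id: "sig-1",
  severity: "degraded",
  summary: "3 nodes degraded in pool a100",
  detailPath: "/admin/nodes",
  filters: { state: "degraded", pool: "a100" },
});
// link is "/admin/nodes?state=degraded&pool=a100"
```

Carrying the filters on the signal itself means the dashboard never has to ask the operator to re-create the context that made the signal interesting.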

7. Manual and Assisted Recovery Must Reuse Trusted Paths

Operator-assisted flows should reuse trusted delivery and verification paths instead of inventing side channels.

Examples:
- brokered bootstrap path should serve both first-time manual onboarding and manual remediation
- recovery should prefer existing trust and artifact paths before introducing new transport mechanisms

This principle prevents:
- fragmented recovery UX
- multiple operator stories for the same underlying install/update path
- security posture drift driven by convenience fixes

8. Anti-Patterns

Do not ship UX that depends on:
- fake undo for irreversible infrastructure operations
- placeholder navigation without maturity labeling
- "just add funds" onboarding without cost/risk explanation
- entity-first flows that leave primary intents split across multiple pages
- success states that are accepted-but-not-complete without making that distinction visible

9. Review Questions

Every meaningful screen or flow change should be reviewed against these questions:

- Does the user understand cost, time, and recovery before committing?
- Does the screen show current operational truth, not just a generic status label?
- Does the flow support a real user/operator intent without cross-page stitching?
- Is the navigation framing appropriate for user mode vs admin mode?
- Does this reduce anxiety or increase it?
- Does this reuse an existing trusted path, or is it inventing a new one for convenience?

10. How To Use This Doc

Use this document as a review lens for:
- new screens
- major page rewrites
- admin triage surfaces
- provisioning/billing/recovery flows

Reference this document from:
- doc/product/UX_Implementation_Spec.md
- future UX mocks and implementation specs

Treat these principles as constraints, not inspiration-only guidance.