User Onboarding and Auth Context Runbook
Trigger Conditions
Use this runbook when any onboarding/auth context symptom appears:
- Signup/login succeeds but user is blocked on project-scoped APIs.
- API returns `invalid_request` with a project-context-required message.
- API returns membership/ownership-denied during first-run onboarding.
- User sees partial session state (`tenant/project` missing in shell).
Impact and Blast Radius
- New users cannot provision/use resources.
- Existing users may lose access after membership/context drift.
- Support load increases due to repeated auth retries without deterministic root cause.
Fast Triage (Correlation-First)
- Capture `correlation_id` and error `code` from the user-visible error envelope.
- Confirm the request had the required project context:
  - header `X-Project-ID` for project-owned operations.
- Run log lookups by `correlation_id`:
  - API: `{service="gpuaas-api"} | json | correlation_id="<CORRELATION_ID>"`
  - Auth/gateway paths where relevant.
- Resolve user scope records in DB:
  - `users.org_id`
  - active `tenant_memberships`
  - active `project_memberships`
- Confirm onboarding bootstrap artifacts exist:
  - personal/default project (for self-signup path)
  - `user_posix_identities` row (runtime identity continuity).
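The first and third triage steps can be scripted so the log query is built the same way every time. The sketch below is a minimal illustration, not the team's tooling: it assumes the error envelope is a flat JSON object carrying `code`, `message`, and `correlation_id`, and the function name `triage_query` is hypothetical.

```python
import json
from typing import Tuple


def triage_query(envelope_json: str, service: str = "gpuaas-api") -> Tuple[str, str]:
    """Return (error_code, logql_query) for a user-visible error envelope.

    Assumption: the envelope is a flat JSON object with `code` and
    `correlation_id` keys. Adjust the lookups if the real shape differs.
    """
    envelope = json.loads(envelope_json)
    code = envelope["code"]
    correlation_id = envelope["correlation_id"]
    # Escape backslashes and double quotes so the ID cannot break out of
    # the quoted LogQL string literal.
    safe_id = correlation_id.replace("\\", "\\\\").replace('"', '\\"')
    query = f'{{service="{service}"}} | json | correlation_id="{safe_id}"'
    return code, query
```

Pass the raw envelope copied from the client and paste the returned query into the log explorer. The same `correlation_id` can then be reused for the auth/gateway lookups.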
Diagnostic Checklist
- Signup path:
  - `POST /api/v1/auth/personal/signup` returns `201`
  - response includes `org_id` and user context.
- Admin-created user path:
  - user has tenant membership + default project membership
  - `user.create` audit log exists with same `correlation_id`.
- Project-scoped requests:
  - missing/invalid `X-Project-ID` must return deterministic `400 invalid_request`.
- Membership state:
  - only active rows (`deleted_at is null`) are used for authz decisions.
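The last two checklist items describe behaviour that is easy to model when reproducing a report locally. The sketch below illustrates the expected decision order under stated assumptions. The `ProjectMembership` type, the function name, and the `403`/`membership_denied` result are hypothetical stand-ins. Only `400 invalid_request` and the `deleted_at is null` rule come from this runbook. "Invalid" is simplified here to mean an empty or whitespace-only header value.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Mapping, Optional, Tuple


@dataclass(frozen=True)
class ProjectMembership:
    user_id: str
    project_id: str
    deleted_at: Optional[datetime] = None  # None means the row is active


def check_project_context(
    headers: Mapping[str, str],
    user_id: str,
    memberships: Iterable[ProjectMembership],
) -> Tuple[int, str]:
    """Model the expected outcome of a project-scoped request."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    project_id = normalized.get("x-project-id", "").strip()
    if not project_id:
        # Missing or blank context must fail the same way every time.
        return 400, "invalid_request"
    for m in memberships:
        # Soft-deleted rows (deleted_at set) never grant access.
        if m.user_id == user_id and m.project_id == project_id and m.deleted_at is None:
            return 200, "ok"
    # Hypothetical status/code for the membership-denied case.
    return 403, "membership_denied"
```

If production returns a different result for the same inputs, either the header is being dropped on the way in or the membership rows have drifted from what the snapshot shows.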
Recovery Actions
- Missing personal bootstrap:
  - re-run controlled bootstrap for affected user (tenant + default project + memberships).
- Missing project membership:
  - insert/restore correct `project_memberships` row.
- Invalid client context propagation:
  - fix header propagation path and redeploy affected client/backend component.
- Re-test with the same user journey and confirm resolution via `correlation_id`.
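For the missing-membership case, the corrective write should be idempotent so that re-running it during an incident is safe. The sketch below shows one way to structure that. It uses Python's built-in `sqlite3` purely as a stand-in database, and it assumes a `project_memberships` table with `user_id`, `project_id`, and `deleted_at` columns and at most one row per user/project pair. The real schema and engine may differ.

```python
import sqlite3


def restore_project_membership(conn: sqlite3.Connection, user_id: str, project_id: str) -> str:
    """Ensure an active membership row exists; return the action taken.

    Returns "inserted", "restored", or "unchanged" so the result can be
    recorded as incident evidence.
    """
    with conn:  # single transaction: commits on success, rolls back on error
        row = conn.execute(
            "SELECT deleted_at FROM project_memberships "
            "WHERE user_id = ? AND project_id = ?",
            (user_id, project_id),
        ).fetchone()
        if row is None:
            conn.execute(
                "INSERT INTO project_memberships (user_id, project_id, deleted_at) "
                "VALUES (?, ?, NULL)",
                (user_id, project_id),
            )
            return "inserted"
        if row[0] is not None:
            # Row was soft-deleted: reactivate it instead of duplicating it.
            conn.execute(
                "UPDATE project_memberships SET deleted_at = NULL "
                "WHERE user_id = ? AND project_id = ?",
                (user_id, project_id),
            )
            return "restored"
        return "unchanged"
```

Because this is a privileged mutation, capture the returned action together with the before/after membership snapshot.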
Required Incident Evidence
- User-facing error envelope (`code`, `message`, `correlation_id`).
- API log excerpts filtered by `correlation_id`.
- Membership state snapshot before/after mitigation.
- Audit evidence for any privileged corrective mutation.
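The four evidence items can be collected into one record so nothing is missed when the incident is closed. The sketch below is illustrative only: the function name, the output keys, and the representation of a snapshot as `(user_id, project_id)` pairs of active memberships are all assumptions, not an established format.

```python
from typing import Dict, Iterable, List, Mapping, Tuple

Pair = Tuple[str, str]  # (user_id, project_id) with an active membership


def build_evidence(
    envelope: Mapping[str, str],
    log_excerpts: Iterable[str],
    before: Iterable[Pair],
    after: Iterable[Pair],
    audit_refs: Iterable[str],
) -> Dict[str, object]:
    """Assemble the required incident evidence into one dictionary."""
    before_set, after_set = set(before), set(after)
    membership_diff: Dict[str, List[Pair]] = {
        "added": sorted(after_set - before_set),
        "removed": sorted(before_set - after_set),
    }
    return {
        # Keep only the three envelope fields the runbook asks for.
        "error_envelope": {k: envelope.get(k) for k in ("code", "message", "correlation_id")},
        "log_excerpts": list(log_excerpts),
        "membership_diff": membership_diff,
        "audit_refs": list(audit_refs),
    }
```

The result serializes with `json.dumps` for attachment to the incident ticket.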
Owning Teams
- Primary: Platform/API
- Secondary: Auth/Identity UX