Skip to content

User Onboarding and Auth Context Runbook

Trigger Conditions

Use this runbook when any onboarding/auth context symptom appears:

  1. Signup/login succeeds but user is blocked on project-scoped APIs.
  2. API returns invalid_request with project-context-required message.
  3. API returns membership/ownership-denied during first-run onboarding.
  4. User sees partial session state (tenant/project missing in shell).

Impact and Blast Radius

  1. New users cannot provision/use resources.
  2. Existing users may lose access after membership/context drift.
  3. Support load increases due to repeated auth retries without deterministic root cause.

Fast Triage (Correlation-First)

  1. Capture correlation_id and error code from user-visible error envelope.
  2. Confirm request had required project context:
  3. header X-Project-ID for project-owned operations.
  4. Run log lookups by correlation_id:
  5. API: {service="gpuaas-api"} | json | correlation_id="<CORRELATION_ID>"
  6. Auth/gateway paths where relevant.
  7. Resolve user scope records in DB:
  8. users.org_id
  9. active tenant_memberships
  10. active project_memberships
  11. Confirm onboarding bootstrap artifacts exist:
  12. personal/default project (for self-signup path)
  13. user_posix_identities row (runtime identity continuity).

Diagnostic Checklist

  1. Signup path:
  2. POST /api/v1/auth/personal/signup returns 201
  3. response includes org_id and user context.
  4. Admin-created user path:
  5. user has tenant membership + default project membership
  6. user.create audit log exists with same correlation_id.
  7. Project-scoped requests:
  8. missing/invalid X-Project-ID must return deterministic 400 invalid_request.
  9. Membership state:
  10. only active rows (deleted_at is null) are used for authz decisions.

Recovery Actions

  1. Missing personal bootstrap:
  2. re-run controlled bootstrap for affected user (tenant + default project + memberships).
  3. Missing project membership:
  4. insert/restore correct project_memberships row.
  5. Invalid client context propagation:
  6. fix header propagation path and redeploy affected client/backend component.
  7. Re-test with same user journey and confirm resolved via correlation_id.

Required Incident Evidence

  1. User-facing error envelope (code, message, correlation_id).
  2. API log excerpts filtered by correlation_id.
  3. Membership state snapshot before/after mitigation.
  4. Audit evidence for any privileged corrective mutation.

Owning Teams

  1. Primary: Platform/API
  2. Secondary: Auth/Identity UX