Runbook: App Catalog Browse and Entitlement Incident
Trigger
- User reports App Catalog page failing to load.
- UI shows a generic error with a correlation_id.
- App cards render but "Deploy" is disabled, and support needs an operator response.
Scope
- Page: /apps/catalog (browse/filter only).
- This runbook does not cover app instance deployment runtime.
- Deploy action is intentionally disabled ("coming soon") in Phase 1.
- Capture user-facing error envelope:
  - code
  - message
  - correlation_id
- Confirm request context:
  - authenticated session exists
  - X-Project-ID present (if missing, route to tenant/project authz runbook)
- Reproduce with dev-user and same project context.
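The capture and context checks above can be sketched as a small helper that reports what is still missing before deeper investigation. This is a minimal illustration only: the function name `triage_gaps` and its argument shapes are hypothetical, and only the envelope field names and the X-Project-ID header come from this runbook.

```python
from typing import List, Mapping

# Envelope fields this runbook asks the operator to capture.
REQUIRED_ENVELOPE_FIELDS = ("code", "message", "correlation_id")


def triage_gaps(
    envelope: Mapping[str, str],
    headers: Mapping[str, str],
    has_session: bool,
) -> List[str]:
    """Return human-readable gaps; an empty list means triage data is complete.

    Hypothetical helper: argument shapes are assumptions, not the API's contract.
    """
    gaps = [
        f"envelope missing '{field}'"
        for field in REQUIRED_ENVELOPE_FIELDS
        if not envelope.get(field)
    ]
    if not has_session:
        gaps.append("no authenticated session")
    # HTTP header names are case-insensitive, so compare in lower case.
    project_id = next(
        (value for name, value in headers.items() if name.lower() == "x-project-id"),
        "",
    )
    if not project_id.strip():
        gaps.append("X-Project-ID missing: route to tenant/project authz runbook")
    return gaps
```

An empty result suggests the report is ready for the correlation-based log query; any returned entry names the piece of evidence to collect first.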
Correlation-First Query Flow
- API logs:
{service="gpuaas-api"} | json | correlation_id="<CORRELATION_ID>"
- Filter to app catalog endpoints:
GET /api/v1/apps/catalog
GET /api/v1/projects/{project_id}/apps/entitlements
- If a mutation path is involved (entitlement toggle, admin publish/deprecate), pivot to audit evidence and actor scope.
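The query flow above can be sketched as two helpers: one that fills the correlation ID into the log query shown earlier, and one that decides whether a logged request belongs to the two catalog endpoints. The helper names are hypothetical, and the assumption that each log record exposes an HTTP method and a path is not confirmed by this runbook.

```python
import re

# The two read endpoints listed in this runbook. {project_id} is matched as a
# single path segment; that segment format is an assumption.
CATALOG_ROUTES = (
    ("GET", re.compile(r"^/api/v1/apps/catalog/?$")),
    ("GET", re.compile(r"^/api/v1/projects/[^/]+/apps/entitlements/?$")),
)


def build_logql(correlation_id: str) -> str:
    """Fill the correlation ID into the API log query from this runbook."""
    # Escape backslashes and double quotes so the ID cannot break the string literal.
    escaped = correlation_id.replace("\\", "\\\\").replace('"', '\\"')
    return '{service="gpuaas-api"} | json | correlation_id="' + escaped + '"'


def is_catalog_request(method: str, path: str) -> bool:
    """Return True when the request targets one of the app catalog endpoints."""
    path_only = path.split("?", 1)[0]  # ignore any query string (filter params)
    return any(
        method.upper() == route_method and pattern.match(path_only) is not None
        for route_method, pattern in CATALOG_ROUTES
    )
```

Requests that share the correlation ID but do not match either route are likely mutation or unrelated calls and are candidates for the audit pivot described above.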
Expected Error Classes
- invalid_request: malformed filter params or missing required context.
- insufficient_permissions: actor lacks admin/project role for entitlement mutation paths.
- service_unavailable / upstream_error: dependency failure (db/cache/internal service).
- internal_error: unexpected server-side failure.
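As a rough aid, the error classes above can be kept in a lookup table so that a code from the error envelope is annotated consistently in tickets. The descriptions are copied from this runbook. The split into request-side and server-side is an interpretation added for illustration, and `describe_error` is a hypothetical name.

```python
from typing import Dict, Tuple

# code -> (description from this runbook, True if the failure is server-side).
# The server-side flag is an interpretive grouping, not part of the API contract.
ERROR_CLASSES: Dict[str, Tuple[str, bool]] = {
    "invalid_request": ("malformed filter params or missing required context", False),
    "insufficient_permissions": (
        "actor lacks admin/project role for entitlement mutation paths",
        False,
    ),
    "service_unavailable": ("dependency failure (db/cache/internal service)", True),
    "upstream_error": ("dependency failure (db/cache/internal service)", True),
    "internal_error": ("unexpected server-side failure", True),
}


def describe_error(code: str) -> str:
    """Return a one-line annotation for an error code from the envelope."""
    entry = ERROR_CLASSES.get(code.strip().lower())
    if entry is None:
        # Codes outside the documented list are flagged rather than guessed at.
        return f"{code}: not a documented class; capture the full envelope"
    description, server_side = entry
    side = "server-side" if server_side else "request-side"
    return f"{code}: {description} ({side})"
```

A code that falls through to the undocumented branch is worth noting in the handoff, since it may indicate a class this runbook does not yet cover.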
Operator Messaging Baseline
- For deploy CTA:
  - respond with: "Deploy is coming soon; catalog browse/filter and entitlement controls are active."
  - do not classify as outage when button is disabled by design.
- For true failures:
  - include correlation_id in incident ticket and handoff notes.
  - include affected project/tenant context.
Escalation
- Authz/context errors:
doc/operations/runbooks/Tenant_Project_Authorization_Runbook.md
- API failure spike:
doc/operations/runbooks/API_Degradation_Runbook.md
- IAM entitlement assignment issues:
doc/operations/runbooks/IAM_Role_Assignment_and_Membership_Incident_Runbook.md