Audit & compliance

Implemented

Source: packages/services/admin/ · doc/architecture/Audit_Presentation_Model_v1.md · scripts/ci/audit_*.sh · doc/architecture/Encryption_Envelope_Spec.md · doc/architecture/Partitioning_and_Retention_Strategy.md

GPUaaS records an immutable audit row for every privileged mutation, with allowlisted metadata and monthly partitioning. This page covers what is audited, how immutability is enforced across multiple layers, the metadata allowlist, the API surface, retention, and the compliance posture.

What gets audited

mindmap
  root((Audited<br/>actions))
    User mutations
      user.create
      user.balance.adjust
      user.disable
      user.delete
    Allocation
      allocation.create
      allocation.release
      allocation.force_release
      allocation.restart
    Node
      node.create
      node.delete
      node.drain
      node.slot.approve
      node.slot.disable
    Payments
      refund.create
      payment.reconcile
      payment.session.cancel
    Auth
      service_account.create
      service_account.token.revoke
      session.revoke
    Policy
      policy.update
      policy.bound_violation
    Storage
      storage.namespace.create
      storage.object.delete (privileged path)
    Privilege denials
      authz.deny
      rate_limit.block

Every action above writes a row to audit_logs in the same transaction as the domain change.

Row shape

erDiagram
    audit_logs {
        uuid id PK
        uuid actor_user_id "claim sub or service_account.id"
        text actor_role "user|admin|service_account|system"
        text action "allocation.force_release"
        text target_type "allocation|user|node|policy|..."
        text target_id "canonical id"
        text result "success|failure"
        text correlation_id "uuid"
        jsonb metadata "allowlisted keys only"
        timestamp created_at
        text actor_ip "may be null for system actions"
        text user_agent
    }
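
As a rough illustration of the row shape above, the columns can be mirrored in a small in-memory type. This is a sketch only: the class name `AuditRow`, the use of strings for UUIDs, and the validation in `__post_init__` are assumptions for illustration, not the service's actual model.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional

# Enumerated values taken from the column comments in the diagram.
ACTOR_ROLES = {"user", "admin", "service_account", "system"}
RESULTS = {"success", "failure"}


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after construction
class AuditRow:
    """Hypothetical in-memory mirror of one audit_logs row."""

    actor_user_id: str
    actor_role: str
    action: str
    target_type: str
    target_id: str
    result: str
    correlation_id: str
    metadata: dict[str, Any] = field(default_factory=dict)
    actor_ip: Optional[str] = None  # may be null for system actions
    user_agent: Optional[str] = None
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self) -> None:
        # Reject values outside the documented enumerations.
        if self.actor_role not in ACTOR_ROLES:
            raise ValueError(f"unknown actor_role: {self.actor_role!r}")
        if self.result not in RESULTS:
            raise ValueError(f"unknown result: {self.result!r}")
```

There is deliberately no `updated_at` or `deleted_at` field, matching the table.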

Hard table properties: there is no updated_at or deleted_at column, and role grants permit only SELECT and INSERT:

-- per-service Postgres role grants:
-- GPUaaS audit table:
--   SELECT, INSERT  → svc_admin
--   No UPDATE, no DELETE for any role

Immutability — multi-layer

flowchart LR
    classDef block fill:#f8d7da,stroke:#42101e
    classDef ok fill:#d1e7dd,stroke:#0a3622

    UPDATE[UPDATE audit_logs] --> L1{Application<br/>code review}
    L1 -- caught --> B1[PR blocked]:::ok
    L1 -- missed --> L2{"ORM layer<br/>repo functions only<br/>expose Insert"}
    L2 -- caught --> B2[Compile/runtime error]:::ok
    L2 -- missed --> L3{DB grants}
    L3 -- caught --> B3[Postgres deny]:::ok
    L3 -- missed --> L4{Replica/<br/>partitioned table is read-only<br/>for older partitions}
    L4 --> B4[Old partitions read-only<br/>via permissions]:::ok

    DELETE[DELETE audit_logs] -.same layers.-> L1

Four layers (any one of which would block the attempt):

  1. Code review + CI gates (scripts/ci/audit_mandatory_guard.sh, audit_presence_guard.sh)
  2. ORM/repo layer — only Insert is exposed
  3. Postgres grants — no UPDATE / DELETE for any application role
  4. Partition table permissions — historical partitions are read-only
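
Layers 2 and 3 can be sketched together. The example below is an assumption-heavy stand-in: `AuditRepo`, `open_demo_db`, and `mutation_blocked` are hypothetical names, and because SQLite has no role grants, two triggers play the part of the Postgres deny. It is meant to show the idea of an insert-only repository backed by a database that refuses mutation, not the project's actual code.

```python
import sqlite3


class AuditRepo:
    """Hypothetical repository: insert() is the only public operation."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def insert(self, row_id: str, action: str, result: str) -> None:
        self._conn.execute(
            "INSERT INTO audit_logs (id, action, result) VALUES (?, ?, ?)",
            (row_id, action, result),
        )


def open_demo_db() -> sqlite3.Connection:
    # The triggers emulate "no UPDATE / DELETE for any role" in SQLite.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE audit_logs (
            id TEXT PRIMARY KEY,
            action TEXT NOT NULL,
            result TEXT NOT NULL
        );
        CREATE TRIGGER audit_no_update BEFORE UPDATE ON audit_logs
        BEGIN SELECT RAISE(ABORT, 'audit_logs is append-only'); END;
        CREATE TRIGGER audit_no_delete BEFORE DELETE ON audit_logs
        BEGIN SELECT RAISE(ABORT, 'audit_logs is append-only'); END;
        """
    )
    return conn


def mutation_blocked(conn: sqlite3.Connection, sql: str) -> bool:
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.DatabaseError:
        return True
    return False
```

Even if a caller bypasses the repository and issues raw SQL, `mutation_blocked(conn, "DELETE FROM audit_logs")` is expected to return True in this sketch, which is the point of stacking the layers.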

Metadata allowlist

The audit_logs.metadata jsonb column has an explicit key allowlist. Unknown keys are rejected at write time:

| Allowed key | Purpose |
| --- | --- |
| reason | Operator-supplied reason |
| policy_key | When action changes a policy value |
| old_value | Before-image (policy / balance / status) |
| new_value | After-image |
| status_from | Lifecycle transitions |
| status_to | Lifecycle transitions |
| error_code | When result=failure |
| request_scope | Resolved scope (tenant/project) |
| idempotency_key_hash | Hashed key, not raw |
| provider_ref | External id (Stripe payment id, etc.) |
| allocation_id | Cross-reference |
| node_id | Cross-reference |

Forbidden (never appears in audit_logs.metadata):

  • Raw tokens (access / refresh / id)
  • Raw credentials (passwords, API keys)
  • SSH private or public key material
  • Full request / response payload dumps
  • Direct payment instrument data (PAN, CVV)
  • End-user PII beyond stable IDs
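
A write-time check against the allowlist could look roughly like the following. The function names are hypothetical, and SHA-256 for `idempotency_key_hash` is an assumption: the table above only says the key is stored hashed, not which hash is used.

```python
from __future__ import annotations

import hashlib
from typing import Any, Mapping

# Keys copied from the allowlist table above.
ALLOWED_METADATA_KEYS = frozenset({
    "reason", "policy_key", "old_value", "new_value",
    "status_from", "status_to", "error_code", "request_scope",
    "idempotency_key_hash", "provider_ref", "allocation_id", "node_id",
})


def validate_metadata(metadata: Mapping[str, Any]) -> dict[str, Any]:
    """Return a copy of metadata, or raise if any key is not allowlisted."""
    unknown = sorted(set(metadata) - ALLOWED_METADATA_KEYS)
    if unknown:
        raise ValueError(f"metadata keys not on allowlist: {unknown}")
    return dict(metadata)


def hash_idempotency_key(raw_key: str) -> str:
    """Hash the raw key so only the digest reaches the audit row.

    SHA-256 is an assumption made for this sketch.
    """
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()
```

A key allowlist only constrains names. Keeping forbidden material out of allowed values (for example a token pasted into `reason`) still depends on the callers and on review.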

Audit flow per mutation

sequenceDiagram
    autonumber
    participant U as User / Admin
    participant API as Handler
    participant SVC as Service
    participant DB as Postgres
    participant SAN as middleware.Sanitize

    U->>API: privileged mutation request
    API->>SAN: scrub PII/credentials from log line
    SAN-->>API: sanitized
    API->>SVC: domain call
    SVC->>DB: BEGIN
    SVC->>DB: domain mutation
    SVC->>DB: INSERT audit_logs<br/>(actor, action, target, result, correlation_id, metadata)
    SVC->>DB: INSERT outbox row
    SVC->>DB: COMMIT
    SVC-->>API: outcome
    API-->>U: response with correlation_id

    Note over API,DB: If COMMIT fails, NOTHING happened —<br/>audit + domain + outbox all rolled back.<br/>If COMMIT succeeds, ALL three durable.

This transactional triple-write is the single most important property of the audit subsystem.
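
The all-or-nothing behaviour can be demonstrated with a small sketch. It uses an in-memory SQLite database in place of Postgres, and the table layouts, the `force_release` function, the `allocation.released` event name, and the `fail_outbox` switch are all assumptions made for the demonstration.

```python
import sqlite3
import uuid


def open_demo_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE allocations (id TEXT PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE audit_logs (
            id TEXT PRIMARY KEY, action TEXT NOT NULL, target_id TEXT NOT NULL,
            result TEXT NOT NULL, correlation_id TEXT NOT NULL
        );
        CREATE TABLE outbox (
            id TEXT PRIMARY KEY, event_type TEXT NOT NULL, payload TEXT NOT NULL
        );
        INSERT INTO allocations VALUES ('alloc-1', 'active');
        """
    )
    return conn


def force_release(conn: sqlite3.Connection, allocation_id: str,
                  correlation_id: str, fail_outbox: bool = False) -> None:
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so the three writes below share one transaction.
    with conn:
        conn.execute(
            "UPDATE allocations SET status = 'released' WHERE id = ?",
            (allocation_id,),
        )
        conn.execute(
            "INSERT INTO audit_logs VALUES (?, ?, ?, ?, ?)",
            (str(uuid.uuid4()), "allocation.force_release",
             allocation_id, "success", correlation_id),
        )
        if fail_outbox:
            raise RuntimeError("simulated failure before the outbox write")
        conn.execute(
            "INSERT INTO outbox VALUES (?, ?, ?)",
            (str(uuid.uuid4()), "allocation.released", allocation_id),
        )
```

Calling `force_release(conn, "alloc-1", "c-1", fail_outbox=True)` raises and leaves the allocation `active` with no audit row, while a normal call leaves one row in each of the three tables.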

CI enforcement

flowchart TB
    PR[PR opened] --> GATE1[audit_mandatory_guard.sh]
    GATE1 --> CHK1{Privileged handler<br/>writes audit_logs<br/>in same tx?}
    CHK1 -- no --> BLOCK1[Block PR]
    CHK1 -- yes --> GATE2[audit_presence_guard.sh]
    GATE2 --> CHK2{Integration test<br/>asserts audit row?}
    CHK2 -- no --> BLOCK2[Block PR]
    CHK2 -- yes --> OK([gates pass])

    classDef block fill:#f8d7da,stroke:#42101e
    classDef ok fill:#d1e7dd,stroke:#0a3622
    class BLOCK1,BLOCK2 block
    class OK ok

Acceptance matrix (PRD)

| AT | Check |
| --- | --- |
| AT-080 | Privileged mutations (provision, release, refund, admin node ops) each produce a structured audit entry |
| AT-081 | Audit log entries contain actor_user_id, actor_role, action, target_type, target_id, result, correlation_id |
| AT-082 | Failed authorization attempts recorded with result=failure |
| AT-083 | Audit log entries immutable: no update/delete path exposed |

API surface

| Endpoint | Auth | Purpose |
| --- | --- | --- |
| GET /api/v1/admin/audit-logs | admin | Paginated list with filter: actor, action, target_type, target_id, from, to, result, correlation_id |
| GET /api/v1/admin/audit-logs/{id} | admin | Single entry |
| GET /api/v1/admin/audit-logs.csv | admin | CSV export |
| GET /api/v1/admin/audit-logs/by-correlation/{cid} | admin | All rows sharing a correlation id (incident reconstruction) |

PRD §FR-11: admin can query and export audit logs for compliance and incident response.
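
A client-side helper for the list endpoint might be sketched as follows. It assumes the filters are passed as query parameters with exactly the names listed above and that timestamps are ISO 8601 strings. Neither detail is confirmed here, and the host `gpuaas.example` and the function name `build_audit_query` are placeholders.

```python
from typing import Mapping
from urllib.parse import urlencode

AUDIT_LIST_PATH = "/api/v1/admin/audit-logs"

# Filter names from the endpoint table. Treating them as literal
# query-parameter names is an assumption.
FILTER_KEYS = (
    "actor", "action", "target_type", "target_id",
    "from", "to", "result", "correlation_id",
)


def build_audit_query(base_url: str, filters: Mapping[str, str]) -> str:
    """Build the list URL, rejecting filters the endpoint does not document."""
    unknown = sorted(set(filters) - set(FILTER_KEYS))
    if unknown:
        raise ValueError(f"unsupported filters: {unknown}")
    # Emit parameters in the documented order so URLs are stable.
    params = [(key, filters[key]) for key in FILTER_KEYS if key in filters]
    url = base_url.rstrip("/") + AUDIT_LIST_PATH
    query = urlencode(params)
    return f"{url}?{query}" if query else url


url = build_audit_query(
    "https://gpuaas.example",
    {
        "action": "allocation.force_release",
        "result": "success",
        "from": "2026-05-01T00:00:00Z",
    },
)
```

The filters are taken as a mapping rather than keyword arguments because `from` is a reserved word in Python.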

Retention model

flowchart LR
    classDef active fill:#d1e7dd,stroke:#0a3622
    classDef hot fill:#fff3e0,stroke:#e65100
    classDef cold fill:#e3f2fd,stroke:#1565c0
    classDef archived fill:#eceff1,stroke:#455a64

    M0[Current month<br/>audit_logs_y2026m05]:::active
    M1[Last month]:::hot
    M2[Up to 12 months ago]:::hot
    M3[12-24 months]:::cold
    M4[Archived to object storage<br/>read-only restore path]:::archived

    M0 --> M1 --> M2 --> M3 --> M4

Partitioning model (see Partitioning_and_Retention_Strategy.md):

  • audit_logs — partitioned by month, retained long-term to meet compliance.
  • usage_records — partitioned by month, medium retention.
  • ledger_entries — partitioned by year, retained indefinitely (compliance).
  • node_tasks — short retention; older than 30 days archived.
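
The monthly naming and the tiers in the diagram can be expressed as a small calculation. The naming pattern follows the `audit_logs_y2026m05` example. The exact tier boundaries (in particular treating a partition exactly 12 months old as hot) and the function names are assumptions, since the diagram does not pin them down.

```python
from datetime import date


def partition_name(day: date) -> str:
    """Name of the monthly audit_logs partition that holds this day."""
    return f"audit_logs_y{day.year:04d}m{day.month:02d}"


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from earlier's month to later's month."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def retention_tier(partition_month: date, today: date) -> str:
    """Classify a partition by age in months (boundaries are assumed)."""
    age = months_between(partition_month, today)
    if age < 0:
        raise ValueError("partition month is in the future")
    if age == 0:
        return "active"    # current month
    if age <= 12:
        return "hot"       # last month, up to 12 months ago
    if age <= 24:
        return "cold"      # 12-24 months
    return "archived"      # moved to object storage
```

For instance, `partition_name(date(2026, 5, 14))` yields `audit_logs_y2026m05`, the partition shown as the current month in the diagram.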

Compliance posture

| Property | How |
| --- | --- |
| Immutable financial ledger | ledger_entries never UPDATE/DELETE |
| Immutable audit | audit_logs never UPDATE/DELETE |
| Right-to-erasure (GDPR-style) | Tenant-level deletion + audit retention exemption; design captured in assumptions register, not implemented in MVP |
| Data residency | region_code first-class on resources; canonical resource identifier carries region |
| Encryption at rest | KMS-backed envelope encryption (Encryption_Envelope_Spec.md) |
| Encryption in transit | TLS everywhere; mTLS internal |
| Access reviews | Project + tenant memberships have auditable lifecycle (grant + revoke + soft-delete) |
| Separation of duties | Admin actions auditable; payment refunds require dedicated API (not generic balance adjustment) |
| Webhook integrity | Stripe signature on raw body; dedupe by event_id; AT-053 |

Encryption envelope (high level)

flowchart LR
    DATA[Field value<br/>e.g. SSH key, payment ref] --> WRAP[encrypt with DEK<br/>data encryption key]
    WRAP --> CIPH[ciphertext + IV + tag]
    CIPH --> STORE[("Stored in DB:<br/>ciphertext + key_version")]
    DEK[Per-record DEK] --> WRAP
    KEK[KEK in KMS] -.wraps DEKs.-> DEK
    KEK -.rotation.-> KMS[(KMS)]

    classDef secret fill:#fff3e0,stroke:#e65100
    class DEK,KEK,KMS secret

→ Detail: Encryption_Envelope_Spec.md. Key rotation runbook: Key Rotation and Compromise Response.

Operator queries (common patterns)

-- Who force-released which allocations in the last 24h?
SELECT created_at, actor_user_id, target_id, metadata->>'reason'
FROM audit_logs
WHERE action = 'allocation.force_release'
  AND created_at > now() - interval '24 hours'
ORDER BY created_at DESC;

-- All actions in one incident
SELECT created_at, actor_user_id, action, target_type, target_id, result
FROM audit_logs
WHERE correlation_id = $1
ORDER BY created_at;

-- Failed attempts against financial targets in the last 7 days
-- (includes failed authorization attempts, per AT-082)
SELECT created_at, actor_user_id, action, metadata->>'error_code' AS error_code
FROM audit_logs
WHERE result = 'failure'
  AND target_type IN ('payment_session', 'refund_record', 'ledger_entry')
  AND created_at > now() - interval '7 days'
ORDER BY created_at DESC;

Where to look next