Sanitize-first rules

Implemented

Source: packages/shared/middleware/sanitize.go · doc/governance/Coding_Standards.md §Log and Trace Sanitization

Sensitive and PII fields must be redacted before they reach any log sink or trace backend. This applies equally to structured logs, OTel trace attributes, and span events.

Sanitize-first is one of the platform's seven hard rules (see Governance precedence). This page describes what it covers, why each control exists, and how to add new redaction targets.

The boundary

flowchart LR
    classDef raw fill:#ffebee,stroke:#c62828
    classDef san fill:#fff3e0,stroke:#e65100
    classDef sink fill:#e8eaf6,stroke:#3949ab

    REQ[Request body / claims / span attrs<br/>RAW values]:::raw --> SAN[middleware.Sanitize<br/>walks fields,<br/>replaces blocklisted values<br/>with REDACTED]:::san
    SAN --> LOG[(slog → stdout)]:::sink
    SAN --> TR[(OTel trace exporter)]:::sink
    SAN --> AUD[(audit_logs)]:::sink

    REQ -.never directly.-> LOG
    REQ -.never directly.-> TR

Mandatory: Every internal service must pass request data through the sanitization layer before it logs or creates trace spans. Production services have no opt-out.

The blocklist

Fields that must never appear in logs or traces in plaintext:

| Field | Source category | Why |
| --- | --- | --- |
| password, password_hash | credential | Long-term identity proof |
| access_token, refresh_token, id_token | auth tokens | Active session proof |
| ssh_private_key, ssh_private_key_enc | key material | Standing access |
| stripe_customer_id, payment_reference | payment identity | PII + linkage |
| email (high-volume paths) | PII | Per-tenant correlation risk |
| username (high-volume paths as identifier) | PII | Same |
| access_secret_enc | credential storage | Ciphertext but unnecessary leakage |
| scheduler_metadata fields with creds | mixed | Conservative redaction |

Redaction format: replace the value with [REDACTED] and keep the key. Never drop the field, so the log structure stays parseable for debugging.

Sanitize lookup logic

flowchart TB
    F[Field name encountered<br/>in a struct walk] --> NORM[normalize: lower, trim]
    NORM --> CHK1{in blocklist?}
    CHK1 -- yes --> RED["replace value with '[REDACTED]'"]
    CHK1 -- no --> CHK2{nested struct or map?}
    CHK2 -- yes --> WALK[recurse into nested fields]
    CHK2 -- no --> KEEP[pass through]
    WALK --> CHK1

    classDef red fill:#ffebee,stroke:#c62828
    classDef ok fill:#d1e7dd,stroke:#0a3622
    class RED red
    class KEEP,WALK ok

The blocklist is matched case-insensitively against normalized field names. The function works recursively over nested structs/maps so a credential buried in metadata.user.password is still caught.
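
The lookup logic above can be sketched for the map case as follows. This is a minimal illustration, not the code in sanitize.go: the names `blocklist`, `sanitizeMap`, and `sanitizeValue` are hypothetical, the blocklist is a small subset, and struct walking (which would need reflection) is left out.

```go
package main

import (
	"fmt"
	"strings"
)

const redacted = "[REDACTED]"

// blocklist is a hypothetical subset of the real list; keys are stored
// already normalized (lowercase, no surrounding whitespace).
var blocklist = map[string]struct{}{
	"password":      {},
	"access_token":  {},
	"refresh_token": {},
}

// sanitizeMap returns a redacted copy of in. The input map is not modified,
// so callers can keep using the original values.
func sanitizeMap(in map[string]any) map[string]any {
	out := make(map[string]any, len(in))
	for k, v := range in {
		out[k] = sanitizeValue(k, v)
	}
	return out
}

// sanitizeValue applies the three-way decision from the flowchart:
// blocklisted key -> redact, nested map -> recurse, anything else -> keep.
func sanitizeValue(key string, v any) any {
	norm := strings.ToLower(strings.TrimSpace(key))
	if _, blocked := blocklist[norm]; blocked {
		return redacted
	}
	if nested, ok := v.(map[string]any); ok {
		return sanitizeMap(nested)
	}
	return v
}

func main() {
	in := map[string]any{
		"name": "ada",
		"metadata": map[string]any{
			"user": map[string]any{"Password": "hunter2", "id": 7},
		},
	}
	out := sanitizeMap(in)
	user := out["metadata"].(map[string]any)["user"].(map[string]any)
	fmt.Println(user["Password"]) // prints: [REDACTED]
	fmt.Println(user["id"])       // prints: 7
}
```

Because the key is kept and only the value is replaced, the sanitized output has the same shape as the input.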

Pattern: logging

import (
    "encoding/json"
    "net/http"

    "github.com/.../packages/shared/middleware"
)

func (h *Handler) CreateUser(w http.ResponseWriter, r *http.Request) {
    var in CreateUserRequest
    if err := json.NewDecoder(r.Body).Decode(&in); err != nil {
        // Reject malformed bodies; never log the raw body here.
        http.Error(w, "invalid request body", http.StatusBadRequest)
        return
    }

    // Sanitize before logging: the original `in` keeps the real password
    // for the service call; only the sanitized copy is logged.
    logSafe := middleware.Sanitize(in)
    h.log.InfoContext(r.Context(), "create user request", "request", logSafe)

    // The service call uses the original, unsanitized value.
    user, err := h.svc.Create(r.Context(), in)
    ...
}

Pattern: OTel span attributes

// NOT OK: the attribute carries the secret to the trace exporter
span.SetAttributes(attribute.String("user.password", req.Password))

// OK (preferred): do not set the attribute at all.

// OK: if you must record the field name for diagnostics, sanitize first.
sanitized := middleware.SanitizeMap(map[string]any{"password": req.Password})
if v, ok := sanitized["password"].(string); ok { // checked assertion, no panic
    span.SetAttributes(attribute.String("password", v))
}
// → attribute value is "[REDACTED]"

For sensitive identifier-like values that you still need to correlate on, record a hash instead of the value:

span.SetAttributes(attribute.String("idempotency_key_hash", sha256Short(idemKey)))
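
The page does not show `sha256Short`. A minimal sketch of what such a helper could look like, assuming it returns the first 8 hex characters of the SHA-256 digest (the length suggested in the debugging flowchart on this page):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sha256Short is an assumed implementation: it hashes s with SHA-256 and
// returns the first 8 hex characters of the digest.
func sha256Short(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])[:8]
}

func main() {
	a := sha256Short("idem-123")
	b := sha256Short("idem-123")
	fmt.Println(a == b) // prints: true (same input, same hash, so spans correlate)
	fmt.Println(len(a)) // prints: 8
}
```

A truncated, unkeyed hash is enough for correlation, but low-entropy inputs can be recovered by brute force; a keyed HMAC may be a better fit for those.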

Audit metadata is also allowlisted

The blocklist guards the log and trace exporters. Audit has a stricter rule: only known-good keys are allowed.

| Surface | Rule | Mechanism |
| --- | --- | --- |
| Logs (slog) | Blocklist: known-bad keys redacted | middleware.Sanitize before emit |
| Traces (OTel) | Blocklist: known-bad keys redacted | Same |
| Audit metadata jsonb | Allowlist: only known-good keys accepted | Validated at INSERT, unknown keys rejected |

→ See Audit & compliance for the audit allowlist.
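
To contrast the two rules, here is a rough sketch of an allowlist check that rejects unknown keys rather than redacting them. Where the real validation runs (application code or the database) is not stated here, so this assumes an application-side check; `auditAllowedKeys` and `validateAuditMetadata` are hypothetical names, and the keys are placeholders.

```go
package main

import (
	"fmt"
	"sort"
)

// auditAllowedKeys is a hypothetical allowlist; the real keys are documented
// on the Audit & compliance page.
var auditAllowedKeys = map[string]struct{}{
	"actor_id":    {},
	"action":      {},
	"resource_id": {},
}

// validateAuditMetadata returns an error if meta contains any key that is not
// on the allowlist. The error names the offending keys but never their values.
func validateAuditMetadata(meta map[string]any) error {
	var unknown []string
	for k := range meta {
		if _, ok := auditAllowedKeys[k]; !ok {
			unknown = append(unknown, k)
		}
	}
	if len(unknown) > 0 {
		sort.Strings(unknown) // stable message regardless of map order
		return fmt.Errorf("audit metadata: keys not on allowlist: %v", unknown)
	}
	return nil
}

func main() {
	ok := map[string]any{"actor_id": "u_1", "action": "login"}
	bad := map[string]any{"actor_id": "u_1", "password": "x"}
	fmt.Println(validateAuditMetadata(ok))  // prints: <nil>
	fmt.Println(validateAuditMetadata(bad)) // prints: audit metadata: keys not on allowlist: [password]
}
```

The difference from the blocklist is the failure mode: a key nobody anticipated passes through a blocklist untouched, but an allowlist refuses it.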

CI enforcement

flowchart TB
    PR[PR opened] --> G1[observability_trace_gate.sh]
    G1 --> C1{Every binary calls<br/>middleware.SetupOTel?}
    C1 -- no --> X1[Block PR]
    C1 -- yes --> C2{Every HTTP server wraps<br/>middleware.Tracing +<br/>middleware.CorrelationID?}
    C2 -- no --> X1
    C2 -- yes --> C3{Every async consumer<br/>creates processing span<br/>with required attributes?}
    C3 -- no --> X1
    C3 -- yes --> OK([gate passes])

    PR --> G2[Code review]
    G2 -.check.-> C4{Sanitize call present<br/>before log/trace emit?}
    C4 -- missing --> X2[Reviewer requests change]
    C4 -- present --> OK

    classDef ok fill:#d1e7dd,stroke:#0a3622
    classDef block fill:#f8d7da,stroke:#42101e
    class OK ok
    class X1,X2 block

Sanitize-call presence is currently reviewer-enforced; a static analysis rule that flags missing sanitize before a log/trace call is on the watchlist.

When you genuinely need a sensitive value for debugging

Don't log it. Choose one of:

flowchart TB
    NEED[Need to debug<br/>with sensitive value] --> CHOICE{What kind?}
    CHOICE -- need to correlate --> HASH[Use a short hash<br/>sha256 first 8 chars]
    CHOICE -- count occurrences --> COUNT[Log count / length only]
    CHOICE -- type validation --> TYPE[Log field type, not value]
    CHOICE -- shape check --> SHAPE["Log redacted scaffold:<br/>password: REDACTED, name: X"]
    CHOICE -- privileged action --> AUD[Write audit_logs metadata<br/>with allowlisted keys]
    classDef ok fill:#d1e7dd,stroke:#0a3622
    class HASH,COUNT,TYPE,SHAPE,AUD ok
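
The "count / length" and "field type" options can be combined in one small helper. This is a sketch using the standard `log/slog` package; `describe` is a hypothetical name, not part of the shared middleware.

```go
package main

import (
	"fmt"
	"log/slog"
	"os"
)

// describe summarizes a sensitive value without exposing it: it records the
// Go type, plus the length when the value is a string or byte slice.
func describe(field string, v any) slog.Attr {
	attrs := []any{slog.String("type", fmt.Sprintf("%T", v))}
	switch t := v.(type) {
	case string:
		attrs = append(attrs, slog.Int("len", len(t)))
	case []byte:
		attrs = append(attrs, slog.Int("len", len(t)))
	}
	return slog.Group(field, attrs...)
}

func main() {
	logger := slog.New(slog.NewTextHandler(os.Stdout, nil))
	token := "example-token-value"
	// Logs access_token.type and access_token.len, never the token itself.
	logger.Info("token received", describe("access_token", token))
}
```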

Examples a reviewer will block:

// VIOLATION — token in URL
log.Info("calling auth", "url", fmt.Sprintf("/api/v1/foo?token=%s", token))

// VIOLATION — credentials in error
return fmt.Errorf("bad creds for user %s pw %s", username, password)

// VIOLATION — full request body in span
span.SetAttributes(attribute.String("request_body", string(rawBody)))

// VIOLATION — PII in high-volume path
log.Info("request", "email", req.Email, "method", "GET", "path", "/healthz")

// VIOLATION — unsanitized struct dump
log.Info("user created", "user", user)  // user includes password_hash

Fixes:

// OK
log.Info("calling auth", "url", "/api/v1/foo")

// OK
return fmt.Errorf("bad creds for user %s: %w", username, ErrInvalidPassword)

// OK
span.SetAttributes(attribute.Int("request_body_bytes", len(rawBody)))

// OK — drop PII; correlate by user_id instead
log.Info("request", "user_id", claims.Sub, "method", "GET", "path", "/healthz")

// OK
log.Info("user created", "user", middleware.Sanitize(user))

How to add a new redaction target

flowchart LR
    A[Identify new sensitive<br/>field name] --> B[Add to BLOCKLIST<br/>in packages/shared/middleware/sanitize.go]
    B --> C[Add unit test asserting<br/>field redacted]
    C --> D[Open PR<br/>reviewer + security owner]
    D --> E[Merge → field auto-redacted everywhere]

Single point of change: one blocklist constant in packages/shared/middleware/sanitize.go. Every binary picks it up on next build.
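
Steps B and C might look roughly like this. Everything here is a stand-in: `api_key` is a hypothetical new target, `sanitizeFlat` is a simplified, non-recursive substitute for the shared sanitizer, and `checkRedacted` mirrors the assertions a `go test` case would make, written as a plain function so the sketch stays self-contained.

```go
package main

import (
	"fmt"
	"strings"
)

const redacted = "[REDACTED]"

// blocklist stands in for the constant in sanitize.go.
var blocklist = map[string]struct{}{
	"password": {},
	"api_key":  {}, // step B: the new redaction target (hypothetical)
}

// sanitizeFlat is a simplified stand-in for the shared sanitizer: it only
// handles top-level keys.
func sanitizeFlat(in map[string]any) map[string]any {
	out := make(map[string]any, len(in))
	for k, v := range in {
		if _, blocked := blocklist[strings.ToLower(strings.TrimSpace(k))]; blocked {
			out[k] = redacted
			continue
		}
		out[k] = v
	}
	return out
}

// checkRedacted is step C: it asserts that field is redacted and that an
// unrelated field passes through unchanged.
func checkRedacted(field string) error {
	out := sanitizeFlat(map[string]any{field: "secret-value", "note": "ok"})
	if out[field] != redacted {
		return fmt.Errorf("%q was not redacted", field)
	}
	if out["note"] != "ok" {
		return fmt.Errorf("non-sensitive field was altered")
	}
	return nil
}

func main() {
	// Cover case and whitespace variants, since matching is normalized.
	for _, f := range []string{"api_key", "API_KEY", " Api_Key "} {
		fmt.Printf("%q: %v\n", f, checkRedacted(f))
	}
}
```

Testing case and whitespace variants alongside the canonical name guards against a regression in the normalization step, not just in the list itself.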

Where to look next