Brokered Identity Linking and Dedup v1

Purpose

Define how OIDC brokered identities, including Hugging Face via Keycloak, map to GPUaaS users. The goal is identity continuity: a returning login must resolve to the same logical user unless there is an explicit conflict that requires user or admin action.

Current Model

GPUaaS persists one primary OIDC identity on users:

  • oidc_issuer
  • oidc_subject
  • username

For normal OIDC logins through the platform Keycloak realm, the persisted issuer is the public Keycloak issuer. For tenant-owned enterprise OIDC providers, the persisted issuer is the tenant provider issuer.

The normal login path is:

  1. Exchange the authorization code for a provider token.
  2. Verify token claims.
  3. Upsert by (oidc_issuer, oidc_subject).
  4. If this is a new user, create the personal tenant/project bootstrap context.
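Step 3 of the path above can be sketched as a small upsert keyed on (oidc_issuer, oidc_subject). This is a minimal illustration, not the service's actual code: the SQLite schema, the make_demo_db helper, and the upsert_user function are hypothetical, and code exchange, claim verification, and tenant bootstrap are left out.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory users table (hypothetical schema for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE users (
            id INTEGER PRIMARY KEY,
            oidc_issuer TEXT NOT NULL,
            oidc_subject TEXT NOT NULL,
            username TEXT NOT NULL UNIQUE,
            UNIQUE (oidc_issuer, oidc_subject)
        )
        """
    )
    return conn


def upsert_user(conn, issuer, subject, username):
    """Return (user_id, created) for the identity keyed by (issuer, subject)."""
    row = conn.execute(
        "SELECT id FROM users WHERE oidc_issuer = ? AND oidc_subject = ?",
        (issuer, subject),
    ).fetchone()
    if row is not None:
        # Returning login: refresh presentation fields, keep the same row.
        conn.execute("UPDATE users SET username = ? WHERE id = ?", (username, row[0]))
        return row[0], False
    # New identity: the caller would bootstrap the personal tenant/project next.
    cur = conn.execute(
        "INSERT INTO users (oidc_issuer, oidc_subject, username) VALUES (?, ?, ?)",
        (issuer, subject, username),
    )
    return cur.lastrowid, True
```

Under this sketch, a second login with the same issuer and subject returns the same id with created set to False. A login that arrives with a new subject but an existing username fails the insert on the username unique constraint.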

Problem

Some brokered social providers can cause Keycloak to return a different realm sub for the same account across fresh-session tests if the brokered account is not linked inside Keycloak as expected. When that happens, a strict (issuer, subject) upsert treats the login as a new identity. If the login returns the same preferred_username, the insert violates the unique key on users.username. The old code responded by creating a fallback username such as <name>_<subprefix>, which produced duplicate accounts.

Decision

On a username conflict, the auth service may rebind the existing user row to the new subject only when all of these are true:

  • The existing user is already OIDC-backed.
  • The existing user has the same persisted issuer.
  • The existing user has no password hash, so local/dev password users are not silently linked.
  • The new login presents the same username and a different subject.

This is deliberately narrow. It repairs brokered-sub churn for the same platform issuer without allowing a social login to claim an existing local password account or an account from another issuer.
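The four conditions can be read as a single predicate. The sketch below is illustrative only: the ExistingUser type, its field names, and can_rebind are hypothetical, and it treats a non-empty issuer and subject as the meaning of "OIDC-backed".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExistingUser:
    """Hypothetical view of the users row that owns the conflicting username."""
    oidc_issuer: Optional[str]
    oidc_subject: Optional[str]
    username: str
    password_hash: Optional[str]


def can_rebind(existing: ExistingUser, issuer: str, subject: str, username: str) -> bool:
    """Return True only when every rebind condition holds."""
    oidc_backed = bool(existing.oidc_issuer) and bool(existing.oidc_subject)
    same_issuer = existing.oidc_issuer == issuer
    no_password = not existing.password_hash
    same_name_new_subject = (
        existing.username == username and existing.oidc_subject != subject
    )
    return oidc_backed and same_issuer and no_password and same_name_new_subject
```

Any single failing condition keeps the login on the normal conflict path, which is what prevents a social login from claiming a local password account.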

Expected Behavior

First login:

  • If (issuer, subject) does not exist and username is free, create the user.
  • Bootstrap the personal tenant/project context.

Returning login with stable subject:

  • Match by (issuer, subject).
  • Update presentation fields such as username/role as needed.
  • Do not create a new user.

Returning brokered login with changed Keycloak subject but same username:

  • If the username belongs to an existing OIDC-only user from the same issuer, rebind that row to the new subject.
  • Do not create a fallback username duplicate.

Conflict:

  • If the username belongs to a password/local user, another issuer, or a user with ambiguous state, do not auto-link.
  • Fall back to the normal conflict handling path. A future Account UI should expose an explicit "this is me" linking flow for this case.

Enterprise tenant federation:

  • Tenant-bound OIDC/SAML remains membership-gated.
  • If an enterprise identity exists but has no tenant membership, return the existing membership-required error rather than creating a personal account.
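The cases above can be summarized as one classification function. This is a sketch under stated assumptions: the Outcome names, the UserRow shape, and the tenant_bound and has_membership flags are hypothetical, and checking the membership gate first is an assumed ordering that the text does not specify.

```python
from enum import Enum
from typing import List, NamedTuple, Optional


class Outcome(Enum):
    CREATE = "create"
    MATCH = "match"
    REBIND = "rebind"
    CONFLICT = "conflict"
    MEMBERSHIP_REQUIRED = "membership_required"


class UserRow(NamedTuple):
    issuer: Optional[str]
    subject: Optional[str]
    username: str
    has_password: bool


def classify_login(
    users: List[UserRow],
    issuer: str,
    subject: str,
    username: str,
    tenant_bound: bool = False,
    has_membership: bool = False,
) -> Outcome:
    # Enterprise federation stays membership-gated (assumed to be checked first).
    if tenant_bound and not has_membership:
        return Outcome.MEMBERSHIP_REQUIRED
    # Returning login with a stable subject.
    if any((u.issuer, u.subject) == (issuer, subject) for u in users):
        return Outcome.MATCH
    owner = next((u for u in users if u.username == username), None)
    if owner is None:
        return Outcome.CREATE  # first login, username is free
    # Changed subject, same username: rebind only an OIDC-only row from the same issuer.
    if owner.issuer == issuer and owner.subject and not owner.has_password:
        return Outcome.REBIND
    return Outcome.CONFLICT
```

For example, a login with a known issuer, a new subject, and a username owned by a passwordless row from that same issuer classifies as REBIND, while the same username owned by a local password user classifies as CONFLICT.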

Linked Identity Table

The account-linking baseline adds user_identity_links as the durable identity map. The legacy users.oidc_issuer and users.oidc_subject columns remain for session/JWT compatibility during the v3 migration, but new account surfaces should read linked identities from user_identity_links.

Each active link records:

  • provider kind and issuer,
  • provider subject,
  • verified email or external username where available,
  • primary/secondary marker,
  • link/unlink audit records,
  • conflict review status.

That table backs the v3 Account profile duplicate-resolution UI. The narrow rebind rule above remains the safe local repair for repeated Hugging Face brokered login duplicates, but explicit multi-provider linking uses this table instead of overwriting the primary users row.
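As a rough picture of one row in user_identity_links, the sketch below models the listed fields as a dataclass. All field names, the active flag, the status strings, and the linked_identities_for helper are hypothetical; the real column names are not given here, and link/unlink audit records are assumed to live in a separate audit store.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class IdentityLink:
    """Hypothetical shape of one user_identity_links row."""
    user_id: int
    provider_kind: str
    issuer: str
    subject: str
    verified_email: Optional[str] = None
    external_username: Optional[str] = None
    is_primary: bool = False
    active: bool = True
    conflict_review_status: str = "none"  # assumed status vocabulary

    def key(self) -> Tuple[str, str, str]:
        """Identity key used to detect duplicate links."""
        return (self.provider_kind, self.issuer, self.subject)


def linked_identities_for(user_id: int, links: List[IdentityLink]) -> List[IdentityLink]:
    """Return a user's active links, primary link first."""
    mine = [link for link in links if link.user_id == user_id and link.active]
    return sorted(mine, key=lambda link: not link.is_primary)
```

An account surface built on this shape would list every provider attached to a user without touching the legacy users.oidc_issuer and users.oidc_subject columns.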

Explicit Linking Rules

GPUaaS does not auto-link by email alone. Email is a hint for conflict presentation, not proof of ownership.

Allowed proofs for creating an additional active identity link:

  1. current_session - the currently authenticated user initiated the provider-link flow and completed the callback state.
  2. admin_approval - a platform or tenant admin approved a recovery/link request after out-of-band verification.
  3. invite - a tenant/project invite is redeemed by the same provider callback state.

Denied or deferred cases:

  1. If (provider_name, issuer, subject) is already linked to the same user, return an already-linked outcome without creating a duplicate row.
  2. If (provider_name, issuer, subject) is linked to a different user, reject the link and require admin recovery.
  3. If a candidate has the same email as another user but no proof from the list above, create no link and return a duplicate-email conflict.
  4. If a provider does not return a stable subject, reject the link.
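The allowed proofs and denied cases can be combined into one decision function. The sketch below is illustrative: LinkOutcome, decide_link, and the in-memory dictionaries are hypothetical, an empty subject stands in for "no stable subject", and rejecting a proofless request that has no email collision is an assumption.

```python
from enum import Enum
from typing import Dict, Optional, Tuple

ALLOWED_PROOFS = {"current_session", "admin_approval", "invite"}
IdentityKey = Tuple[str, str, str]  # (provider_name, issuer, subject)


class LinkOutcome(Enum):
    LINKED = "linked"
    ALREADY_LINKED = "already_linked"
    LINKED_TO_OTHER_USER = "linked_to_other_user"
    DUPLICATE_EMAIL_CONFLICT = "duplicate_email_conflict"
    UNSTABLE_SUBJECT = "unstable_subject"
    NO_PROOF = "no_proof"


def decide_link(
    links: Dict[IdentityKey, int],
    email_owners: Dict[str, int],
    user_id: int,
    provider_name: str,
    issuer: str,
    subject: Optional[str],
    email: Optional[str],
    proof: Optional[str],
) -> LinkOutcome:
    if not subject:
        return LinkOutcome.UNSTABLE_SUBJECT  # provider gave no stable subject
    key = (provider_name, issuer, subject)
    owner = links.get(key)
    if owner == user_id:
        return LinkOutcome.ALREADY_LINKED  # no duplicate row is created
    if owner is not None:
        return LinkOutcome.LINKED_TO_OTHER_USER  # requires admin recovery
    if proof not in ALLOWED_PROOFS:
        # Email is only a hint for conflict presentation, never proof of ownership.
        if email is not None and email_owners.get(email, user_id) != user_id:
            return LinkOutcome.DUPLICATE_EMAIL_CONFLICT
        return LinkOutcome.NO_PROOF
    links[key] = user_id
    return LinkOutcome.LINKED
```

Only the final branch mutates the link map, so every denied or deferred case leaves existing links untouched.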

Audit Requirements

Every link decision that changes state must write an audit row:

  1. auth.identity_link.create for successful links.
  2. auth.identity_link.revoke for unlink/revoke.
  3. auth.identity_link.conflict for admin-visible duplicate/provider conflicts.

Audit metadata may include provider name, provider type, issuer, email hash/posture, and proof type. It must not include raw OAuth tokens, refresh tokens, provider access tokens, or provider secret material.
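One way to enforce the metadata rule is an allow-list, so token material is dropped even if a caller passes it by mistake. The sketch below is illustrative: build_audit_row, hash_email, and the metadata key names are hypothetical, and SHA-256 over the lowercased address is an assumed hashing choice.

```python
import hashlib
from typing import Any, Dict

ALLOWED_ACTIONS = {
    "auth.identity_link.create",
    "auth.identity_link.revoke",
    "auth.identity_link.conflict",
}
# Hypothetical key names for the permitted metadata fields.
ALLOWED_METADATA_KEYS = {
    "provider_name",
    "provider_type",
    "issuer",
    "email_hash",
    "proof_type",
}


def hash_email(email: str) -> str:
    """Hash a normalized email so the raw address is not stored."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def build_audit_row(action: str, metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Build an audit row, keeping only allow-listed metadata keys."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown audit action: {action}")
    safe = {k: v for k, v in metadata.items() if k in ALLOWED_METADATA_KEYS}
    return {"action": action, "metadata": safe}
```

An allow-list is chosen over a deny-list here because a new secret-bearing field added later is excluded by default.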