NATS Secret Generation

glass-ui needs a handful of NATS credentials before its pods can start: signing nkeys, account passwords, and JWT robot tokens. Rather than make you pre-create every one of these, the chart can mint them in-cluster with short-lived generator Jobs. This page is a deep dive into how those Jobs work, why they look different under plain Helm versus ArgoCD, and the one sync-time gotcha that can leave pods stuck waiting for a secret that never appears.

If you only want to deploy glass through ArgoCD, start with Deploying via ArgoCD; this page explains the machinery underneath it.

What the generators produce

Three families of secrets are generated, each by its own small image:

Values group   Generator                  Example secrets
natsNKeys      nkey-generator             glass-nats-user-nkey, glass-jwt-mint-nkey
natsAuth       plain-password-generator   glass-nats-user-auth, glass-nats-policy-auth
natsJwtAuth    jwt-mint                   glass-jwt-mint-auth, glass-center-middleware-auth, …

The templates live in helm/glass-ui/templates/nats/helpers/ (_nkeySecret.tpl, _passwordSecret.tpl, _jwtSecret.tpl), wrapped by the resources under helm/glass-ui/templates/nats/secrets/. Each generated secret ships with its own ServiceAccount and a RoleBinding to the shared glass-key-generator Role, which grants get, create, and update on secrets in the release namespace (the Role is namespace-scoped, not restricted to a specific secret name).

Generate vs. static

Every entry chooses one of two rendering paths, decided entirely by your values.

  • Static — if you supply the material directly (publicKey/seed for an nkey, password for an account, token for a JWT) the chart emits a plain Secret with your values. No Job, no RBAC. Use this when a credential must be shared across components or pre-provisioned by an external system.
  • Generate — if the material is empty and generate: true, the chart emits the ServiceAccount + RoleBinding + generator Job. The Job mints the credential and writes the Secret.

Two properties of the generator Job matter for everything below:

  • --overwrite=false (the per-entry overwrite field's default) — the Job creates the Secret only if it is missing. It never rotates an existing credential, so re-running it is a safe no-op. (Set overwrite: true only if you explicitly want re-runs to rotate the credential.)
  • ttlSecondsAfterFinished: 60 — a completed Job deletes itself after a minute, so finished generators do not pile up.
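
The create-only-if-missing behaviour can be pictured with a small shell sketch. This is not the generator's implementation (the real generators are the chart's own images); the function name ensure_secret, the namespace argument, and the "password" key are hypothetical and exist only to illustrate the semantics with standard kubectl commands.

```shell
# Sketch of create-if-missing semantics (what --overwrite=false amounts to).
# ensure_secret, its arguments, and the "password" key are illustrative
# assumptions, not the generator's real code.
ensure_secret() {
  name="$1"; namespace="$2"; value="$3"
  # If the Secret is already there, do nothing: no rotation, no update.
  if kubectl get secret "$name" -n "$namespace" >/dev/null 2>&1; then
    echo "secret $name exists, leaving it untouched"
    return 0
  fi
  # Otherwise create it once.
  kubectl create secret generic "$name" -n "$namespace" \
    --from-literal=password="$value" >/dev/null
  echo "secret $name created"
}
```

Calling such a function twice in a row creates the Secret the first time and reports it as untouched the second time, which is why re-applied generator Jobs are harmless.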

overwrite:false means generation is idempotent, not rotational

Because the Job never overwrites, you cannot rotate a NATS credential by re-running the generator (at the default overwrite: false). Delete the Secret first — the next sync (under ArgoCD) or next helm upgrade (under plain Helm) recreates it — or switch that entry to a static value.
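
Under ArgoCD, rotation by deletion might look like the sketch below. The function name and the default namespace "glass" are assumptions made for illustration; the application name glass-ui is the one used on this page, and the Secret name should be whichever one your values produce.

```shell
# Rotate one generated credential at the default overwrite: false.
# rotate_generated_secret and the "glass" namespace default are assumptions.
rotate_generated_secret() {
  secret="$1"; namespace="${2:-glass}"; app="${3:-glass-ui}"
  # 1. Remove the existing Secret so the generator has something to create.
  kubectl delete secret "$secret" -n "$namespace" || return 1
  # 2. An ordinary sync (default hook strategy) re-runs the generator hooks.
  argocd app sync "$app"
}
```

Under plain Helm, step 2 would be your usual helm upgrade of the release instead of the ArgoCD sync.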

Here is a values file that shows the generate path for all three families (natsNKeys, natsAuth, natsJwtAuth) and the static path for an nkey, plus the ArgoCD switch discussed in the next section. Note the dependency called out in the JWT block: the jwt-mint generator signs its token with the glass-jwt-mint-nkey secret, so that nkey secret must exist first.

YAML
# Example: NATS secret generation in glass-ui
#
# glass-ui needs several NATS credentials at start-up: nkeys (signing key
# pairs), plain passwords, and JWT robot tokens. The chart can either GENERATE
# them in-cluster (via short-lived generator Jobs) or consume STATIC values you
# supply. This file shows both modes plus the single switch that adapts the
# generators to an ArgoCD deployment.

# --- The ArgoCD switch -------------------------------------------------------
# When true, the generator Jobs are emitted as ArgoCD "Sync" hooks instead of
# plain Jobs. Leave it false (the default) for plain Helm / Helmfile installs.
global:
  argocd:
    enabled: true

# The generator binaries ship in the chart's images. Override their registry
# per-image under `images.<imageName>.registry` (see the chart's values.yaml for
# the full per-image schema, e.g. images.nkeyGenerator.registry); omitted here
# to keep the focus on the generate-vs-static behaviour.

# --- nkeys: generated vs. static ---------------------------------------------
# Each entry mints a NATS nkey (a public key + seed pair) into `secretName`.
natsNKeys:
  # GENERATED: no publicKey/seed supplied + generate:true -> the chart emits a
  # ServiceAccount + RoleBinding + generator Job that creates the secret.
  # overwrite:false means the Job is a no-op once the secret exists, so it
  # never rotates the key on resync/upgrade.
  jwtMint:
    enabled: true
    secretName: jwt-mint-nkey
    publicKey: ""
    seed: ""
    generate: true
    overwrite: false
  # STATIC: supply a known key pair and the chart emits a plain Secret directly
  # (no Job, no RBAC). Use this when the key must be shared or pre-provisioned.
  userAuth:
    enabled: true
    secretName: nats-user-nkey
    publicKey: "UABC...EXAMPLE_PUBLIC_KEY"
    seed: "SUAB...EXAMPLE_SEED"
    generate: false
    overwrite: false

# --- passwords: generated ----------------------------------------------------
# Plain NATS account passwords. Same generate/overwrite semantics as nkeys.
natsAuth:
  userAuth:
    enabled: true
    secretName: nats-user-auth
    password: ""
    generate: true
    overwrite: false
  policy:
    enabled: true
    secretName: nats-policy-auth
    password: ""
    generate: true
    overwrite: false

# --- JWT robot tokens: generated ---------------------------------------------
# Same generate/overwrite semantics. NOTE: the jwt-mint generator signs the
# token with the jwt-mint nkey, so it depends on the `glass-jwt-mint-nkey`
# secret (from jwtMint above) already existing — supply `token` for a static
# value instead.
natsJwtAuth:
  jwtMint:
    enabled: true
    secretName: jwt-mint-auth
    claims: ["robot_type:service"]
    token: ""
    generate: true
    overwrite: false
    tokenType: "robot"
    robotName: "jwt-mint"

Helm vs. ArgoCD: the same Jobs, applied differently

The generator Jobs are identical in both deployment models. What changes is how the Job is delivered — and that is controlled by a single switch, global.argocd.enabled.

Plain Helm / Helmfile (global.argocd.enabled: false)

The Jobs are ordinary batch/v1 Jobs. helm install (or upgrade) applies them alongside everything else; they run once and seed the Secrets. On later upgrades the Jobs are applied again, but --overwrite=false makes them no-ops, so the original credentials survive untouched.

ArgoCD (global.argocd.enabled: true)

The very same Jobs gain two annotations:

YAML
argocd.argoproj.io/hook: Sync
argocd.argoproj.io/hook-delete-policy: BeforeHookCreation

This is not cosmetic — it solves a real problem. A generator Job sets ttlSecondsAfterFinished: 60, so it disappears a minute after completing. If ArgoCD tracked it as a normal resource, that disappearance would read as a missing resource → OutOfSync drift on every refresh. Marking the Job a Sync hook tells ArgoCD two things at once:

  1. Exclude it from the live-vs-desired comparison — the self-deleting Job no longer causes drift.
  2. Re-run it on every sync — a Sync hook executes each time you sync, so a sync after a secret goes missing regenerates it. Combined with --overwrite=false, this makes the deployment self-healing: present secrets are left alone, missing ones are recreated.

BeforeHookCreation deletes the previous hook instance just before a new sync creates it, so a leftover Job inside its 60-second TTL window is cleaned up first. The Job name is suffixed with the Helm release revision (…-generator-r{{ .Release.Revision }}); under ArgoCD's helm template rendering the revision is always 1, so the name is stable (…-r1).
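
One way to confirm that the switch took effect is to render the chart locally and count the resources that carry the hook annotation. The helper below is a hypothetical sketch: count_sync_hooks is not part of the chart, and the release name in the usage comment is an assumption (your chart may also need further values before it renders).

```shell
# Count rendered resources that carry the ArgoCD Sync hook annotation.
# Reads a rendered manifest on stdin; count_sync_hooks is a hypothetical helper.
count_sync_hooks() {
  # Matches "argocd.argoproj.io/hook: Sync" with or without quotes; the
  # hook-delete-policy annotation is a different key and does not match.
  grep -cE 'argocd\.argoproj\.io/hook: *"?Sync"?' || true
}

# Usage (chart path as given on this page; release name is an assumption):
#   helm template glass-ui helm/glass-ui --set global.argocd.enabled=true | count_sync_hooks
```

With global.argocd.enabled left at false the count should be zero, since the Jobs are then rendered as plain Jobs.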

The "Helm release secret not found" warning is expected under ArgoCD

ArgoCD renders charts with helm template, so there is no sh.helm.release.v1.* release Secret in the cluster. The generator logs Helm release secret not found, omitting owner reference and creates the Secret without an owner reference. This is harmless — the Secret is still created correctly.

The sync-strategy gotcha

This is the part worth internalizing. ArgoCD has two sync strategies, and they treat hooks differently:

  • hook strategy (the default for argocd app sync) — runs hooks.
  • apply strategy — performs a bare kubectl apply and sets skipHooks: true. The generator hooks never run.

So a normal sync seeds and heals the secrets, but a sync forced with the apply strategy silently skips them. Nothing fails loudly: the application reports the non-hook resources as applied, while the generator Jobs simply never appear. The symptom shows up one layer down, where consumer pods never reach Ready. In kubectl get pods output, the glass-nats StatefulSet pods show ContainerCreating because their secret volume mount fails, while auth-users and policy sit in their initWaitForSecrets init containers (each named wait-for-secret-N): Init:0/2 for auth-users, which waits on two secrets, and Init:0/1 for policy, which waits on one. All of them are in the Pending phase, waiting on a Secret that was never created:

Text Only
MountVolume.SetUp failed for volume "auth-issuer" :
  secret "glass-nats-user-nkey" not found

If you inspect the application controller during such a sync, the skipped-hooks decision is explicit in the log:

Text Only
"msg":"Syncing", "skipHooks":true,
SyncStrategy{Apply:{Force:false}, Hook:nil}
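
When pods are stuck like this, a quick first check is which of the expected Secrets are actually absent. The helper below is a hypothetical sketch (missing_secrets is not part of the chart); the namespaces "glass" and "argocd" in the usage comments are assumptions about your cluster.

```shell
# Print the names of expected Secrets that do not exist in a namespace.
# missing_secrets is a hypothetical helper; pass the names your values produce.
missing_secrets() {
  namespace="$1"; shift
  for name in "$@"; do
    kubectl get secret "$name" -n "$namespace" >/dev/null 2>&1 || echo "$name"
  done
}

# Usage:
#   missing_secrets glass glass-nats-user-nkey glass-jwt-mint-nkey
# To see which strategy the last sync used (assumes ArgoCD runs in "argocd"):
#   kubectl get application glass-ui -n argocd \
#     -o jsonpath='{.status.operationState.operation.sync.syncStrategy}'
```

An empty result from the helper means every listed Secret exists, so the cause of the stuck pods lies elsewhere.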

Do not sync glass-ui with the apply strategy

Triggering a sync with the apply strategy — for example argocd app sync glass-ui --strategy=apply, or patching the Application's operation.sync.syncStrategy.apply directly — skips the generator hooks. On a fresh install or after a secret is deleted, this leaves the consuming pods stuck Pending. Use the default hook strategy.

Recovering missing secrets

If glass secrets go missing — someone deleted them, or a clean-slate reinstall removed them — the fix is a single ordinary sync:

Bash
argocd app sync glass-ui

The Sync hooks re-run and recreate only the missing secrets (existing ones are left untouched by --overwrite=false). The consuming pods then pick up the restored secrets and become Ready.

Avoid firing syncs back-to-back

Two syncs in quick succession can race the in-flight Jobs: BeforeHookCreation deletes a generator Job that the previous sync is still running, and the 60-second TTL may remove a just-finished Job before the next sync observes it. Let one sync settle before starting another; the state converges on a clean resync.
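
A recovery sync that also avoids the race can be sketched as a small wrapper around the argocd CLI. The function name sync_and_settle and the 300-second default timeout are assumptions chosen for illustration; tune them to your environment.

```shell
# Run one sync and let it settle before anything else is triggered.
# sync_and_settle and the 300-second default timeout are assumptions.
sync_and_settle() {
  app="${1:-glass-ui}"; timeout="${2:-300}"
  # Wait for any in-flight operation before starting a new one.
  argocd app wait "$app" --operation --timeout "$timeout" || return 1
  # Ordinary sync: the default hook strategy runs the generator hooks.
  argocd app sync "$app" || return 1
  # Block until the application reports Healthy.
  argocd app wait "$app" --health --timeout "$timeout"
}
```

Running it as sync_and_settle glass-ui gives each sync room to finish its generator Jobs before the next one can delete them.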

A brief Pending is normal

Even on a healthy sync there is a short window where consumer pods are not yet Ready. The generator hooks and the consuming workloads are applied together (the hooks carry no argocd.argoproj.io/sync-wave annotation, so they share the same wave as everything else with no enforced ordering), so the glass-nats StatefulSet may start before its Secret exists. The auth-users and policy deployments handle this explicitly with initWaitForSecrets / waitForSecretVolumes, which block until the Secrets land. A few seconds — or minutes, on a slow image pull — of ContainerCreating / Init:0/N during bootstrap is expected and self-resolves once the generators finish.

What's next