Configuration

The individual configuration statements are documented in the example YAML configuration file, which is reproduced at the end of this chapter.

The configuration consists of the following general sections:

  • Instance name
  • Storage backend for snapshots
  • LMDBs to sync
  • Sync parameter tweaking
  • Monitoring and logging

Lightning Stream supports environment variable expansion in the YAML configuration files, e.g.:

instance: ${LS_INSTANCE}

Instance name

Warning

Every instance MUST have a unique instance name.

This instance name is included in the snapshot filenames and is used by the instance to ignore its own snapshots.

If you accidentally configure two instances with the same name, the following will happen:

  • They will not see each other's changes, unless a third instance happens to include them in a snapshot.
  • Other instances may not see all the changes from both instances, because they will only load the most recent snapshot for this instance name.

The instance name can either be configured in the config file:

instance: unique-instance-name-here

Or it can be passed using the --instance command-line flag, but then you must take care to always pass the same name.

As mentioned above, environment variables can be used in the YAML configuration, which can be useful for the instance name:

instance: ${LS_INSTANCE}

The instance name should be composed of safe characters, like ASCII letters, digits, dashes and dots. It MUST NOT contain underscores, slashes, spaces or other special characters. As a rule of thumb, anything that is allowed in a hostname is safe.
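For example, a name derived from the hostname works well (the value shown is illustrative):

instance: server-1.example.org   # safe: letters, digits, dashes and dots
#instance: server_1              # NOT safe: contains an underscore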

Storage

Lightning Stream uses our Simpleblob library to support different storage backends. At the time of writing, it supports S3 and local filesystem backends.

S3 backend

This is currently the only backend that makes sense for a production environment. It stores snapshots in S3 or a compatible object store. We have tested it against Amazon AWS S3 and MinIO servers.

MinIO example for testing without TLS:

storage:
  type: s3
  options:
    access_key: minioadmin
    secret_key: minioadmin
    region: us-east-1
    bucket: lightningstream
    endpoint_url: http://localhost:9000

Currently available options:

  • access_key (string): S3 access key
  • secret_key (string): S3 secret key
  • region (string): S3 region (default: "us-east-1")
  • bucket (string): name of the S3 bucket
  • create_bucket (bool): create the bucket if it does not exist
  • global_prefix (string): transparently apply a global prefix to all names before storage
  • prefix_folders (bool): show folders in listings instead of recursively listing their contents
  • endpoint_url (string): use a custom endpoint URL, e.g. for MinIO
  • tls (tlsconfig.Config): TLS configuration
  • init_timeout (duration): time allowed for initialisation (default: "20s")
  • use_update_marker (bool): reduce the number of LIST commands; see below
  • update_marker_force_list_interval (duration): see below for details

The use_update_marker option can reduce your AWS S3 bill in small personal deployments without compromising update latency, because GET operations are ten times cheaper than LIST operations. It cannot reliably be used when a bucket mirroring mechanism keeps multiple buckets in sync.
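For example, a small single-bucket deployment might enable it as follows. The option names are taken from the list above; the interval value is purely illustrative:

storage:
  type: s3
  options:
    # ... credentials, region and bucket as in the earlier example ...
    use_update_marker: true
    # Illustrative: still perform a full LIST periodically as a safety net
    update_marker_force_list_interval: 5m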

You can find all the available S3 options with full descriptions in Simpleblob's S3 backend Options struct.

Filesystem backend

For local testing, it can be convenient to store all snapshots in a local directory instead of an S3 bucket:

storage:
  type: fs
  options:
    root_path: /tmp/snapshots

LMDBs

The lmdbs section configures which LMDB databases to sync. One Lightning Stream instance can sync more than one LMDB database. Snapshots are independent per LMDB.

Every database requires a name that must not change over time, as it is included in the snapshot filenames. The name should only contain lowercase letters and must not contain spaces, underscores or special characters. If you are bad at naming things and have only one database, "main" is a good safe choice.

A basic example for an LMDB with a native schema:

lmdbs:
  main:
    path: /path/to/lmdb/dir
    options:
      create: true
      map_size: 1GB
    schema_tracks_changes: true  # native schema

The path option is the filesystem path to the LMDB directory, or to a plain file if options.no_subdir is true.

Some commonly used LMDB options:

  • no_subdir: the LMDB does not use a directory, but a plain file (required for PowerDNS).
  • create: create the LMDB if it does not exist yet.
  • map_size: the LMDB map size, which is the maximum size the LMDB can grow to.
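
For example, a single-file LMDB like the one used by PDNS Auth might be configured as follows (the path is illustrative):

lmdbs:
  main:
    path: /var/lib/pdns/pdns.lmdb   # a plain file, because of no_subdir
    options:
      no_subdir: true
      create: true
      map_size: 1GB
    schema_tracks_changes: true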

The schema_tracks_changes option indicates whether the LMDB uses the native Lightning Stream schema.

If the LMDB does not support a native schema, you can use a configuration like this:

lmdbs:
  main:
    path: /path/to/lmdb/dir
    options:
      create: true
      map_size: 1GB
    schema_tracks_changes: false  # non-native schema
    dupsort_hack: true            # set this if the non-native LMDB uses MDB_DUPSORT

Do read the section on schemas to evaluate whether the LMDB schema in use is safe for syncing.

Sync parameters

There are some top-level sync parameters that you may want to tweak for specific deployments, but these do have sensible defaults, so you probably do not need to.

The ones you are more likely to want to change are:

  • storage_poll_interval: how often to list the storage to check for new snapshots (default: 1s)
  • storage_force_snapshot_interval: how often to force a snapshot when there are no changes (default: 4h)
  • lmdb_poll_interval: how often to check the LMDB for changes (default: 1s). The check itself is very fast, but you may want to increase this interval to limit the rate at which an instance generates snapshots when updates are frequent.
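
For example, a deployment that prefers fewer storage operations over minimal sync latency could relax these intervals (the values are illustrative):

lmdb_poll_interval: 5s
storage_poll_interval: 5s
storage_force_snapshot_interval: 6h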

More parameters, with descriptions, can be found in the example configuration.

Monitoring and logging

Example of a few logging and monitoring options:

# HTTP server with status page, Prometheus metrics and /healthz endpoint.
# Disabled by default.
http:
  address: ":8500"    # listen on port 8500 on all interfaces

# Logging configuration
log:
   level: info        # "debug", "info", "warning", "error", "fatal"
   format: human      # "human", "logfmt", "json"
   timestamp: short   # "short", "disable", "full"

health:
   # ... see example config

Example config with comments

This example configuration assumes a PowerDNS Authoritative server setup with native schemas, but it explains every available option.

# This is a Lightning Stream (LS) example configuration for use with the
# PowerDNS Authoritative DNS server (PDNS Auth) LMDB backend.
# This example configuration is based on version 4.8 of PDNS Auth, which
# uses a native LS compatible schema. This version has not been released yet as
# of Feb 2023.
# This example aims to document all available options. If an option is
# commented out, its default value is shown, unless indicated otherwise.

# Every instance of LS requires a unique instance name. This instance name
# is included in the snapshot filenames and used by instances to discover
# snapshots by other instances.
# LS supports the expansion of OS environment variables in YAML configs, with
# values like ${INSTANCE}, which can simplify the management of multiple
# instances.
instance: unique-instance-name

# Check the LMDBs for newly written transactions at this interval, and write
# a new snapshot if anything has changed. The check is very cheap.
#lmdb_poll_interval: 1s

# Periodically log LMDB statistics.
# Useful when investigating issues based on logs. Defaults to 30m.
lmdb_log_stats_interval: 5m
# Include LMDB memory usage statistics from /proc/$PID/smaps for metrics.
# This can be expensive on some older kernel versions when a lot of memory
# is mapped.
#lmdb_scrape_smaps: true

# Check the storage for new snapshots at this interval
#storage_poll_interval: 1s

# When a download or upload fails, these settings determine how often and at what
# interval we will retry, before exiting LS with an error.
#storage_retry_interval: 5s
#storage_retry_count: 100
#storage_retry_forever: false

# Force a snapshot once in a while, even if there were no local changes, so
# that this instance will not be seen as stale, or removed by external cleaning
# actions.
# Note that external cleaning mechanisms are not recommended; it is safer to use
# the 'storage.cleanup' section.
#storage_force_snapshot_interval: 4h

# MemoryDownloadedSnapshots defines how many downloaded compressed snapshots
# we are allowed to keep in memory for each database (minimum: 1, default: 3).
# Setting this higher allows us to keep downloading snapshots for different
# instances, even if one download is experiencing a hiccup.
# These will transition to 'memory_decompressed_snapshots' once a slot opens
# up in there.
# Increasing this can speed up processing at the cost of memory.
#memory_downloaded_snapshots: 3

# MemoryDecompressedSnapshots defines how many decompressed snapshots
# we are allowed to keep in memory for each database (minimum: 1, default: 2).
# Keep in mind that decompressed snapshots are typically 3-10x larger than
# the downloaded compressed snapshots.
# Increasing this can speed up processing at the cost of memory.
#memory_decompressed_snapshots: 2

# Run a single merge cycle and then exit.
# Equivalent to the --only-once flag.
#only_once: false

# The 'lmdbs' section defines which LMDB databases need to be synced. LS will
# start one internal syncer instance per database.
# The keys in this section ('main' and 'shard' here) are arbitrary names
# assigned to these databases. These names are used in logging and in the
# snapshot filenames, so they must match between instances and not be changed
# later.
lmdbs:
  # In PDNS Auth, this database contains all the data, except for the records.
  main:
    # Path to the LMDB database. This is typically the directory containing
    # a 'data.mdb' and 'lock.mdb' file, but PDNS Auth uses the 'no_subdir'
    # option, in which case this is a path to the data file itself.
    path: /path/to/pdns.lmdb

    # LMDB environment options
    options:
      # If set, the LMDB path refers to a file, not a directory.
      # This is required for PDNS Auth.
      no_subdir: true

      # Create the LMDB if it does not exist yet.
      create: true

      # Optional directory mask when creating a new LMDB. 0 means default.
      #dir_mask: 0775
      # Optional file mask when creating a new LMDB. 0 means default.
      #file_mask: 0664

      # The LMDB mapsize when creating the LMDB. This is the amount of memory
      # that can be used for LMDB data pages and limits the file size of an
      # LMDB. Keep in mind that an LMDB file can eventually grow to its mapsize.
      # A value of 0 means 1GB when creating a new LMDB.
      #map_size: 1GB

      # The maximum number of named DBIs within the LMDB. 0 means default.
      #max_dbs: 64

    # This indicates that the application natively supports LS headers on all
    # its database values. PDNS Auth supports this starting from version 4.8.
    # Earlier versions required this to be set to 'false'.
    # Application requirements include, but are not limited to:
# - Every value is prefixed with a 24+ byte LS header.
    # - Deleted entries are recorded with the same timestamp and a Deleted flag.
    # When enabled, a shadow database is no longer needed to merge snapshots, and
    # conflict resolution is both more accurate and more efficient.
    schema_tracks_changes: true

# Older versions of PDNS Auth (4.7) require this to be enabled to handle
# the MDB_DUPSORT DBIs they use. Newer versions have a native LS schema.
    # Not compatible with schema_tracks_changes=true.
    #dupsort_hack: false

    # (DO NOT USE) For development only: force an extra padding block in the
    # header to test if the application handles this correctly.
    #header_extra_padding_block: false

    # This allows setting options per-DBI.
# Currently, the only supported option is 'override_create_flags', which
# should only be used when you need options.create=true combined with
# snapshots created by a pre-0.3.0 version of LS. Newer snapshots
# have all the information they need to create new DBIs.
    dbi_options: {}
      # Example use to create new LMDBs from old snapshots of older PDNS Auth
      # 4.7 LMDBs. This is not needed for any new deployment with PDNS Auth
      # 4.8.
      #pdns:
      #  override_create_flags: 0
      #domains:
      #  override_create_flags: MDB_INTEGERKEY
      #domains_0:
      #  override_create_flags: MDB_DUPSORT|MDB_DUPFIXED
      #keydata:
      #  override_create_flags: MDB_INTEGERKEY
      #keydata_0:
      #  override_create_flags: MDB_DUPSORT|MDB_DUPFIXED
      #metadata:
      #  override_create_flags: MDB_INTEGERKEY
      #metadata_0:
      #  override_create_flags: MDB_DUPSORT|MDB_DUPFIXED
      #tsig:
      #  override_create_flags: MDB_INTEGERKEY
      #tsig_0:
      #  override_create_flags: MDB_DUPSORT|MDB_DUPFIXED

  # In PDNS Auth, this database contains all the records.
  # The various options available are the same as in the 'lmdbs.main' section above.
  shard:
    path: /path/to/pdns.lmdb-0
    options:
      no_subdir: true
      create: true

    # Example use to create new LMDBs from old snapshots of older PDNS Auth
    # 4.7 LMDBs. This is not needed for any new deployment with PDNS Auth
    # 4.8.
    #dbi_options:
    #  records:
    #    override_create_flags: 0

# Storage configures where LS stores its snapshots
storage:
  # For the available backend types and options, please
  # check https://github.com/PowerDNS/simpleblob

  # Example with S3 storage in MinIO running on localhost.
  # For the available S3 backend options, check the Options struct in
  # https://github.com/PowerDNS/simpleblob/blob/main/backends/s3/s3.go#L43
  type: s3
  options:
    access_key: minioadmin
    secret_key: minioadmin
    region: us-east-1
    bucket: lightningstream
    endpoint_url: http://localhost:9000

  # Example with local file storage for local testing and development
  #type: fs
  #options:
  #  root_path: /path/to/snapshots

  # Periodic snapshot cleanup. This cleans old snapshots from all instances,
  # including stale ones. Multiple instances can safely try to clean the same
  # snapshots at the same time.
  # LS only really needs the latest snapshot of an instance, but we want to keep
  # older snapshots for a short interval in case a slower instance is still
  # trying to download them.
  # This is disabled by default, but highly recommended.
  cleanup:
    # Enable the cleaner
    enabled: true
    # Interval to check if snapshots need to be cleaned. Some perturbation
    # is added to this interval so that multiple instances started at exactly
    # the same time do not always try to clean the same snapshots at the same
    # time.
    interval: 5m
    # Snapshots must have been available for at least this interval before they
    # are considered for cleaning, so that slower instances have a chance to
    # download them.
    must_keep_interval: 10m
    # Remove stale instances without newer snapshots after this interval, but
    # only after we are sure this instance has downloaded and merged that
    # snapshot, and subsequently written a new snapshot that incorporates these
    # changes.
    remove_old_instances_interval: 168h   # 1 week

# HTTP server with status page, Prometheus metrics and /healthz endpoint.
# Disabled by default.
http:
  address: ":8500"    # listen on port 8500 on all interfaces

# Logging configuration
# LS uses https://github.com/sirupsen/logrus internally
log:
   level: info        # "debug", "info", "warning", "error", "fatal"
   format: human      # "human", "logfmt", "json"
   timestamp: short   # "short", "disable", "full"

# Health checkers for /healthz endpoint
# This is always enabled. This section allows tweaking the intervals.
#health:
  # Check if we can list the storage buckets
  #storage_list:
  #  interval: 5s             # Check every 5 seconds
  #  warn_duration: 1m0s      # Trigger a warning after 1 minute of failures
  #  error_duration: 5m0s     # Trigger an error after 5 minutes of failures
  #
  # Check if we successfully load snapshots
  #storage_load:
  #  interval: 5s
  #  warn_duration: 1m0s
  #  error_duration: 5m0s
  #
  # Check if we successfully store snapshots
  #storage_store:
  #  interval: 5s
  #  warn_duration: 1m0s
  #  error_duration: 5m0s
  #
  # Check if we started up and are ready to handle real traffic according to
  # some checks, like having loaded all available snapshots.
  #start:
  #  interval: 1s
  #  warn_duration: 1m0s
  #  error_duration: 5m0s
  #  # If true, a failing startup sequence will be included in the healthz
  #  # overall status. This can be used to prevent marking the node ready
  #  # before Lightning Stream has completed an initial sync.
  #  report_healthz: false
  #  # Controls if the healthz 'startup_[db name]' metadata field will be used
  #  # to report the status of the startup sequence for each db.
  #  report_metadata: true