NAME

dstore-dist-top-reporter - daemon to generate reports based on protobuf messages from recursor and dnsdist

SYNOPSIS

dstore-dist-top-reporter [-config file] [-debug] [-trace] [-help]

DESCRIPTION

dstore-dist-top-reporter generates reports from the protobuf messages that are generated by recursor and dnsdist. It is configured using a YAML-based configuration file.

dstore-dist-top-reporter is configured with a set of streams, which consist of all the possible message inputs for reports. Each stream can have different sampling rates applied upstream, and so it is important to configure the stream with the correct sampling rate so that the statistics can be correctly calculated.
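
As a rough illustration of why the configured rate matters, the sketch below scales a sampled count back up to an estimate of the true count. It assumes that an upstream_sampling value of N means the upstream forwards roughly 1 in N messages. The function name estimate_total is hypothetical, and this is not necessarily how the daemon computes its statistics.

```python
def estimate_total(observed_count, upstream_sampling):
    """Scale a sampled count up to an estimate of the unsampled count.

    Assumption: a sampling rate of N means about 1 in N messages is forwarded.
    """
    if upstream_sampling < 1:
        raise ValueError("upstream_sampling must be at least 1")
    return observed_count * upstream_sampling


# 42 messages seen on a stream sampled at 1 in 100
print(estimate_total(42, 100))
# 42 messages seen on an unsampled stream
print(estimate_total(42, 1))
```

Under this assumption, configuring a rate of 1 for a stream that is actually sampled at 100 would understate every count by a factor of 100.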

It is also configured with a set of reports. Each report is based on a stream; reports differ in the key used to generate the report, in how often the report is generated, and in the number of “top-N” entries output in the report.

An important concept for summary reports (i.e. when the output of many reports will be summarized over a much longer time period) is “oversampling”, i.e. specifying a much higher number for “N” than is needed in the final report. Entries that never reach the top of any single short report can still rank highly over the longer period, so oversampling makes the summarized report much more accurate than it would be otherwise.
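
The effect of oversampling can be seen in a small simulation. The sketch below truncates each interval's counts to a Top-N list and then sums those lists into a summary, which is one plausible way to summarize reports and not necessarily what any downstream tooling does. The names top_n and summarize and the sample data are made up for illustration.

```python
from collections import Counter


def top_n(counts, n):
    """Keep only the n highest-count entries of one interval."""
    return Counter(dict(Counter(counts).most_common(n)))


def summarize(intervals, per_interval_n, final_n):
    """Sum per-interval Top-N lists and return the overall Top-final_n."""
    merged = Counter()
    for counts in intervals:
        merged.update(top_n(counts, per_interval_n))
    return merged.most_common(final_n)


# steady.example is never in the top 2 of any interval,
# but it has the highest total across all three intervals.
intervals = [
    {"a.example": 10, "b.example": 9, "steady.example": 8},
    {"c.example": 10, "d.example": 9, "steady.example": 8},
    {"e.example": 10, "f.example": 9, "steady.example": 8},
]

# Without oversampling (N=2 per interval) steady.example is lost entirely.
print(summarize(intervals, per_interval_n=2, final_n=2))
# With oversampling (N=3 per interval) it is correctly ranked first.
print(summarize(intervals, per_interval_n=3, final_n=2))
```

In this toy data the domain with the largest overall total only appears in the summary when each short report keeps more entries than the final report needs.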

Finally, the storage backends for reports can be configured: reports can be sent to HTTP and/or Elasticsearch backends. By default, reports are sent to all configured backends, but a storage backend can be configured to accept only specific reports.

OPTIONS

-config file: Load configuration from file.

-debug: Generate debug logging.

-trace: Generate trace logging.

-help: Display a help message and exit.

CONFIGURATION FILE FORMAT

The following YAML fields are supported for configuration:

  • streams: An array in which each element represents a different input stream that can be processed differently. Each stream is associated with a different listen address and, potentially, a different sampling rate.

    • name: The name of the stream, used as a key for generating reports (this must not contain spaces).

    • title: A descriptive title for the stream.

    • upstream_sampling: The sampling rate used by the upstream. Use 1 if no upstream sampling is done.

    • address: The address (optional) and port to listen on for this stream.

    • tlsconfig: This field is required to enable TLS. The following fields are used to configure TLS. A certificate and key must be specified.

      • insecure_skip_verify: Controls whether a client verifies the server’s certificate chain and hostname; if true, verification is skipped. Defaults to false.
      • ca_file: Optional CA file to use (PEM).
      • ca: Optional CA to use, specified as a string in PEM format.
      • add_system_ca_pool: If set, adds the system CA pool when private CAs are configured. Defaults to false.
      • cert_file: Optional certificate file to use (PEM).
      • cert: Optional certificate to use, specified as a string in PEM format.
      • key_file: Optional key file to use (PEM).
      • key: Optional key to use, specified as a string in PEM format.
      • watch_certs: If true, enables background reloading of certificate files. Defaults to false.
      • watch_certs_poll_interval: If watch_certs is true, how often to check for changes. Defaults to 5 seconds.

streams:
- name: all-queries
  title: "All traffic (sampled)"
  address: ":4801"
  tlsconfig:
    cert_file: /etc/dstore-dist/mycert.crt
    key_file: /etc/dstore-dist/private.key
  upstream_sampling: 100
- name: malware
  title: "Queries tagged as malware"
  address: ":4802"
  upstream_sampling: 10

  • reports: An array in which each element represents a different report to be generated. Each report can use a different stream as input and a different field as its key.

    • name: The name of the report (this must not contain spaces).

    • field: The field to use as the key for the report. field can take one of the following values:

      • qname: The lowercase DNS question name
      • qname/raw: The raw qname, not converted to lowercase
      • qname/suffix: The public suffix of the qname (e.g. .com, .co.uk, etc.)
      • qname/suffix+1: The public suffix plus one label (e.g. example.com, example.co.uk, etc.)
      • qname/tld: The TLD (e.g. com, uk, etc.)
      • requestorid: The subscriber’s username
      • ip/<prefix32>/<prefix64>: The IP address of the client, aggregated to the specified IPv4/IPv6 prefix lengths. For example, ip/32/128 performs no aggregation of IPv4 or IPv6 addresses.

    • n: The reports generated are all of the “Top-N” format, and this specifies N, for example 100 for a Top-100. However, oversampling is recommended, particularly when aggregating multiple Top-N reports generated over short time periods into a single report covering a longer time period. For example, to gather enough statistics to summarize reports generated at 30-second intervals into a 24-hour “Top-100” report, you should specify at least 1000 for N.

    • interval: How often to generate the report. Report data is held in memory until the report is generated, so a long interval may mean high RAM usage, and data may be lost if the server crashes between reports.

    • stream: The name of the stream to use as input for the report.

reports:
- name: all-domains
  field: qname
  n: 100
  stream: all-queries
  interval: 30s
- name: all-users
  field: requestorid
  n: 100
  stream: all-queries
  interval: 30s
- name: malware-domains
  field: qname
  n: 1000
  stream: malware
  interval: 30s
- name: all-ips
  field: ip/32/64
  n: 1000
  interval: 30s
  stream: all-queries
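
The prefix aggregation used by the ip field can be pictured as masking each client address down to its containing network. The sketch below uses Python's standard ipaddress module to do this. The function name aggregate_ip is hypothetical, and the CIDR-style output is an assumption; the daemon may format its keys differently.

```python
import ipaddress


def aggregate_ip(address, v4_prefix, v6_prefix):
    """Return the network containing address, using the prefix length
    that matches the address family (IPv4 or IPv6)."""
    ip = ipaddress.ip_address(address)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False allows host bits to be set; they are masked off.
    network = ipaddress.ip_network("%s/%d" % (ip, prefix), strict=False)
    return str(network)


# Equivalent in spirit to ip/24/64: group IPv4 clients per /24 and IPv6 per /64
print(aggregate_ip("192.0.2.57", 24, 64))
print(aggregate_ip("2001:db8:1:2:3:4:5:6", 24, 64))
# Equivalent in spirit to ip/32/128: no aggregation
print(aggregate_ip("192.0.2.57", 32, 128))
```

A shorter IPv6 prefix such as 64 is a common choice because a single subscriber is often assigned many addresses within one network.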

  • storage: An array that specifies the storage backends for reports.
    • name: The name of the storage backend (this must not contain spaces).
    • backend: The type of backend, one of “http” or “elastic”.
    • reports: An array of report names. If omitted, all reports are sent to this backend.
    • skip_empty: Skip generating a report if there are no entries.
    • method: Applies to the HTTP backend only; can be POST (default) or PUT.
    • url: For the HTTP backend this is the exact URL; for the elastic backend this is the base URL of the Elasticsearch instance. In this field and in the elastic templates, the following template variables are available: TimestampISO, TimestampUnix, TimestampDate, ReportName.
    • elastic_index_template: Applies to the elastic backend only; specifies the index template. Defaults to the report name.
    • elastic_id_template: Applies to the elastic backend only; specifies a template for the elastic IDs. If empty, IDs will be generated automatically.
    • elastic_single_doc: Applies to the elastic backend only; specifies that all entries for a report should be placed in a single document instead of the default of one document per entry.
    • username: The HTTP basic auth username.
    • password: The HTTP basic auth password.
    • headers: A map containing custom HTTP headers.
    • tlsconfig: Specifies TLS options, as per the tlsconfig map above.
    • retry_max: Maximum number of retries. By default no retries are attempted.

storage:
  - name: elasticsearch
    backend: elastic
    skip_empty: true
    url: http://127.0.0.1:9200/
    elastic_index_template: "{{.ReportName}}-{{.TimestampDate}}"
  - name: http
    backend: http
    url: https://example.com:8080/storage
    username: foo
    password: bar
    tlsconfig:
      ca_file: /etc/dstore_dist/ca_file.pem
    reports:
    - malware-domains

  • http: The HTTP configuration, used to set the address to listen on for metrics and for the status page used to view and download reports.
    • address: The address (optional) and port to listen on.

http:
  address: ":8701"
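
Because reports refer to streams by name and storage backends refer to reports by name, a small typo (for example an underscore in place of a hyphen) can leave a reference pointing at nothing. The sketch below checks an already-parsed configuration, represented as plain Python dictionaries, for such dangling names. The function find_dangling_references is hypothetical; whether the daemon performs a similar check at startup is not stated here.

```python
def find_dangling_references(config):
    """Return a list of human-readable problems found in a parsed config."""
    stream_names = {s["name"] for s in config.get("streams", [])}
    report_names = {r["name"] for r in config.get("reports", [])}
    problems = []
    for report in config.get("reports", []):
        if report.get("stream") not in stream_names:
            problems.append(
                "report %r uses unknown stream %r"
                % (report["name"], report.get("stream"))
            )
    for backend in config.get("storage", []):
        # An omitted "reports" list means the backend accepts every report.
        for name in backend.get("reports", []):
            if name not in report_names:
                problems.append(
                    "storage %r lists unknown report %r" % (backend["name"], name)
                )
    return problems


config = {
    "streams": [{"name": "malware"}],
    "reports": [{"name": "malware-domains", "stream": "malware"}],
    "storage": [
        {"name": "elasticsearch", "backend": "elastic"},
        # Deliberate typo: underscore instead of hyphen
        {"name": "http", "backend": "http", "reports": ["malware_domains"]},
    ],
}
for problem in find_dangling_references(config):
    print(problem)
```

The same dictionaries could be produced by loading the YAML file with any YAML parser before running the check.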