Performance and Tuning

In general, the best performance is achieved on recent Linux kernels with the bindbackend or, if something more database-like is preferred, the LMDB backend. That said, many of the largest PowerDNS installations are based on PostgreSQL or MySQL.

Database servers can require configuration to achieve decent performance. It is especially worth noting that several vendors ship PostgreSQL with a slow default configuration.

Warning

When deploying (large scale) IPv6, please be aware some Linux distributions leave IPv6 routing cache tables at very small default values. Please check and if necessary raise sysctl net.ipv6.route.max_size.
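The check described in the warning above can be scripted. The following is a minimal sketch, assuming a Linux host where the sysctl is exposed under /proc/sys; the helper names and the threshold of 16384 are illustrative placeholders, not PowerDNS recommendations.

```python
from pathlib import Path

# procfs location of the net.ipv6.route.max_size sysctl on Linux.
SYSCTL_PATH = Path("/proc/sys/net/ipv6/route/max_size")


def read_sysctl_int(path=SYSCTL_PATH):
    """Return the integer stored in a sysctl file, or None if it is unavailable."""
    try:
        return int(Path(path).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def route_cache_too_small(value, minimum):
    """True when a known value is below the minimum chosen by the operator."""
    return value is not None and value < minimum


if __name__ == "__main__":
    current = read_sysctl_int()
    # 16384 is a placeholder threshold (assumption); pick one that fits your traffic.
    if route_cache_too_small(current, minimum=16384):
        print(f"net.ipv6.route.max_size={current} looks small; consider raising it")
    else:
        print(f"net.ipv6.route.max_size={current}")
```

The script only reads the value; raising it is still done with sysctl as described above.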

Packet Cache

PowerDNS by default uses the ‘Packet Cache’ to recognise identical questions and supply them with identical answers, without any further processing. The default time to live is 20 seconds and can be changed by setting cache-ttl. It has been observed that the utility of the packet cache increases with the load on your nameserver.

Not all backends may benefit from the packet cache. If your backend is memory based and does not lead to context switches, the packet cache may actually hurt performance.

Query Cache

Besides entire packets, PowerDNS can also cache individual backend queries. Each DNS query leads to a number of backend queries, the most obvious of which is the check for a possible CNAME. So, when a query comes in for the ‘A’ record for ‘www.powerdns.com’, PowerDNS must first check for a CNAME for ‘www.powerdns.com’.

The Query Cache caches these backend queries, many of which are quite repetitive. The maximum number of entries in the cache is controlled by the max-cache-entries setting. Before 4.1 this setting also controlled the maximum number of entries in the packet cache.

Most gain is made from caching negative entries, i.e. queries that have no answer. As these take little memory to store and are typically not a real problem in terms of speed of propagation, the default TTL for negative queries is a rather high 60 seconds.

This is only a problem when first doing a query for a record, adding it, and immediately doing a query for that record again. It may then take up to 60 seconds to appear. Changes to existing records, however, do not fall under the negative query TTL (negquery-cache-ttl), but under the generic query-cache-ttl, which defaults to 20 seconds.
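The interaction between the two TTLs can be illustrated with a toy model. This is a sketch only, not the PowerDNS implementation: the class name, its methods and the injectable clock are hypothetical, and the defaults of 20 and 60 seconds simply mirror the values mentioned above.

```python
import time


class QueryCacheSketch:
    """Toy TTL cache with separate lifetimes for positive and negative entries."""

    def __init__(self, ttl=20, negative_ttl=60, clock=time.monotonic):
        self.ttl = ttl                    # plays the role of query-cache-ttl
        self.negative_ttl = negative_ttl  # plays the role of negquery-cache-ttl
        self.clock = clock
        self._entries = {}                # (name, qtype) -> (expiry, records)

    def put(self, name, qtype, records):
        """Store a backend result; an empty result is kept as a negative entry."""
        lifetime = self.ttl if records else self.negative_ttl
        self._entries[(name.lower(), qtype)] = (self.clock() + lifetime, list(records))

    def get(self, name, qtype):
        """Return cached records, [] for a cached miss, or None if not cached."""
        key = (name.lower(), qtype)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expiry, records = entry
        if self.clock() >= expiry:
            del self._entries[key]        # expired: the backend must be asked again
            return None
        return records


if __name__ == "__main__":
    now = [0.0]                                  # fake clock for the demonstration
    cache = QueryCacheSketch(clock=lambda: now[0])
    cache.put("new.example.com", "A", [])        # the backend had no such record
    now[0] = 59
    print(cache.get("new.example.com", "A"))     # [] : the cached miss still applies
    now[0] = 60
    print(cache.get("new.example.com", "A"))     # None : expired, backend is queried
```

In this model a record added right after the first lookup stays invisible until the negative entry expires, which is the delay of up to 60 seconds described above.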

The default values should work fine for many sites. When tuning, keep in mind that the Query Cache mostly saves database access but that the Packet Cache also saves a lot of CPU because zero internal processing is done when answering a question from the Packet Cache.

Caches & Memory Allocations & glibc

Managing the two caches described above involves a lot of memory management, which is handled by malloc in your libc. To avoid contention between threads, the allocator in glibc divides memory into separate arenas, sometimes even hundreds of them. This avoids locking, but it may cause massive memory fragmentation, which could make PowerDNS take an order of magnitude more memory in some situations.

If you suspect this is happening on your setup, you can consider lowering MALLOC_ARENA_MAX to a small number. Several users have reported that 4 works well for them. Via systemctl edit pdns you can put Environment=MALLOC_ARENA_MAX=4 in your pdns unit file to enable this tweak.

Note that newer glibc versions replace MALLOC_ARENA_MAX with a different setting syntax: GLIBC_TUNABLES=glibc.malloc.arena_max=4. Please check which syntax is valid for your glibc version (it is quite likely that both syntaxes will work).
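As a rough illustration of the two syntaxes, the sketch below renders the text that could be pasted into the editor opened by systemctl edit pdns. The function name and its parameters are hypothetical; the [Service] header is where systemd expects Environment= lines.

```python
def arena_dropin(arena_max=4, use_tunables=False):
    """Render a systemd drop-in that caps the number of glibc malloc arenas."""
    if arena_max < 1:
        raise ValueError("arena_max must be at least 1")
    if use_tunables:
        # Newer glibc tunables syntax.
        setting = f"GLIBC_TUNABLES=glibc.malloc.arena_max={arena_max}"
    else:
        # Classic environment variable.
        setting = f"MALLOC_ARENA_MAX={arena_max}"
    return f"[Service]\nEnvironment={setting}\n"


if __name__ == "__main__":
    print(arena_dropin())
    print(arena_dropin(use_tunables=True))
```

The daemon has to be restarted before a changed environment takes effect.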

Performance Monitoring

A number of counters and variables are set during PowerDNS Authoritative Server operation.

Counters

All counters that show the “number of X” count since the last startup of the daemon.

corrupt-packets

Number of corrupt packets received

deferred-cache-inserts

Number of cache inserts that were deferred because of maintenance

deferred-cache-lookup

Number of cache lookups that were deferred because of maintenance

deferred-packetcache-inserts

Number of packet cache inserts that were deferred because of maintenance

deferred-packetcache-lookup

Number of packet cache lookups that were deferred because of maintenance

dnsupdate-answers

Number of DNS update packets successfully answered

dnsupdate-changes

Total number of changes to records from DNS update

dnsupdate-queries

Number of DNS update packets received

dnsupdate-refused

Number of DNS update packets that were refused

incoming-notifications

Number of NOTIFY packets that were received

key-cache-size

Number of entries in the key cache

latency

Average number of microseconds a packet spends within PowerDNS

meta-cache-size

Number of entries in the metadata cache

open-tcp-connections

Number of currently open TCP connections

overload-drops

Number of questions dropped because backends overloaded (backends are overloaded if they have more outstanding queries than the value of overload-queue-length)

packetcache-hit

Number of packets which were answered out of the cache

packetcache-miss

Number of times a packet could not be answered out of the cache

packetcache-size

Number of packets in the packet cache

qsize-q

Number of packets waiting for database attention, only available if distributor-threads > 1

query-cache-hit

Number of hits on the Query Cache

query-cache-miss

Number of misses on the Query Cache

query-cache-size

Number of entries in the query cache

rd-queries

Number of packets sent by clients requesting recursion (regardless of whether we’ll be providing them with recursion).

receive-latency

Average number of microseconds needed to receive a query

recursing-answers

Number of packets we supplied an answer to after recursive processing

recursing-questions

Number of packets we performed recursive processing for.

recursion-unanswered

Number of packets we sent to our recursor, but did not get a timely answer for.

security-status

Security status based on Security Polling.

send-latency

Average number of microseconds needed to send the answer

servfail-packets

Number of packets that could not be answered due to database problems

signature-cache-size

Number of entries in the signature cache

signatures

Number of DNSSEC signatures created

sys-msec

Number of CPU milliseconds spent in system time

tcp-answers-bytes

Total number of answer bytes sent over TCP

tcp-answers

Number of answers sent out over TCP

tcp-queries

Number of questions received over TCP

tcp4-answers-bytes

Total number of answer bytes sent over TCPv4

tcp4-answers

Number of answers sent out over TCPv4

tcp4-queries

Number of questions received over TCPv4

tcp6-answers-bytes

Total number of answer bytes sent over TCPv6

tcp6-answers

Number of answers sent out over TCPv6

tcp6-queries

Number of questions received over TCPv6

timedout-packets

Number of packets that were dropped because they had to wait too long internally

udp-answers-bytes

Total number of answer bytes sent over UDP

udp-answers

Number of answers sent out over UDP

udp-do-queries

Number of queries received with the DO (DNSSEC OK) bit set

udp-in-csum-errors

Number of UDP packets received with an invalid checksum

udp-in-errors

Number of packets received faster than the OS could process them

udp-noport-errors

Number of UDP packets where an ICMP response was received that the remote port was not listening

udp-queries

Number of questions received over UDP

udp-recvbuf-errors

Number of errors caused in the UDP receive buffer

udp-sndbuf-errors

Number of errors caused in the UDP send buffer

udp4-answers-bytes

Total number of answer bytes sent over UDPv4

udp4-answers

Number of answers sent out over UDPv4

udp4-queries

Number of questions received over UDPv4

udp6-answers-bytes

Total number of answer bytes sent over UDPv6

udp6-answers

Number of answers sent out over UDPv6

udp6-in-csum-errors

Number of IPv6 UDP packets received with an invalid checksum

udp6-in-errors

Number of IPv6 UDP packets received faster than the OS could process them

udp6-noport-errors

Number of IPv6 UDP packets where an ICMP response was received that the remote port was not listening

udp6-queries

Number of questions received over UDPv6

udp6-recvbuf-errors

Number of errors caused in the IPv6 UDP receive buffer

udp6-sndbuf-errors

Number of errors caused in the IPv6 UDP send buffer

uptime

Uptime in seconds of the daemon

user-msec

Number of milliseconds spent in CPU ‘user’ time
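When tuning the caches, the hit and miss counters listed above can be combined into hit ratios. The sketch below assumes the counters were exported as a comma-separated name=value string; that input format, the function names and the sample numbers are assumptions for illustration, so adapt the parser to whatever your monitoring path actually delivers.

```python
def parse_counters(text):
    """Parse 'name=value,name=value,...' into a dict of integers."""
    counters = {}
    for item in text.strip().strip(",").split(","):
        name, _, value = item.partition("=")
        name, value = name.strip(), value.strip()
        if name and value.lstrip("-").isdigit():
            counters[name] = int(value)
    return counters


def hit_ratio(hits, misses):
    """Fraction of lookups served from the cache, or None without any lookups."""
    total = hits + misses
    return hits / total if total else None


def cache_report(counters):
    """Hit ratios for the Packet Cache and the Query Cache."""
    return {
        "packet-cache": hit_ratio(counters.get("packetcache-hit", 0),
                                  counters.get("packetcache-miss", 0)),
        "query-cache": hit_ratio(counters.get("query-cache-hit", 0),
                                 counters.get("query-cache-miss", 0)),
    }


if __name__ == "__main__":
    # Made-up sample values, not measurements.
    sample = ("packetcache-hit=900,packetcache-miss=100,"
              "query-cache-hit=30,query-cache-miss=10,")
    print(cache_report(parse_counters(sample)))
```

Because the counters accumulate since the last startup of the daemon, a ratio over a recent interval requires subtracting two snapshots before calling hit_ratio.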

Ring buffers

Besides counters, PowerDNS also maintains ringbuffers. A ringbuffer records events; each new event gets a place in the buffer until it is full. When full, earlier entries get overwritten, hence the name ‘ring’.

By counting the entries in the buffer, statistics can be generated. These statistics can currently only be viewed using the webserver and are in fact not even collected without the webserver running.
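The idea of a fixed-size buffer whose entries are counted to produce statistics can be sketched in a few lines. This is a conceptual model only, not PowerDNS code; the class and method names are hypothetical.

```python
from collections import Counter, deque


class RingBufferSketch:
    """Fixed-size event buffer; the oldest entries are overwritten when full."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest item on overflow.
        self._events = deque(maxlen=capacity)

    def record(self, event):
        """Store one event, evicting the oldest one if the buffer is full."""
        self._events.append(event)

    def top(self, n=10):
        """Most frequent events currently in the buffer, as (event, count) pairs."""
        return Counter(self._events).most_common(n)


if __name__ == "__main__":
    ring = RingBufferSketch(capacity=3)
    for query in ["a.example.com/AAAA", "b.example.com/AAAA",
                  "a.example.com/AAAA", "a.example.com/AAAA"]:
        ring.record(query)
    # The first event has been overwritten, so only three entries are counted.
    print(ring.top())
```

The statistics therefore describe only the most recent events, with the window size set by the buffer capacity.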

The following ringbuffers are available:

  • logmessages: All messages logged
  • noerror-queries: Queries for existing records but for a type we don’t have, for example a query for the AAAA record of a domain when only an A record is available. Queries are listed in the format name/type, so an AAAA query for pdns.powerdns.com looks like pdns.powerdns.com/AAAA.
  • nxdomain-queries: Queries for non-existing records within existing domains. If PowerDNS knows it is authoritative over a domain, and it sees a question for a record in that domain that does not exist, it is able to send out an authoritative ‘no such domain’ message. Indicates that hosts are trying to connect to services really not in your zone.
  • udp-queries: All UDP queries seen.
  • remotes: Remote server IP addresses. Number of hosts querying PowerDNS. Be aware that UDP is anonymous - person A can send queries that appear to be coming from person B.
  • remote-corrupts: Remotes sending corrupt packets. Hosts sending PowerDNS broken packets, possibly meant to disrupt service. Be aware that UDP is anonymous - person A can send queries that appear to be coming from person B.
  • remote-unauth: Remotes querying domains for which we are not authoritative. It may happen that there are misconfigured hosts on the internet which are configured to think that a PowerDNS installation is in fact a resolving nameserver. These hosts will not get useful answers from PowerDNS. This buffer lists hosts sending queries for domains which PowerDNS does not know about.
  • servfail-queries: Queries that could not be answered due to backend errors. For one reason or another, a backend may be unable to extract answers for a certain domain from its storage. This may be due to a corrupt database or to inconsistent data. When this happens, PowerDNS sends out a ‘servfail’ packet indicating that it was unable to answer the question. This buffer shows which queries have been causing servfails.
  • unauth-queries: Queries for domains that we are not authoritative for. If a domain is delegated to a PowerDNS instance, but the backend is not made aware of this fact, questions come in for which no answer is available, nor is the authority. Use this ringbuffer to spot such queries.

Sending metrics to Graphite/Metronome over Carbon

For carbon/graphite/metronome, we use the following namespace. Everything starts with ‘pdns.’, which is then followed by the local hostname. Thirdly, we add ‘auth’ to signify the daemon generating the metrics. This is then rounded off with the actual name of the metric. As an example: ‘pdns.ns1.auth.questions’.

Care has been taken to make the sending of statistics as unobtrusive as possible: the daemons will not be hindered by an unreachable carbon server, timeouts or connection refused situations.

To benefit from our carbon/graphite support, either install Graphite, or use our own lightweight statistics daemon, Metronome, currently available on GitHub.

To enable sending metrics, set carbon-server, possibly carbon-interval and possibly carbon-ourname in the configuration.

Warning

If your hostname includes dots, they will be replaced by underscores so as not to confuse the namespace.

If you include dots in carbon-ourname, they will not be replaced by underscores, as PowerDNS assumes you know what you are doing if you override your hostname.
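The naming rules above can be summarised in a short sketch. The function names are hypothetical and the code is not taken from PowerDNS. The line format is the carbon plaintext protocol (path, value, Unix timestamp, newline), and port 2003 is carbon's usual plaintext port, which is an assumption about your setup.

```python
import socket
import time


def metric_path(metric, hostname, ourname=None):
    """Build 'pdns.<host>.auth.<metric>' following the namespace described above."""
    # Dots in the local hostname become underscores; an explicit override
    # (the role of carbon-ourname) is used verbatim, dots included.
    host = ourname if ourname else hostname.replace(".", "_")
    return f"pdns.{host}.auth.{metric}"


def carbon_line(path, value, timestamp=None):
    """Format one sample in the carbon plaintext protocol."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"


def send_lines(lines, server, port=2003, timeout=2.0):
    """Best-effort send; returns False instead of raising on network errors."""
    try:
        with socket.create_connection((server, port), timeout=timeout) as sock:
            sock.sendall("".join(lines).encode("ascii"))
        return True
    except OSError:
        return False


if __name__ == "__main__":
    path = metric_path("questions", "ns1.example.com")
    print(carbon_line(path, 42, 1700000000), end="")
```

Swallowing network errors in send_lines mirrors the goal stated above that an unreachable carbon server should never get in the way of the sender.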