Intercepting queries with Lua

To get a quick start, we have supplied a sample script that showcases all functionality described below.

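Queries can be intercepted in many places:

  • before any packet parsing begins (ipfilter())
  • before the packet cache has been looked up (gettag() and its FFI counterpart, gettag_ffi())
  • before any filtering policy has been applied (prerpz())
  • before the resolving logic starts to work (preresolve())
  • after the resolving process failed to find a correct answer for a domain (nodata(), nxdomain())
  • after the whole process is done and an answer is ready for the client (postresolve() and its FFI counterpart, postresolve_ffi())
  • before an outgoing query is made to an authoritative server (preoutquery())
  • after a filtering policy hit has occurred (policyEventFilter())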

Writing Lua PowerDNS Recursor scripts

Addresses and DNS Names are not passed as strings but as native objects. This allows for easy checking against Netmasks and domain sets. It also means that to print such names, the :toString method must be used (or even :toStringWithPort for addresses).
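
A minimal sketch of what this means in practice. It assumes the script is loaded by the Recursor, since newCA(), newDN(), newNMG() and pdnslog() are only available in its Lua environment. The addresses and names are illustrative values, not taken from any real setup.

```lua
-- Illustrative values only; these constructors exist only inside the Recursor.
local client = newCA("192.0.2.10")
local name = newDN("www.example.com")

local internal = newNMG()
internal:addMask("192.0.2.0/24")

-- Native objects must be converted before they can be concatenated or printed.
pdnslog("client: "..client:toString())
pdnslog("client with port: "..client:toStringWithPort())
pdnslog("name: "..name:toString())

-- Being native objects, they can be checked against netmasks and suffixes directly.
if internal:match(client) and name:isPartOf(newDN("example.com")) then
  pdnslog("internal client asking about a name under example.com")
end
```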

Once a script is loaded, PowerDNS looks for the interception functions in the loaded script. All of these functions are optional.

If ipfilter returns true, the query is dropped. If preresolve returns true, it will indicate it handled a query, and the recursor will send the result as constructed in the function to the client. If it returns false, the Recursor will continue processing. For the other functions, the return value will indicate that an alteration to the result has been made. In that case the potentially changed rcode, records and policy will be processed and DNSSEC validation will be automatically disabled since the content might not be genuine anymore. At specific points the Recursor will check if policy handling should take place. These points are immediately after preresolve, after resolving and after nxdomain, nodata and postresolve.

Interception Functions

ipfilter(remoteip, localip, dh) → bool

This hook gets queried immediately after consulting the packet cache, but before parsing the DNS packet. If this hook returns anything other than false, the packet is dropped. However, because this check happens after the packet cache lookup, the IP address might still receive answers that require no packet parsing.

With this hook, undesired traffic can be dropped rapidly before using precious CPU cycles for parsing. As an example, to filter all queries coming from 1.2.3.0/24, or with the AD bit set:

badips = newNMG()
badips:addMask("1.2.3.0/24")

function ipfilter(rem, loc, dh)
    return badips:match(rem) or dh:getAD()
end

This hook does not get the full DNSQuestion object, since filling out the fields would require packet parsing, which is what we are trying to prevent with this function.

Parameters:
  • remoteip (ComboAddress) – The IP(v6) address of the requestor
  • localip (ComboAddress) – The address on which the query arrived.
  • dh (DNSHeader) – The DNS Header of the query.

gettag(remote, ednssubnet, localip, qname, qtype, ednsoptions, tcp, proxyprotocolvalues) → multiple values
gettag(remote, ednssubnet, localip, qname, qtype, ednsoptions, tcp) → int
gettag(remote, ednssubnet, localip, qname, qtype, ednsoptions) → int

Changed in version 4.1.0: The tcp parameter was added.

Changed in version 4.4.0: The proxyprotocolvalues parameter was added.

The gettag() function is invoked when Recursor attempts to discover in which packetcache an answer is available.

This function must return an unsigned 32-bit integer, which is the tag number of the packetcache. The tag is used to partition the packet cache. The default tag (when gettag() is not defined) is zero. If gettag() throws an exception, the zero tag is used.

In addition to the tag, this function can return a table of policy tags and a few more values to be passed to the resolving process. The resulting tag number can be accessed via dq.tag in the preresolve() hook, and the policy tags via dq:getPolicyTags() in every hook.

New in version 4.1.0: It can also return a table whose keys and values are strings to fill the DNSQuestion.data table, as well as a requestorId value to fill the DNSQuestion.requestorId field and a deviceId value to fill the DNSQuestion.deviceId field.

New in version 4.3.0: In addition to the deviceId value, a deviceName value can be returned to fill the DNSQuestion.deviceName field.

New in version 4.4.0: A routingTag can be returned, which is used as an extra name to identify records in the record cache. If a routing tag is set and a record would be stored with an EDNS subnet mask in the record cache, it will be stored with the tag instead. New requests using the same tag will be served by the record in the record cache, avoiding queries to authoritative servers.

The tagged packetcache can be used, for example, to answer queries from cache that have been filtered for certain IPs (this logic should be implemented in gettag()). This ensures that queries are answered quickly compared to setting dq.variable to true. In the latter case, repeated queries will not be found in the packetcache and will pass through the entire resolving process, and all relevant Lua hooks will be called.

Parameters:
  • remote (ComboAddress) – The sender’s IP address
  • ednssubnet (Netmask) – The EDNS Client subnet that was extracted from the packet
  • localip (ComboAddress) – The IP address the query was received on
  • qname (DNSName) – The domain name the query is for
  • qtype (int) – The query type of the query
  • ednsoptions – A table whose keys are EDNS option codes and values are EDNSOptionView objects. This table is empty unless the gettag-needs-edns-options option is set.
  • tcp (bool) – Added in 4.1.0, a boolean indicating whether the query was received over UDP (false) or TCP (true).
  • proxyprotocolvalues – Added in 4.4.0, a table of ProxyProtocolValue objects representing the Type-Length Values received via the Proxy Protocol, if any.

Returns:
  tag [, policyTags [, data [, reqId [, deviceId [, deviceName [, routingTag ]]]]]]
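
As a sketch of how the tag and policy tags fit together, the following partitions the packet cache for one client netblock. The netblock, the tag number 1 and the policy tag name "filtered-client" are hypothetical values chosen for illustration.

```lua
-- Hypothetical netblock whose answers are filtered and should be cached separately.
local filtered = newNMG()
filtered:addMask("192.0.2.0/24")

function gettag(remote, ednssubnet, localip, qname, qtype, ednsoptions, tcp, proxyprotocolvalues)
  if filtered:match(remote) then
    -- First value: the packet cache tag. Second value: a table of policy tags.
    return 1, {"filtered-client"}
  end
  -- Zero is also the tag used when gettag() is not defined.
  return 0
end

function preresolve(dq)
  -- The tag chosen in gettag() is visible here as dq.tag.
  if dq.tag == 1 then
    pdnslog("query from a filtered client: "..dq.qname:toString())
  end
  return false
end
```

Because answers for tag 1 live in their own cache partition, filtered and unfiltered clients never see each other's cached responses, and there is no need to set dq.variable.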

gettag_ffi(param) → optional Lua object

New in version 4.1.2.

Changed in version 4.3.0: The ability to craft answers was added.

This function is the FFI counterpart of the gettag() function, and offers the same functionality. It accepts a single parameter which can be accessed and modified using FFI accessors.

Like the non-FFI version, it has the ability to set a tag for the packetcache, policy tags, a routing tag, the DNSQuestion.requestorId and DNSQuestion.deviceId values and to fill the DNSQuestion.data table. It also offers ways to mark the answer as variable so it’s not inserted into the packetcache, to set a cap on the TTL of the returned records, and to generate a response by adding records and setting the RCode. It can also instruct the recursor to do a proper resolution in order to follow any CNAME records added in this step.

If this function does not set the tag or an exception is thrown, the zero tag is assumed.
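
A minimal sketch of an FFI hook that only sets the tag. It assumes a Recursor built against LuaJIT (the ffi module is not part of stock Lua). The two accessor names, pdns_ffi_param_get_qname and pdns_ffi_param_set_tag, and their declarations are written from memory and should be treated as assumptions to verify against the Lua FFI API reference of the version in use. Some versions may load the declarations automatically, in which case the ffi.cdef block is unnecessary. The queried name is illustrative.

```lua
local ffi = require("ffi")

-- Assumed declarations; check them against the Lua FFI API reference.
ffi.cdef[[
  typedef struct pdns_ffi_param pdns_ffi_param_t;
  const char* pdns_ffi_param_get_qname(pdns_ffi_param_t* ref);
  void pdns_ffi_param_set_tag(pdns_ffi_param_t* ref, unsigned int tag);
]]

function gettag_ffi(obj)
  -- The name arrives as a C string; convert it and normalise the case.
  local qname = string.lower(ffi.string(ffi.C.pdns_ffi_param_get_qname(obj)))
  -- Tolerate an optional trailing dot rather than assume one form.
  if qname:match("^tagged%.example%.com%.?$") then
    ffi.C.pdns_ffi_param_set_tag(obj, 1)
  end
  -- When no tag is set, the zero tag is assumed.
end
```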

prerpz(dq) → bool

This hook is called before any filtering policies have been applied, making it possible to completely disable filtering by setting dq.wantsRPZ to false. Using the dq:discardPolicy() function, it is also possible to selectively disable one or more filtering policies, for example RPZ zones, based on the content of the dq object. Currently, the return value of this function is ignored.

As an example, to disable the “malware” policy for example.com queries:

function prerpz(dq)
  -- disable the RPZ policy named 'malware' for example.com
  if dq.qname:equal('example.com') then
    dq:discardPolicy('malware')
  end
  return false
end
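
Disabling filtering altogether works along the same lines. The sketch below turns off all filtering policies for one client netblock by setting dq.wantsRPZ; the netblock is a hypothetical value.

```lua
-- Hypothetical netblock of clients exempt from all filtering.
local unfiltered = newNMG()
unfiltered:addMask("198.51.100.0/24")

function prerpz(dq)
  if unfiltered:match(dq.remoteaddr) then
    dq.wantsRPZ = false  -- skip every filtering policy for this query
  end
  return false           -- the return value is currently ignored
end
```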
Parameters:
  • dq (DNSQuestion) – The DNS question to handle

preresolve(dq) → bool

This function is called before any DNS resolution is attempted, and if this function indicates it, it can supply a direct answer to the DNS query, overriding the internet. This is useful to combat botnets, or to disable domains unacceptable to an organization for whatever reason.
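
A minimal sketch of such an override, answering NXDOMAIN for a set of unwanted domains without any resolution taking place. The domain in the set is a hypothetical placeholder.

```lua
-- Hypothetical set of domains this organization refuses to resolve.
local blocked = newDS()
blocked:add("blocked.example")

function preresolve(dq)
  if blocked:check(dq.qname) then
    dq.rcode = pdns.NXDOMAIN  -- answer directly; no resolution is attempted
    return true               -- the query has been handled here
  end
  return false                -- let the Recursor resolve normally
end
```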

Parameters:
  • dq (DNSQuestion) – The DNS question to handle

postresolve(dq) → bool

This hook is called right before returning a response to a client (and, unless dq.variable is set, to the packet cache too). It allows inspection and modification of almost any detail in the return packet.
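
As one illustration of modifying the return packet, the sketch below caps the TTL of every record in the answer. The cap of 300 seconds is an arbitrary, hypothetical value.

```lua
local maxTTL = 300  -- hypothetical cap, in seconds

function postresolve(dq)
  local records = dq:getRecords()
  local changed = false
  for _, rec in pairs(records) do
    if rec.ttl > maxTTL then
      rec.ttl = maxTTL
      changed = true
    end
  end
  if changed then
    dq:setRecords(records)  -- write the modified records back
  end
  return changed            -- true only when the answer was altered
end
```

Returning true only when something changed matters here, because a true return value tells the Recursor the result was altered, which disables DNSSEC validation for that answer.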

Parameters:
  • dq (DNSQuestion) – The DNS question to handle

postresolve_ffi(handle) → bool

New in version 4.7.0.

This is the FFI counterpart of postresolve(). It accepts a single parameter which can be passed to the functions listed in Lua FFI API. The accessor functions retrieve and modify various aspects of the answer returned to the client.

nxdomain(dq) → bool

This hook is called after the DNS resolution process has run its course, but ended in an ‘NXDOMAIN’ situation, indicating that the domain does not exist. It works entirely like postresolve(), but saves a trip through Lua for answers which are not NXDOMAIN.

Parameters:
  • dq (DNSQuestion) – The DNS question to handle

nodata(dq) → bool

This hook is just like nxdomain(), except it gets called when a domain exists, but the requested type does not. This is where one would implement DNS64.

Parameters:
  • dq (DNSQuestion) – The DNS question to handle

preoutquery(dq) → bool

This hook is not called in response to a client packet, but fires when the Recursor wants to talk to an authoritative server.

When this hook sets the special result code -3, the whole DNS client query causing this outgoing query gets a ServFail.

However, this function can also return records like preresolve().

Parameters:
  • dq (DNSQuestion) – The DNS question to handle.

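In the case of preoutquery(), only a few attributes of the dq object are filled in, including:

  • dq.remoteaddr – the address of the authoritative server about to be queried
  • dq.localaddr – the address of the requestor
  • dq.qname – the name being asked
  • dq.qtype – the type being asked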

Do not rely on other attributes having a value and do not call any method of the dq object apart from the record set manipulation methods.

policyEventFilter(event) → bool

New in version 4.4.0.

This hook is called when a filtering policy has been hit, before the decision has been applied, making it possible to change a policy decision by altering its content or to skip it entirely. Using the event:discardPolicy() function, it is also possible to selectively disable one or more filtering policies, for example RPZ zones. The return value indicates whether the policy hit should be completely ignored (true) or applied (false), possibly after editing the action to take in that latter case (see Modifying Policy Decisions below). When true is returned, the resolution process will resume as if the policy hit never took place.

Parameters:
  • event (PolicyEvent) – The event to handle

As an example, to ignore the result of a policy hit for the example.com domain:

function policyEventFilter(event)
  if event.qname:equal("example.com") then
    -- ignore that policy hit
    return true
  end
  return false
end

To alter the decision of the policy hit instead:

function policyEventFilter(event)
  if event.qname:equal("example.com") then
    -- replace the decision with a custom CNAME
    event.appliedPolicy.policyKind = pdns.policykinds.Custom
    event.appliedPolicy.policyCustom = "example.net"
    -- returning false so that the hit is not ignored
    return false
  end
  return false
end

Callback Semantics

The functions which modify or influence the query flow should all return true when they have performed an action which alters the rcode, result or applied policy. When a function returns false, the nameserver will process the query normally until a new function is called.

ipfilter() and preresolve() callbacks must return true if they have taken over the query and wish that the nameserver should not proceed with processing.

If a function has taken over a request, it can set an rcode (usually 0), and specify a table with records to be put in the answer section of a packet. An interesting rcode is NXDOMAIN (3, or pdns.NXDOMAIN), which specifies the non-existence of a domain. Instead of setting an rcode and records, it can also set fields in the applied policy to influence further processing.

The ipfilter() and preoutquery() hooks are different, in that ipfilter() can only return a true or false value, and that preoutquery() can also set rcode -3 to signify that the whole query should be terminated.

The policyEventFilter() has a different meaning as well, where returning true means that the policy hit should be ignored and normal processing should be resumed.

A minimal sample script:

function nxdomain(dq)
    print("Intercepting NXDOMAIN for: ",dq.qname:toString())
    if dq.qtype == pdns.A
    then
        dq.rcode=0 -- make it a normal answer
        dq:addAnswer(pdns.A, "192.168.1.1")
        return true
    end
    return false
end

Warning: Please do NOT use the above sample script in production! Responsible NXDomain redirection requires more attention to detail.

Useful rcodes include 0 or pdns.NOERROR for no error and pdns.NXDOMAIN for NXDOMAIN. Before 4.4.0, pdns.DROP could also be used to drop the question without any further processing. Such a drop is accounted in the policy-drops metric.

Starting with recursor 4.4.0, the method to drop a request is to set the dq.appliedPolicy.policyKind to the value pdns.policykinds.Drop.

function nxdomain(dq)
    print("Intercepting and dropping NXDOMAIN for: ",dq.qname:toString())
    if dq.qtype == pdns.A
    then
        dq.appliedPolicy.policyKind = pdns.policykinds.Drop
    end
    return false
end

Note: to drop a query, set policyKind and return false, so that the Recursor processes the Drop action.

DNS64

The getFakeAAAARecords and getFakePTRRecords followupFunctions can be used to implement DNS64. See DNS64 support for more information.

To get fake AAAA records for DNS64 usage, set dq.followupFunction to getFakeAAAARecords, dq.followupPrefix to e.g. “64:ff9b::” and dq.followupName to the name you want to synthesize an IPv6 address for.

For fake reverse (PTR) records, set dq.followupFunction to getFakePTRRecords and set dq.followupName to the name to look up and dq.followupPrefix to the same prefix as used with getFakeAAAARecords.
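
A minimal sketch of the AAAA half, placed in the nodata() hook so that it only fires when a name exists but has no AAAA record. It uses the “64:ff9b::” prefix mentioned above; a deployment may need a different prefix.

```lua
function nodata(dq)
  if dq.qtype == pdns.AAAA then
    dq.followupFunction = "getFakeAAAARecords"
    dq.followupName = dq.qname        -- name to synthesize an IPv6 address for
    dq.followupPrefix = "64:ff9b::"   -- prefix the IPv4 address is embedded in
    return true
  end
  return false
end
```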

Follow up actions

When modifying queries, the Recursor may need to do some extra work after the function returns. The dq.followupFunction can be set in this case.

CNAME chain resolution

It may be useful to return a CNAME record from Lua, and then have the PowerDNS Recursor continue resolving that CNAME. This can be achieved by setting dq.followupFunction to followCNAMERecords and dq.followupDomain to “www.powerdns.com”. PowerDNS will do the rest.
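
A minimal sketch of this in a preresolve() hook. It follows the pattern used in the Example Script in this document, which adds the CNAME with dq:addAnswer() and then sets dq.followupFunction, rather than setting dq.followupDomain. The queried name old.example.com is a hypothetical placeholder.

```lua
function preresolve(dq)
  if dq.qname:equal("old.example.com") then
    dq:addAnswer(pdns.CNAME, "www.powerdns.com.")
    dq.rcode = 0
    -- Ask the Recursor to resolve the CNAME target and append the result.
    dq.followupFunction = "followCNAMERecords"
    return true
  end
  return false
end
```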

UDP Query Response

The udpQueryResponse dq.followupFunction allows you to query a simple key-value store over UDP asynchronously.

Several dq variables can be set:

  • dq.udpQueryDest: destination IP address to send the UDP packet to
  • dq.udpQuery: The content of the UDP payload
  • dq.udpCallback: The name of the callback function that is called when an answer is received

The callback function must accept the dq object and can find the response to the UDP query in dq.udpAnswer.

In this callback function, dq.followupFunction can be set again to any of the available functions for further processing.

This example script queries a simple key/value store over UDP to decide on whether or not to filter a query:

--[[
This implements a two-step domain filtering solution where the status of an IP address
and a domain name need to be looked up.
To do so, we use the udpQueryResponse followup function, which generically allows us to do asynchronous
lookups via UDP.
Such lookups can be slow, but they won't block PowerDNS while we wait for them.

To benefit from this hook,
..

To test, use the 'kvresp' example program provided.
--]]

function preresolve (dq)
    pdnslog("preresolve handler called for: "..dq.remoteaddr:toString()..", local: "..dq.localaddr:toString()..", "..dq.qname:toString()..", "..dq.qtype)
    dq.followupFunction="udpQueryResponse"
    dq.udpCallback="gotdomaindetails"
    dq.udpQueryDest=newCA("127.0.0.1:5555")
    dq.udpQuery = "DOMAIN "..dq.qname:toString()
    return true;
end

function gotdomaindetails(dq)
    pdnslog("gotdomaindetails called, got: "..dq.udpAnswer)

    if(dq.udpAnswer == "0")
    then
        pdnslog("This domain needs no filtering, not looking up this domain")
        dq.followupFunction=""
        return false
    end
    pdnslog("Domain might need filtering for some users")
    dq.variable = true -- disable packet cache

    local data={}
    data["domaindetails"]= dq.udpAnswer
    dq.data=data
    dq.udpQuery="IP "..dq.remoteaddr:toString()
    dq.udpCallback="gotipdetails"
    pdnslog("returning true in gotdomaindetails")
    return true
end

function gotipdetails(dq)
    dq.followupFunction=""
    pdnslog("So status of IP is "..dq.udpAnswer.." and status of domain is "..dq.data.domaindetails)

    if(dq.data.domaindetails=="1" and dq.udpAnswer=="1")
    then
        pdnslog("IP wants filtering and domain is of the filtered kind")
        dq:addAnswer(pdns.CNAME, "blocked.powerdns.com")
        return true
    else
        pdnslog("Returning false (normal resolution should proceed, for this user)")
        return false
    end
end

Example Script

pdnslog("pdns-recursor Lua script starting!", pdns.loglevels.Warning)

blockset = newDS()
blockset:add{"powerdns.org", "xxx"}

dropset = newDS()
dropset:add("123.cn")

malwareset = newDS()
malwareset:add("nl")

magic2 = newDN("www.magic2.com")

magicMetric = getMetric("magic")

badips = newNMG()
badips:addMask("127.1.0.0/16")

-- this check is applied before any packet parsing is done
function ipfilter(rem, loc, dh)
  pdnslog("ipfilter called, rem: "..rem:toStringWithPort().." loc: "..loc:toStringWithPort().." match:"..tostring(badips:match(rem)))
  pdnslog("id: "..dh:getID().." aa: "..tostring(dh:getAA()).." ad: "..tostring(dh:getAD()).." arcount: "..dh:getARCOUNT())
  pdnslog("ports: "..rem:getPort().." "..loc:getPort())
  return badips:match(rem)
end

-- shows the various ways of blocking, dropping, changing questions
-- return false to say you did not take over the question, but we'll still listen to 'variable'
-- to selectively disable the cache
function preresolve(dq)
  pdnslog("Got question for "..dq.qname:toString().." from "..dq.remoteaddr:toString().." to "..dq.localaddr:toString())

  local ednssubnet = dq:getEDNSSubnet()
  if ednssubnet then
    pdnslog("Packet EDNS subnet source: "..ednssubnet:toString()..", "..ednssubnet:getNetwork():toString())
  end

  local a = dq:getEDNSOption(3)
  if a then
    pdnslog("There is an EDNS option 3 present: "..a)
  end

  loc = newCA("127.0.0.1")
  if dq.remoteaddr:equal(loc) then
    pdnslog("Query from loopback")
  end

  -- note that the comparisons below are CaSe InSensiTivE and you don't have to worry about trailing dots
  if dq.qname:equal("magic.com") then
    magicMetric:inc()
    pdnslog("Magic!")
  else
    pdnslog("not magic..")
  end

  if dq.qname == magic2 then
    pdnslog("Faster magic") -- compares against existing DNSName
  end

  if blockset:check(dq.qname) then
    dq.variable = true      -- disable packet cache in any case
    if dq.qtype == pdns.A then
      dq:addAnswer(pdns.A, "1.2.3.4")
      dq:addAnswer(pdns.TXT, "\"Hello!\"", 3601) -- ttl
      return true
    end
  end

  if dropset:check(dq.qname) then
   pdnslog("dropping query")
   dq.appliedPolicy.policyKind = pdns.policykinds.Drop
   return false -- recursor still needs to handle the policy
  end

  if malwareset:check(dq.qname) then
    dq:addAnswer(pdns.CNAME, "blog.powerdns.com.")
    dq.rcode = 0
    dq.followupFunction = "followCNAMERecords"    -- this makes PowerDNS lookup your CNAME
    return true
  end

  return false
end

-- this implements DNS64

function nodata(dq)
  if dq.qtype == pdns.AAAA then
    dq.followupFunction = "getFakeAAAARecords"
    dq.followupName = dq.qname
    dq.followupPrefix="fe80::"
    return true
  end

  if dq.qtype == pdns.PTR then
    dq.followupFunction = "getFakePTRRecords"
    dq.followupName = dq.qname
    dq.followupPrefix = "fe80::"
    return true
  end
  return false
end

-- postresolve runs after resolution has finished, right before the answer is sent,
-- and can be used to change things or still drop
function postresolve(dq)
  pdnslog("postresolve called for "..dq.qname:toString())
  local records = dq:getRecords()
  for k,v in pairs(records) do
    pdnslog(k.." "..v.name:toString().." "..v:getContent())
    if v.type == pdns.A and v:getContent() == "185.31.17.73" then
      pdnslog("Changing content!")
      v:changeContent("130.161.252.29")
      v.ttl = 1
    end
  end
  dq:setRecords(records)
  return true
end

nxdomainsuffix = newDN("com")

function nxdomain(dq)
  pdnslog("nxdomain called for: "..dq.qname:toString())
  if dq.qname:isPartOf(nxdomainsuffix) then
    dq.rcode = 0 -- make it a normal answer
    dq:addAnswer(pdns.CNAME, "ourhelpfulservice.com")
    dq:addAnswer(pdns.A, "1.2.3.4", 60, "ourhelpfulservice.com")
    return true
  end
  return false
end

Dropping all traffic from botnet-infected users

Frequently, DoS attacks are performed where specific IP addresses are attacked, often by queries coming in from open resolvers. These queries then lead to a lot of queries to ‘authoritative servers’ which actually often aren’t nameservers at all, but just targets of attack.

This specific script is, as of January 2015, useful to prevent ezdns.it related traffic from creating CPU load. This script requires PowerDNS Recursor 4.x or later.

lethalgroup=newNMG()
lethalgroup:addMask("192.121.121.0/24") -- touch these nameservers and original query gets dropped

function preoutquery(dq)
    print("pdns wants to ask "..dq.remoteaddr:toString().." about "..dq.qname:toString().." "..dq.qtype.." on behalf of requestor "..dq.localaddr:toString())
    if(lethalgroup:match(dq.remoteaddr))
    then
        print("We matched the group "..lethalgroup:tostring().."! killing query dead from requestor "..dq.localaddr:toString())
        dq.rcode = -3 -- "kill"
        return true
    end
    return false
end

Modifying Policy Decisions

The PowerDNS Recursor has a policy engine based on Response Policy Zones (RPZ). Starting with version 4.0.1 of the recursor, it is possible to alter this decision inside the Lua hooks.

If the decision is modified in a Lua hook, false should be returned, as the query is not actually handled by Lua so the decision is picked up by the Recursor.

Before 4.4.0, the result of the policy decision is checked after preresolve() and postresolve(). Beginning with version 4.4.0, the policy decision is checked after preresolve() and any policyEventFilter() call instead.

For example, if a decision is set to pdns.policykinds.NODATA by the policy engine and is unchanged in preresolve(), the query is replied to with a NODATA response immediately after preresolve().

Example script

-- This script demonstrates modifying policies for versions before 4.4.0.
-- Starting with 4.4.0, it is preferred to use a policyEventFilter.
-- Don't ever block my own domain and IPs
myDomain = newDN("example.com")

myNetblock = newNMG()
myNetblock:addMasks({"192.0.2.0/24"})

function preresolve(dq)
  if dq.qname:isPartOf(myDomain) and dq.appliedPolicy.policyKind ~= pdns.policykinds.NoAction then
    pdnslog("Not blocking our own domain!")
    dq.appliedPolicy.policyKind = pdns.policykinds.NoAction
  end
  return false
end

function postresolve(dq)
  if dq.appliedPolicy.policyKind ~= pdns.policykinds.NoAction then
    local records = dq:getRecords()
    for k,v in pairs(records) do
      if v.type == pdns.A then
        local blockedIP = newCA(v:getContent())
        if myNetblock:match(blockedIP) then
          pdnslog("Not blocking our IP space")
          dq.appliedPolicy.policyKind = pdns.policykinds.NoAction
        end
      end
    end
  end
  return false
end

SNMP Traps

PowerDNS Recursor, when compiled with SNMP support, has the ability to act as an SNMP agent to provide SNMP statistics and to be able to send traps from Lua.

For example, to send a custom SNMP trap containing the qname from the preresolve hook:

function preresolve(dq)
  sendCustomSNMPTrap('Trap from preresolve, qname is '..dq.qname:toString())
  return false
end

Maintenance callback

Starting with version 4.2.0 of the recursor, it is possible to define a maintenance() callback function that will be called periodically. This function expects no argument and doesn’t return any value.

function maintenance()
    -- This would be called every second
    -- Perform your maintenance here
end

The interval can be configured through the lua-maintenance-interval setting.