DNS, security and key ceremonies, oh my! – Part 2

In my last post, I gave a pretty detailed overview of DNS itself and touched on its weaknesses in terms of authenticity and privacy. This time, I’ll go into detail on DNS Security Extensions (DNSSEC) and how it allows clients to establish a chain of authenticity back to an authoritative DNS server, preventing tampering by intermediate parties.

Why is DNSSEC important?

As I mentioned in my last post, DNS by itself is completely unencrypted and unauthenticated. This means that any networks between you and authoritative DNS servers can alter DNS responses, which can facilitate phishing and intercepting basically all traffic headed to a domain. By establishing an authentication chain, DNSSEC eliminates the ability of any intermediate party to alter DNS answers.

DNSSEC Overview

You’ll remember that zones in the DNS hierarchy are denoted by NS records in the parent zone that point to the child zone’s DNS servers. DNSSEC adds some additional records both around these zone boundaries and within the zone itself to build the chain of authenticity. They quite heavily involve public-private key cryptography and digital signatures, so if you need a primer on those, then I’d recommend this Computerphile video on public key cryptography by Rob Miles, and this one on digital signatures by Mike Pound.

Most zones using DNSSEC utilise two keypairs called the Key Signing Key (KSK) and Zone Signing Key (ZSK). It is possible to use one keypair; however, using two makes it easier to change keys over time.

When using two keypairs, all the RRsets in a zone are signed by the ZSK, then the ZSK is signed by the KSK, and a hash of the KSK is then added to the parent zone. An RRset (resource record set) is simply the group of records for a given name and type.

When a KSK’s hash is added to its parent zone, it is in turn signed by the parent’s ZSK, which is signed by the parent’s KSK, and so on up the DNS hierarchy. Thinking about this can be a bit confusing but luckily DNSViz is a great tool for visualising the DNS and DNSSEC hierarchies together.

In order to make sense of the DNSViz diagram, you’ll need to know some of the DNSSEC record types:

  • DNSKEY: These records exist at the top of a zone and store the public part of the KSK and ZSK for that zone. Depending on which domain you’re looking at and when, you might see more than just two DNSKEYs, which usually indicates that the keys are in the process of being changed.

    In DNSViz, KSKs are shown with a grey background.

  • RRSIG: The most fundamental DNSSEC record, RRSIGs hold a signature of an RRset generated with the private key of the KSK or ZSK. Every authoritative RRset[1] in a DNSSEC-signed zone has an RRSIG, normally from the ZSK, except for the DNSKEY RRset, which is signed by the KSK.

    RRSIGs also include an inception and expiration timestamp, outside of which they’re invalid. This guards against long-term replay attacks.

    RRSIGs are represented in DNSViz by the arrow between a DNSKEY and RRset.

  • DS: Short for Delegation Signer, these records exist in a parent zone and store a hash of the KSK of a child zone. They have an RRSIG from the parent’s ZSK just like any other record in the zone, and therefore bridge the authenticity chain to the child zone.

    Multiple DS records can exist for a given child zone, either using different hashing algorithms, or representing multiple KSKs in the child zone, for example when the child zone is changing its KSK.
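
To make the RRSIG validity window a bit more concrete, here's a minimal sketch of the time check a validator performs before it even looks at the signature. The function names are my own, the timestamps are made up, and it assumes plain integer comparison (real validators also have to handle the 32-bit wraparound of these fields).

```python
from datetime import datetime, timezone


def parse_rrsig_time(text: str) -> int:
    """Convert the YYYYMMDDHHmmSS form shown in zone files to epoch seconds (UTC)."""
    parsed = datetime.strptime(text, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def rrsig_time_valid(inception: int, expiration: int, now: int) -> bool:
    """Return True if `now` falls inside the signature's validity window.

    Assumption: 32-bit wraparound of the timestamp fields is ignored here.
    """
    return inception <= now <= expiration


# Hypothetical two-week signature lifetime.
inception = parse_rrsig_time("20240101000000")
expiration = parse_rrsig_time("20240115000000")

# A response replayed after the expiration time is rejected, even though
# the signature itself is still cryptographically correct.
replayed_at = parse_rrsig_time("20240201000000")
print(rrsig_time_valid(inception, expiration, replayed_at))
```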

Setting up and maintaining all these records manually would be nearly impossible, so modern DNS server software will generally auto-generate them and keep it all up to date for you, and you just have to ask the parent zone to add the appropriate DS record. This is usually done out-of-band, such as through your domain registrar’s website. There are in-band DNS mechanisms to do this, but they’re not very widely supported.
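
If you're curious what actually goes into the DS record you hand to the parent, here's a rough sketch of how it can be derived from a DNSKEY: a key tag (a simple checksum over the DNSKEY data) plus a hash over the owner name and the DNSKEY data. The key bytes below are fake placeholders rather than a real public key, and the helper names are my own. In practice your DNS software or registrar does this for you.

```python
import hashlib
import struct


def name_to_wire(name: str) -> bytes:
    """Encode a domain name as lowercase, length-prefixed labels ending in a zero byte."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    return b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels) + b"\x00"


def dnskey_rdata(flags: int, protocol: int, algorithm: int, public_key: bytes) -> bytes:
    """Pack the DNSKEY fields: 16-bit flags, 8-bit protocol, 8-bit algorithm, then the key."""
    return struct.pack("!HBB", flags, protocol, algorithm) + public_key


def key_tag(rdata: bytes) -> int:
    """Checksum-style key tag: sum the data as 16-bit words and fold in the carry."""
    acc = 0
    for index, byte in enumerate(rdata):
        acc += byte if index & 1 else byte << 8
    acc += (acc >> 16) & 0xFFFF
    return acc & 0xFFFF


def ds_digest_sha256(owner: str, rdata: bytes) -> str:
    """SHA-256 digest (DS digest type 2) over owner name + DNSKEY data."""
    return hashlib.sha256(name_to_wire(owner) + rdata).hexdigest().upper()


# Placeholder key material: flags 257 marks a KSK, algorithm 13 is ECDSA P-256.
fake_public_key = bytes(range(64))
rdata = dnskey_rdata(257, 3, 13, fake_public_key)

print(f"example.com. IN DS {key_tag(rdata)} 13 2 {ds_digest_sha256('example.com.', rdata)}")
```

Because the DS only holds a hash, the parent never needs the child's full key, and a resolver can check that the DNSKEY it fetched from the child matches what the parent vouched for.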

DNSSEC Hierarchy

If you’ve looked at DNSViz, then you’ll have seen that the DNSSEC hierarchy ends much like the DNS hierarchy: in the root zone. The very top is the root KSK, the first of which was generated in mid 2010 and used until late 2018, when it was replaced by the current root KSK. The public component of the root KSK is, like the root server list, pretty much hard-coded into every single DNSSEC-validating resolver on the Internet, so it’s only changed very rarely.

The next step down is the root ZSK. Verisign, the root zone maintainer, generates the root ZSKs, with each one to be used for three months. However, in order for DNS servers around the world to trust these keys, they of course need to be signed by the root KSK, which is maintained by ICANN through IANA.

The private part of the root KSK is stored in two redundant and extremely secure facilities in the US. It’s accessed during root key ceremonies (which are every quarter in non-COVID times) to generate a series of signatures authenticating the next root ZSK, valid for about three weeks each. I’ll expand on the root key ceremonies in part 3.

Authenticated non-existence

Here’s an issue you might not have thought about: we can authenticate DNS records that do exist through signatures, but what do we do about DNS records that don’t exist?

For say, the A records of a website, authenticating an empty answer doesn’t really do a whole lot besides maybe allowing the showing of a different error message. If an intermediate party has removed a record from a DNS answer, then we can’t get it back through authentication, and if they wanted to stop you from accessing a particular site, they could just block its DNS outright. In this situation, DoH or DoT could be a more useful fix.

The more important use of authenticated non-existence is at zone boundaries. When resolving, the lack of a DS record in a parent zone is taken to mean that the child does not have DNSSEC enabled. In this case, not authenticating the missing DS would mean that an intermediate party could have removed it from the answer, which would cause the resolver to believe that the child did not use DNSSEC. This bypasses the chain of authenticity, allowing the intermediate party to completely forge the child zone even though it had DNSSEC enabled.

To facilitate authenticated non-existence, there are two more record types: NSEC (next secure) and NSEC3 (next secure v3). I’m not going to get too far into the differences between them here as it’s not really necessary to understanding how they work, which is wonderfully simple at its core: every name that does exist in a zone is given an NSEC record, which contains the next name in the zone when sorted in DNS’s canonical order (roughly alphabetical), and also a list of the record types that exist for the current name.

This link to the next name essentially builds a loop around a zone: if you request a name that doesn’t exist, you’ll get an NSEC record (and its RRSIG) guaranteeing that the name doesn’t exist by essentially stating “nothing exists between these two names, and the name you wanted falls in that gap”.

If you request a name that does exist, but not with the type you requested (like the missing DS in the delegation to a non-DNSSEC-enabled child), you’ll also get an NSEC and RRSIG stating “you wanted a DS record, but I only have an NS record”. You can see this in DNSViz for a zone that doesn’t have DNSSEC. In this case, the child zones can be said to be covered by the NSEC, meaning they’re not expected to have DNSSEC enabled.
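
Both kinds of NSEC proof boil down to fairly small checks, sketched below. This assumes the RRSIG on the NSEC has already been verified, uses made-up names under example.com, and orders names label by label from the right, which is how DNS sorts them. The function names are my own.

```python
def canonical_key(name: str) -> tuple:
    """Sort key for DNS ordering: lowercase labels compared right to left."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    return tuple(label.encode("ascii") for label in reversed(labels))


def nsec_covers(owner: str, next_name: str, qname: str) -> bool:
    """True if qname falls in the gap between owner and next_name, so it cannot exist."""
    o, n, q = canonical_key(owner), canonical_key(next_name), canonical_key(qname)
    if o < n:
        return o < q < n
    # The last NSEC in the zone points back to the start, closing the loop.
    return q > o or q < n


def nsec_proves_no_type(owner: str, types: set, qname: str, qtype: str) -> bool:
    """True if the NSEC is for qname itself and qtype is missing from its type list."""
    return canonical_key(owner) == canonical_key(qname) and qtype not in types


# "Nothing exists between blog and www, and mail falls in that gap."
print(nsec_covers("blog.example.com", "www.example.com", "mail.example.com"))

# "You wanted a DS record, but I only have an NS record": an unsigned delegation.
print(nsec_proves_no_type("child.example.com", {"NS", "RRSIG", "NSEC"},
                          "child.example.com", "DS"))
```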

You might have noticed that forming this NSEC loop makes it very easy to find the names that exist in a zone – if you request a name that doesn’t exist, the server will literally tell you two that do. This is called zone walking and while DNS is not at all intended to store secret data, there are some cases where being able to discover the whole zone’s contents is not ideal from a privacy perspective.
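
Here's a toy simulation of zone walking to show just how little effort it takes. There's no real DNS traffic here: the zone contents are invented, and the dictionary stands in for asking a server for each name's NSEC record in turn.

```python
def canonical_key(name: str) -> tuple:
    """Sort key for DNS ordering: lowercase labels compared right to left."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    return tuple(label.encode("ascii") for label in reversed(labels))


# Hypothetical zone contents, including a name the owner never advertised.
ZONE_NAMES = [
    "www.example.com",
    "example.com",
    "secret-admin.example.com",
    "blog.example.com",
]


def build_nsec_chain(names: list) -> dict:
    """Map each name to the next one in sorted order, wrapping around at the end."""
    ordered = sorted(names, key=canonical_key)
    return {owner: ordered[(i + 1) % len(ordered)] for i, owner in enumerate(ordered)}


def walk_zone(apex: str, chain: dict) -> list:
    """Follow 'next name' pointers from the apex until the loop closes."""
    found = [apex]
    current = chain[apex]
    while current != apex:
        found.append(current)
        current = chain[current]
    return found


chain = build_nsec_chain(ZONE_NAMES)
for name in walk_zone("example.com", chain):
    print(name)
```

The walker needs one lookup per name in the zone and no guessing at all, which is why this is a concern for zones with names that aren't meant to be easily discovered.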

This is the main benefit of NSEC3 – it uses hashed versions of the names to make it more difficult to walk the zone. If you’re looking for a more detailed breakdown of NSEC and NSEC3, then RFC 7129 is a good place to start.
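
As a rough sketch of what "hashed versions of the names" means: NSEC3 takes the name in wire format, appends a salt, runs it through SHA-1 some number of extra times, and encodes the result in base32hex. The salt and iteration count below are arbitrary values I picked for illustration, and the function names are my own.

```python
import base64
import hashlib

# Translate the standard base32 alphabet into the "extended hex" one NSEC3 uses.
_B32_TO_B32HEX = bytes.maketrans(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567",
    b"0123456789ABCDEFGHIJKLMNOPQRSTUV",
)


def name_to_wire(name: str) -> bytes:
    """Encode a domain name as lowercase, length-prefixed labels ending in a zero byte."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    return b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels) + b"\x00"


def nsec3_hash(name: str, salt_hex: str, iterations: int) -> str:
    """Hash a name NSEC3-style: SHA-1 over name + salt, then `iterations` extra rounds."""
    salt = bytes.fromhex(salt_hex)
    digest = hashlib.sha1(name_to_wire(name) + salt).digest()
    for _ in range(iterations):
        digest = hashlib.sha1(digest + salt).digest()
    return base64.b32encode(digest).translate(_B32_TO_B32HEX).decode("ascii").lower()


# The zone's NSEC3 chain links these hashes together instead of the real names.
for name in ["example.com", "blog.example.com", "www.example.com"]:
    print(nsec3_hash(name, "aabbccdd", 5), name)
```

The chain still forms a loop, just in hash order, so a walker only collects hashes. Those can still be attacked offline by guessing likely names, which is why NSEC3 makes walking harder rather than impossible.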

There are also a few extra approaches to implementing NSEC in environments where the DNS servers are able to generate and sign records on the fly. They allow for complete mitigation of zone walking issues, decreased response sizes, and can reduce database load in environments where DNS records are more fluid than the traditional “zone file” paradigm. If you’re interested, take a look at Cloudflare’s blog article on the techniques they use.

DNSSEC validation

If you simply point a client to a public DNSSEC-validating recursor like 1.1.1.1, 9.9.9.9 or 8.8.8.8, then the recursor does all the DNSSEC validation and the path from the recursor to client is still open to tampering. Using DoH or DoT will fix this (and also give privacy) with the slight caveat that you must trust the recursor’s DNSSEC validation. This is what I do on my home network, as my router is able to act as a stub resolver with DoT.

The solution to the trust issue (if you consider it an issue) is either to run a local DNSSEC-validating recursor (which removes the possibility of using DoH/DoT for privacy) or to run a local DNSSEC-validating stub resolver.

Implementations of validating stub resolvers tend to be somewhat buggy as they’re not widely used, but theoretically a local stub resolver can re-verify DNSSEC on responses it gets from a recursor (optionally over DoH or DoT for privacy).

DNSSEC lookups

The basic sequence of operations for a DNSSEC lookup is the same as I covered for basic DNS in part 1. However, all outgoing queries from the recursor will have the EDNS DO (“DNSSEC OK”) flag set, which tells authoritative servers to include DNSSEC records in their answers, and the recursor will of course have to validate the answers before returning an answer to the client.
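
For the curious, here's roughly what setting DO looks like on the wire. This sketch builds a raw query packet with an EDNS OPT pseudo-record attached and the DO bit set. It doesn't send anything, and the query ID and the 1232-byte UDP size are just values I've picked for illustration.

```python
import struct

TYPE_DNSKEY = 48
TYPE_OPT = 41
CLASS_IN = 1
DO_BIT = 0x8000  # top bit of the EDNS flags field


def name_to_wire(name: str) -> bytes:
    """Encode a domain name as lowercase, length-prefixed labels ending in a zero byte."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    return b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels) + b"\x00"


def build_query(qname: str, qtype: int, query_id: int = 0x1234, dnssec_ok: bool = True) -> bytes:
    """Build a non-recursive DNS query with an EDNS OPT record in the additional section."""
    # Header: ID, flags (all zero), 1 question, 0 answers, 0 authority, 1 additional.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 1)
    question = name_to_wire(qname) + struct.pack("!HH", qtype, CLASS_IN)
    edns_flags = DO_BIT if dnssec_ok else 0
    # OPT reuses the record layout: root name, type 41, "class" = UDP payload size,
    # "TTL" = extended rcode (8 bits), EDNS version (8 bits), flags (16 bits), no data.
    opt = b"\x00" + struct.pack("!HHBBHH", TYPE_OPT, 1232, 0, 0, edns_flags, 0)
    return header + question + opt


packet = build_query("example.com", TYPE_DNSKEY)
print(packet.hex())
```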

DNSSEC behaviour differs slightly between DNS server software, so I’m basing my walkthrough here on one of the most popular suites, BIND.

On startup, BIND loads both the list of root servers and the root KSK from files. It then conducts a priming query, where it sends two queries to a root server, one asking for the root DNSKEY records and the other for the root NS records. This means that as long as the local files that BIND loaded are current to within a few years, it’ll be able to bootstrap itself into an up-to-date state.

As long as one root server still exists at the IP address recorded in the local file, BIND will be able to ask it for the addresses of the other roots. The root server IP addresses are only changed very rarely.

Similarly with the root KSK, as long as the key recorded in the local file is still in use, BIND will be able to obtain a new root KSK should one exist. When the KSK is changed, both keys coexist in the root zone for some period of time, with the DNSKEY RRset that contains the new key signed by the existing key to authenticate the transition.

From the priming query, BIND also now has the current root ZSK and is ready to start resolution. BIND will repeat priming queries as needed when their TTL is passed.

As a recursor traverses the DNS hierarchy with the DO option set, it will acquire both the usual NS and other answer records, as well as DS, NSEC and RRSIG records. Once it’s gotten an answer through the recursive process, a recursor will also need to send an additional query to each zone in the current hierarchy for its DNSKEY records.

The recursor now has all the data it needs to validate the answer. There are three verdicts a recursor can reach:

  • SECURE: The recursor was able to build a complete chain from the root KSK to the final answer. It will return the answer to the client who requested it with another DNS option, AD or “Authentic Data”, set.

  • INSECURE: The recursor couldn’t complete the authentication chain, however the requested name does not have DNSSEC enabled – it’s covered by an NSEC at some point in the hierarchy. It will return this answer to the client who requested it without AD set.

  • BOGUS: The recursor couldn’t complete the chain but expected to be able to – there was no NSEC covering the requested name. This is the situation that would occur if the DNS traffic had been tampered with, and so the recursor returns an empty answer to the client with the SERVFAIL error code.
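
The three verdicts can be summarised as a small decision procedure. This is a simplified model of my own rather than how BIND is actually structured: it assumes each step of the chain from the root down has already been classified as verified, proven unsigned (an NSEC covers the missing DS), or broken.

```python
from enum import Enum


class Verdict(Enum):
    SECURE = "secure"
    INSECURE = "insecure"
    BOGUS = "bogus"


RCODE_NOERROR = 0
RCODE_SERVFAIL = 2


def classify(links: list) -> Verdict:
    """Walk the chain from the root down and stop at the first link that isn't verified.

    Each link is one of:
      "signed"          - DS, DNSKEY and RRSIGs all verified at this step
      "proven_unsigned" - a signed NSEC proves there is no DS for the child
      "broken"          - a signature or proof was missing or invalid
    """
    for link in links:
        if link == "proven_unsigned":
            return Verdict.INSECURE
        if link != "signed":
            return Verdict.BOGUS
    return Verdict.SECURE


def response_for(verdict: Verdict) -> dict:
    """What the recursor sends back to the client for each verdict."""
    if verdict is Verdict.BOGUS:
        return {"rcode": RCODE_SERVFAIL, "ad": False, "answer_included": False}
    return {"rcode": RCODE_NOERROR, "ad": verdict is Verdict.SECURE, "answer_included": True}


print(classify(["signed", "signed", "signed"]))   # full chain
print(classify(["signed", "proven_unsigned"]))    # child has no DNSSEC
print(classify(["signed", "broken", "signed"]))   # tampering or misconfiguration
```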

A note on DNSSEC adoption

Much like IPv6, DNSSEC is an important Internet standard that’s been around in some form since the late 1990s (although admittedly not really a useful one until 2010) yet still faces dismal adoption in the real world. Outside of some Scandinavian countries (who offer discounts on domain registration for using DNSSEC), the rate of DNSSEC-enabled domains is usually less than 5%. DNSSEC-validating resolvers are slightly more commonplace at around 25%.

With negative UI indicators and free certificates driving deployment of HTTPS, and the IPv4-pocalypse (slowly) driving IPv6 adoption, I don’t think it’s too far-fetched to think that at some point browser and device makers will add mechanisms to push DNSSEC adoption – Chrome and Firefox already have DoH auto-upgrade, so they’re clearly keen to make progress in this space.

If you’re starting with a new domain, or looking to move your DNS hosting to a new provider, it’s worth putting in the little bit of effort to find a provider which supports DNSSEC. That’s not just because of some vague potential for issues in the future that I’m theorising on, but because it offers a great (and usually quite easy) improvement to the security of your domain, whether or not you handle sensitive information (going back to the same idea as Does my site need HTTPS?).

A lot of the big providers do support DNSSEC, so have a look at your current provider – it could be as simple as a single option!

And if you’re interested to see if your device is receiving DNSSEC-validated answers, then have a look at the great Internet.nl tool from the Dutch Internet Standards Platform. The site also has tools to test a domain name, in addition to DNSViz.

Again, I’m open to answering questions in the comments or on Twitter. I’ll be back with part 3 soon :)

  1. This note will probably make more sense once you’ve read the whole post, but there’s a nuance here:

    In general, only the authoritative nameservers for a given zone can contain records for that zone. For example, the nameserver for xyz won’t return records for blog.tugzrida.xyz, because tugzrida.xyz is a new zone, delegated to a different server.

    What the xyz zone does need to contain is NS, DS, NSEC, and possibly glue records for tugzrida.xyz, however it’s not really considered authoritative for them – the authoritative records for tugzrida.xyz are in tugzrida.xyz’s authoritative nameservers.

    Because they’re not authoritative for them, parent zones will not contain RRSIG records covering a child’s NS or glue records. An exception to this rule is made for a child’s DS and NSEC records as they must be signed in order to maintain the chain of authenticity. If the key in the DS matches the key in the child’s DNSKEY, and that key is then used to sign the zone, then the NS records must be correct, or at the very worst, pointing to the wrong servers but which happen to be serving the unaltered zone anyway.
