DNS Failure in Web Test but not in DNS Experience Test


Overview

In rare cases, DNS failures occur in Web tests but not in DNS Experience tests. The DNS Experience test does not cache results from previous test runs (except TLD records), so it does not reuse the records returned by one query for the next query. Web and API tests, however, rely on a DNS resolver, which caches records according to each record's TTL and reuses them until they expire.
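The caching behavior of the resolver can be illustrated with a minimal sketch. It assumes a simple in-memory cache keyed by name; the class name TtlCache and its methods are hypothetical and do not describe the resolver actually used by the tests.

```python
import time


class TtlCache:
    """Minimal TTL-based record cache (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so expiry can be demonstrated without waiting.
        self._clock = clock
        self._entries = {}  # name -> (expires_at, records)

    def put(self, name, records, ttl):
        """Store records for `name` for `ttl` seconds."""
        self._entries[name] = (self._clock() + ttl, list(records))

    def get(self, name):
        """Return cached records, or None if missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        expires_at, records = entry
        if self._clock() >= expires_at:
            # The TTL has elapsed: drop the entry so the next lookup re-queries.
            del self._entries[name]
            return None
        return records
```

A test that never calls put() behaves like the DNS Experience test and always queries fresh. A resolver built on a cache like this keeps answering from whatever it stored, good or bad, until the TTL runs out.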

Explanation

A common cause of such failures is an inadvertently poisoned cache containing NS records that point to servers that are not working. Here is a simple example:

The DNS resolver queries the gTLD servers for the domain. The answer includes two authoritative NS records with a TTL of 300 seconds:
ns00.example.net   12.34.56.78
ns01.example.net   23.45.67.89

Next, the DNS resolver queries one of the authoritative name servers and gets back not only the A records for the domain but also the authoritative NS records, which override the ones from the gTLD servers:
ns00.example.net   12.34.56.78
ns01.example.net   23.45.67.89
ns02.example.net   34.56.78.90
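The override in this step can be sketched as follows. This is a minimal illustration, assuming a resolver that ranks cached data by where it came from, in the spirit of the data ranking described in RFC 2181. The names REFERRAL, AUTHORITATIVE, and store_ns are hypothetical, and TTL handling is left out for brevity.

```python
# Hypothetical trust levels: a higher value means more trustworthy data.
REFERRAL = 1       # NS set learned from the parent (gTLD) referral
AUTHORITATIVE = 2  # NS set returned by the zone's own name servers


def store_ns(cache, zone, ns_set, trust):
    """Cache ns_set for zone unless the cached set is more trustworthy.

    Returns the NS set that is in the cache after the call.
    """
    cached = cache.get(zone)
    if cached is None or trust >= cached[0]:
        cache[zone] = (trust, dict(ns_set))
    return cache[zone][1]


cache = {}

# Step 1: the gTLD referral lists two name servers.
store_ns(cache, "example.net", {
    "ns00.example.net": "12.34.56.78",
    "ns01.example.net": "23.45.67.89",
}, REFERRAL)

# Step 2: the authoritative answer lists three, and replaces the referral set.
current = store_ns(cache, "example.net", {
    "ns00.example.net": "12.34.56.78",
    "ns01.example.net": "23.45.67.89",
    "ns02.example.net": "34.56.78.90",
}, AUTHORITATIVE)
```

After the second call the cache holds all three name servers, including ns02.example.net, even though the gTLD servers never listed it.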

In this case, the third NS record points to an IP address that does not respond to DNS queries. Every once in a while, the DNS resolver uses this authoritative NS record from its cache and fails to resolve the domain. It then either retries with the next NS record (resulting in slower DNS resolution) or simply fails (if the other records have expired).
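The two outcomes can be reproduced with a small simulation. This is a sketch under stated assumptions: query_ns is a hypothetical stand-in for a real DNS query, and the resolver is assumed to try cached name servers in random order. Real resolvers use their own selection and retry strategies.

```python
import random

# NS records from the example; the third IP does not answer DNS queries.
CACHED_NS = {
    "ns00.example.net": "12.34.56.78",
    "ns01.example.net": "23.45.67.89",
    "ns02.example.net": "34.56.78.90",
}
DEAD_IPS = {"34.56.78.90"}


def query_ns(ip):
    """Hypothetical stand-in for a DNS query; True means the server answered."""
    return ip not in DEAD_IPS


def resolve(cached_ns, retry=True, rng=random):
    """Return (success, attempts) for one lookup against the cached NS set."""
    servers = list(cached_ns.values())
    rng.shuffle(servers)  # assumed selection policy: random order
    attempts = 0
    for ip in servers:
        attempts += 1
        if query_ns(ip):
            return True, attempts
        if not retry:
            break
    return False, attempts


# All three records cached: the lookup succeeds, but sometimes only on the
# second attempt, which shows up as slower DNS.
ok, attempts = resolve(CACHED_NS)

# Only the bad record left in the cache: the lookup fails outright.
failed = resolve({"ns02.example.net": "34.56.78.90"})
```

With all three records cached, roughly one lookup in three hits the dead server first and needs a retry. Once the two working records have expired, every lookup fails until the bad record expires as well.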