Amazon Route 53 aims to provide low-latency DNS resolution to clients, in part by announcing its IP addresses via anycast from 50+ edge locations around the globe. Anycast routes each packet to the closest network location (in routing terms) that is “advertising” a given IP address. Queries to a nameserver therefore land at the location nearest the end user, out of all the locations advertising that nameserver's IP address, which lowers latency.
Amazon has thousands of nameserver names and groups all the nameservers under a given top-level domain into a “stripe.” This is where shuffle sharding comes in: “each Route 53 domain (hosted zone) receives four nameserver names[,] one from each [stripe]. As a result, it is unlikely that two zones will overlap completely across all four nameservers. In fact, we enforce a rule during nameserver assignment that no hosted zone can overlap by more than two nameservers with any previously created hosted zone.”
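The assignment rule in the quote above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not AWS's actual implementation: the stripe labels, the pool size `NAMESERVERS_PER_STRIPE`, and the helper `assign_nameservers` are all hypothetical, and the brute-force comparison against every earlier zone is chosen for clarity rather than efficiency.

```python
import random

# Hypothetical stripe labels and pool size, chosen for illustration only.
STRIPES = ("com", "net", "org", "co.uk")
NAMESERVERS_PER_STRIPE = 512
MAX_OVERLAP = 2  # rule from the quote: at most two shared nameservers


def assign_nameservers(existing, rng, max_attempts=1000):
    """Pick one nameserver per stripe for a new hosted zone.

    A candidate is rejected if it shares more than MAX_OVERLAP
    nameservers with any previously created zone.
    """
    for _ in range(max_attempts):
        candidate = frozenset(
            (stripe, rng.randrange(NAMESERVERS_PER_STRIPE)) for stripe in STRIPES
        )
        if all(len(candidate & other) <= MAX_OVERLAP for other in existing):
            return candidate
    raise RuntimeError("no assignment satisfying the overlap rule was found")


rng = random.Random(0)  # seeded so the sketch is reproducible
zones = []
for _ in range(200):
    zones.append(assign_nameservers(zones, rng))

# Largest number of nameservers shared by any two of the zones created above.
worst_overlap = max(
    len(a & b) for i, a in enumerate(zones) for b in zones[:i]
)
print(f"created {len(zones)} zones, worst pairwise overlap: {worst_overlap}")
```

Because every zone shares at most two of its four nameservers with any other zone, an event that takes out all four nameservers of one zone still leaves every other zone with at least two working nameservers.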
Each Route 53 edge location serves traffic for at least one stripe. For example, Amazon’s Sydney edge location could serve both the .com and .net stripes, and a single stripe can be served by more than one location at a time. A resolver's query to a given nameserver is answered by the closest location serving that nameserver's stripe: “For any given domain, in general, resolvers learn the lowest latency nameserver based upon the round trip time of the query (this technique is often called SRTT or smooth round-trip time). Over a few queries, a resolver in Australia would gravitate toward using the nameservers on the .net and .com stripes for Route 53 customers’ domains.”
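To make the SRTT behaviour in the quote above concrete, here is a minimal Python sketch of a resolver that keeps a smoothed round-trip time per nameserver and always queries the one with the lowest value. Everything in it is an assumption made for illustration: the class `SrttSelector`, the smoothing factor `alpha`, the nameserver labels, and the fixed RTT figures are hypothetical, and real resolvers differ in detail (many, for instance, decay the SRTT of unused servers so that they are re-probed from time to time).

```python
from collections import Counter


class SrttSelector:
    """Track a smoothed RTT per nameserver and prefer the lowest one."""

    def __init__(self, nameservers, alpha=0.3):
        self.alpha = alpha  # weight given to the newest sample (assumed value)
        # 0.0 marks "not measured yet", so every server is probed once.
        self.srtt = {ns: 0.0 for ns in nameservers}

    def pick(self):
        # Ties go to the first server in insertion order.
        return min(self.srtt, key=self.srtt.get)

    def record(self, ns, rtt_ms):
        old = self.srtt[ns]
        if old == 0.0:
            self.srtt[ns] = rtt_ms  # first sample seeds the estimate
        else:
            # Exponentially weighted moving average of the samples.
            self.srtt[ns] = (1 - self.alpha) * old + self.alpha * rtt_ms


# Hypothetical RTTs (ms) seen by a resolver in Australia: the .com and .net
# stripes are served from a nearby location, the other two from far away.
observed_rtt_ms = {"ns-com": 12.0, "ns-net": 15.0, "ns-org": 180.0, "ns-co-uk": 210.0}

selector = SrttSelector(observed_rtt_ms)
counts = Counter()
for _ in range(20):
    ns = selector.pick()
    counts[ns] += 1
    selector.record(ns, observed_rtt_ms[ns])

print(counts.most_common())
```

With these fixed inputs the resolver probes each nameserver once and then sends every remaining query to the lowest-latency one. With realistic jitter, the two nearby stripes would trade places from query to query while the distant ones are rarely chosen.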
Some resolvers choose randomly, but overall about 80% of resolvers use the lowest-RTT nameserver. Even so, AWS has refrained from advertising all four stripes from every edge location, “because edge locations can sometimes fail to provide resolution for a variety of reasons that are very hard to control: the edge location may lose power or Internet connectivity, the resolver may lose connectivity to the edge location, or an intermediary transit provider may lose connectivity.” Large-scale DDoS attacks are a further risk to plan for. By spreading the advertisements for its nameservers across different locations, Amazon limits the effect of a problem at any single edge location to the stripes advertised there, leaving a zone's nameservers on the other stripes reachable elsewhere.