ISC-TN-2006-2
B. Reid
ISC
December 22, 2006

A DNS performance testbed design

Copyright Notice

Copyright (C) 2006 Internet Systems Consortium, Inc. All Rights Reserved.

Abstract

Internet Systems Consortium (ISC) is building a testbed that will be used to make full-scale measurements of the performance of DNS servers and protocols on the same scale as global root and TLD servers. This brief document describes the design of that testbed and explains the design decisions behind it.

This work is sponsored in part by the National Science Foundation via grant NSF OARC DNS SCI CISE SCI-0427144.

1. Introduction

ISC has received an NSF subcontract from CAIDA at UC San Diego to design and build a DNS testbed and to perform certain tests with it. That subcontract has a fixed budget for the servers that form the testbed.

Our design challenge is to build a full-scale testbed that can process DNS requests at rates comparable to those seen by the most heavily-used server complexes, but to build it out of commodity server components, so that not only are measured transactions per second adequate, but transactions per second per dollar are very high.

An interesting side effect of this low-cost requirement is that we will be able to measure whether a certain protocol is "too expensive": if it runs at adequate speed on our testbed, then by definition it is not too expensive.

We are therefore endeavoring to produce the highest-performance testbed that can be built within the budget. Most of that performance will come from optimally-tuned server computers (see http://new.isc.org/proj/dnsperf/memtest.html), but the network and surrounding gantries must not degrade the server performance.

2. Design conditions and preconditions

A DNS reply datagram has room for 13 NS records in its Answer section. This means that name service for a zone can be provided by as many as 13 different servers. Our testbed will therefore contain 13 name servers.
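
As a rough check on the 13-record figure, the sketch below estimates the wire size of a reply that lists 13 name servers. It is only an illustration under stated assumptions: the 512-byte limit for UDP DNS messages and the name-compression rules come from RFC 1035 rather than from this note, and the root-style naming scheme (one-letter host labels under a shared suffix), the one IPv4 address record per server, and all function names are hypothetical choices made for the example.

```python
HEADER_LEN = 12      # fixed DNS header (RFC 1035)
RR_FIXED_LEN = 10    # TYPE + CLASS + TTL + RDLENGTH
POINTER_LEN = 2      # compression pointer
UDP_LIMIT = 512      # classic UDP DNS message limit (RFC 1035)


def name_len(name: str) -> int:
    """Wire length of an uncompressed domain name."""
    labels = [label for label in name.split(".") if label]
    return sum(1 + len(label) for label in labels) + 1


def response_size(n_servers: int, suffix: str = "root-servers.net") -> int:
    """Bytes in a reply with n NS records and n A records (assumed layout)."""
    # Header plus a question for the root zone (QNAME, QTYPE, QCLASS).
    size = HEADER_LEN + name_len(".") + 4
    for i in range(n_servers):
        if i == 0:
            rdata = name_len("a." + suffix)   # first target spelled out in full
        else:
            rdata = 2 + POINTER_LEN           # one-letter label + pointer to suffix
        size += name_len(".") + RR_FIXED_LEN + rdata
    # One A record per server: pointer as owner name, 4-byte IPv4 address.
    size += n_servers * (POINTER_LEN + RR_FIXED_LEN + 4)
    return size


size_13 = response_size(13)
print(size_13, size_13 <= UDP_LIMIT)
```

Under these assumptions a 13-server reply occupies 436 bytes, which fits within the 512-byte limit. Longer or less compressible server names would consume that headroom quickly.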

There are techniques based on 'anycast' that can be used to implement multiple copies of a single IP address, but there have been very few studies of the reliability of anycast mechanisms, and their value decreases as the zone being served is updated more frequently. We shall not consider anycast in our testbed architecture, because we hope to be able to measure the effects of many updates per second; that rate would make anycast mirroring impossible.

Servers in global-scale name server complexes typically have two network connections. One is the address advertised to the public; it is used for answering name service requests. The other is for administration of the server: updating zone files or software, or performing maintenance tasks. For this reason, each of our servers will have two network connections.

We wish to make each server as fast as possible while staying within budget. In our measurements, the BIND server software has been limited by memory bandwidth and processor speed. If a server has enough RAM, it never needs to reference its disk, and the speed at which it can answer queries depends on the speed at which it can access and process the data in its RAM. In general, then, the best server computer to use is one that has as much RAM as possible and memory access that is as fast as possible. Current midrange memory management chips can handle up to 16 GB on one computer, so that is the amount of RAM to expect.

Economy of maintenance argues strongly that all of the servers be sufficiently alike to have interchangeable parts. Tractability of statistical analysis argues strongly that all 13 of the DNS servers and the 1 spare benchmark at the same speed, so that a quantity such as "3 servers" will mean something. Taken together, this means that it would be best to have 18 identical servers (13 DNS servers, 1 spare, 3 load generators, and 1 updater), though it is less important that the load generators be identical and not at all important that the updater match.
Depending on price, we can get by with 14 high-performance servers and 4 utility servers.

3. Testbed structure

Our test data set (http://new.isc.org/proj/dnsperf/datastream.html) averages 2400 queries per second per server, and bursts to about 4000 queries per second per server. A single load generator computer can send DNS queries at about 100,000 per second; delivering 4000 queries per second to each of 13 servers requires an aggregate rate of only 52,000 queries per second. With three load generators we will be able to produce test loads that are roughly 6 times our test data set's maximum burst rate and more than 9 times its average rate.

To route queries from 3 generators to 13 servers, a high-performance layer-2 switch is required. Because the load generators might have to present load faster than 100 Mbit/s, the switch will need gigabit capability and must be non-blocking. There is no need for a managed switch in this application, because the switch is not remote. Several vendors sell inexpensive non-blocking 24-port gigabit ethernet switches; we do not expect selecting one to be difficult.

All modern servers of this class come equipped with two gigabit ethernet ports, and gigabit patch cables are not at all expensive. We will therefore use gigabit interconnects everywhere.

In the diagram below, the two ethernet switches are 24-port non-blocking unmanaged switches, the 13 DNS servers and their understudy are whatever machine we select from our benchmarks, and the remaining 4 are whatever we can cobble together.

(see http://www.isc.org/pubs/tn/Testbed.png)

If future applications of this testbed require variable propagation delays to the 13 DNS servers (to simulate geographic distribution), it is straightforward to add delay lines in the ethernet cabling between the switch and the servers themselves.

4. Measurement

There is no tap point in this architecture at which a passive monitor can record all network data. In any case, a passive monitor cannot determine the internal details of server performance, and measuring those details is one of the primary intended uses of the testbed. The data will therefore be stored on disk inside each server during an experiment and then copied out to the Updater server once the experiment has ended. From there, it will be copied to workstations on the local network for analysis and archiving.

The amount of data logging needed in each server will depend on the design of the experiment. It is up to the experiment designer to understand, and possibly compensate for, the effects of the logging on server performance.

Author's Address

Brian Reid
Internet Systems Consortium, Inc.
950 Charter Street
Redwood City, CA 94063
US

Phone: +1 650 423-1327
EMail: Brian_Reid@isc.org
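
The load-generator arithmetic in Section 3 can be checked with a short sketch. All numeric figures below are taken from that section; the variable and function names are hypothetical and exist only for this illustration.

```python
SERVERS = 13                   # DNS servers under test
AVG_QPS_PER_SERVER = 2400      # average rate in the test data set
BURST_QPS_PER_SERVER = 4000    # approximate burst rate in the test data set
GENERATOR_QPS = 100_000        # approximate output of one load generator
GENERATORS = 3                 # load generators in the testbed


def headroom(per_server_qps: int, generators: int = GENERATORS):
    """Return (aggregate demand in qps, generator capacity / demand)."""
    aggregate = per_server_qps * SERVERS
    capacity = generators * GENERATOR_QPS
    return aggregate, capacity / aggregate


burst_aggregate, burst_ratio = headroom(BURST_QPS_PER_SERVER)
avg_aggregate, avg_ratio = headroom(AVG_QPS_PER_SERVER)

print(f"burst:   {burst_aggregate} qps aggregate, headroom {burst_ratio:.1f}x")
print(f"average: {avg_aggregate} qps aggregate, headroom {avg_ratio:.1f}x")
```

With the figures from Section 3, the burst load is 52,000 queries per second in aggregate, and three generators can exceed it by a factor of about 5.8; the average load of 31,200 queries per second is exceeded by a factor of about 9.6.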