If you have problems the first few times you set up DNS, you are not alone. Most people run into trouble at some point, until they've worked enough with DNS to figure out where problems are likely to hide. In this section, we show you some of the most common DNS-related problems and offer some tips on getting your service running again.
Caution: If you use the Red Hat GUI configuration tool redhat-config-bind, you may run into serious trouble. This tool overwrites your regular files and can make it difficult to diagnose problems. In addition, once you commit to working with the GUI tool, you cannot return to DNS configuration at the command line. (Think of it like switching to synthetic oil in your car.) Other GUI tools may hide crucial data in nonstandard locations or might even fail to parse all the options available in the service. We strongly recommend that you work with BIND9 and DNS services at the command line with a text editor.
Luckily, most DNS problems can be resolved with regular command line programs. Traditionally, the nslookup program has been the primary troubleshooting choice, but the newer program dig has quickly supplanted it. The output from dig provides a great deal of information that can help you fix your DNS server issues quickly and accurately.
For example, you can use dig to "walk" the DNS tree for a given domain, as demonstrated earlier in this chapter. You can also use dig to complete entire zone transfers from a specified name server, which is a very useful tool (or security check). To do so, issue this command:
# dig example.com axfr @10.1.1.1
; <<>> DiG 9.2.2-P3 <<>> example.com axfr @10.1.1.1
;; global options: printcmd
example.com. 86400 IN SOA example.com. tom.yahoo.com.example.com. 2004011824 10800 900 604800 86400
example.com. 86400 IN A 10.1.1.1
example.com. 86400 IN NS ns.example.com.
example.com. 86400 IN NS ns2.example.com.
ftp.example.com. 86400 IN CNAME example.com.
mail.example.com. 86400 IN CNAME example.com.
ns.example.com. 86400 IN A 10.1.1.1
ns2.example.com. 86400 IN A 192.168.128.3
webdav.example.com. 86400 IN CNAME example.com.
www.example.com. 86400 IN CNAME example.com.
example.com. 86400 IN SOA example.com. tom.yahoo.com.example.com. 2004011824 10800 900 604800 86400
;; Query time: 3 msec
;; SERVER: 10.1.1.1#53(10.1.1.1)
;; WHEN: Mon Jan 19 01:21:40 2004
;; XFR size: 12 records
While this is useful if you are the administrator of example.com, think how much trouble this could cause if someone else were able to suck down all your unsecured DNS records. Handing over a complete zone record, which contains every IP address on the network, is not high on our list of Secure Administrative Policies.
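To get a feel for what a transfer like this gives away, the following Python sketch pulls every hostname and address pair out of saved dig output. It assumes the five-column record layout shown in the transfer above (name, TTL, class, type, data); the function name exposed_hosts and the shortened sample are made up for illustration and are not part of dig or BIND.

```python
def exposed_hosts(axfr_output):
    """Return (hostname, address) pairs for every A record in dig AXFR output."""
    hosts = []
    for line in axfr_output.splitlines():
        line = line.strip()
        # Skip blank lines and dig's own comment lines, which start with ';'.
        if not line or line.startswith(";"):
            continue
        fields = line.split()
        # Assumed layout: name, TTL, class, type, data.
        if len(fields) >= 5 and fields[2] == "IN" and fields[3] == "A":
            hosts.append((fields[0], fields[4]))
    return hosts


# A shortened copy of the transfer shown above.
sample = """\
; <<>> DiG 9.2.2-P3 <<>> example.com axfr @10.1.1.1
example.com. 86400 IN A 10.1.1.1
example.com. 86400 IN NS ns.example.com.
ns2.example.com. 86400 IN A 192.168.128.3
www.example.com. 86400 IN CNAME example.com.
;; XFR size: 12 records
"""

for name, address in exposed_hosts(sample):
    print(name, address)
```

Run this against a transfer from your own server, from a machine that should not be allowed to transfer the zone. If it prints anything at all, your zone transfer settings probably need tightening.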
Finding DNS problems can be a bit tricky. You need to think about the tools you use, the settings in /etc/named.conf versus settings in zone files or slave server settings, the zone transfer settings you've chosen, and what you're resolving against. Not to mention things like iptables and firewalling rules!
The entries in the /var/log/messages log file can be very helpful in narrowing down possible solutions. The remainder of this section offers solutions to common DNS service problems.
One common DNS problem involves slave name servers. If you change the master example.com.zone file and restart the service, but the slave name server does not also update itself, external DNS requests might fail or receive the wrong information. To solve this problem, think about how the master server works.
When a zone file is changed and named restarts, the daemon sends a NOTIFY message that should prompt the slave server to check the zone's serial number and request a zone transfer. To see whether your named did this, check your log files:
# tail /var/log/messages
...
Jan 19 02:33:47 localhost named[7665]: zone example.com/IN: loaded serial 2004011825
Jan 19 02:33:47 localhost named[7665]: zone localhost/IN: loaded serial 42
Jan 19 02:33:47 localhost named[7665]: running
Jan 19 02:33:47 localhost named[7665]: zone example.com/IN: sending notifies (serial 2004011825)
If the NOTIFY command was executed properly, the next line in the log should have been
Jan 19 02:20:58 localhost named[7528]: client 192.168.128.3#33301: transfer of 'example.com/IN': AXFR-style IXFR started
Since this line did not display, the slave server at 192.168.128.3 did not perform a zone transfer. Thus, there's a problem. Perhaps the slave server can't find the master server or there is another configuration error. Use dig to trace the existing configuration:
# dig ns2.example.com @10.1.1.1
...
;; ANSWER SECTION:
ns2.example.com. 86400 IN A 192.168.128.3
;; AUTHORITY SECTION:
example.com. 86400 IN NS ns.example.com.
example.com. 86400 IN NS ns2.example.com.example.com.
The A record is fine, but note the oddness in the NS records. Why is there a double domain error here? Open your zone file, and you'll see the problem:
        2004011825 ; serial
        3H ; refresh
        15M ; retry
        1W ; expiry
        1D ) ; minimum
@       1D IN NS ns.example.com.
        1D IN NS ns2.example.com
There it is, on the last line; or rather, there it isn't. Remember that you need to supply a trailing dot for every fully qualified domain name. Since this entry doesn't have a trailing dot, named appends the zone name to it, the NS record is broken, and your slave server can't update. Simply add the dot, save the file, and restart the service again.
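The log check from this example is easy to script. The Python sketch below reports zones for which named logged "sending notifies" but no later transfer. It assumes the two message formats quoted above; the function name zones_without_transfer and both regular expressions are illustrative, and other BIND versions may word these messages differently.

```python
import re

# Patterns modeled on the two log messages quoted in this section.
NOTIFY_RE = re.compile(r"zone (\S+)/IN: sending notifies \(serial (\d+)\)")
TRANSFER_RE = re.compile(r"client [\d.]+#\d+: transfer of '(\S+)/IN'")


def zones_without_transfer(log_text):
    """Return zones that sent notifies but never logged a later transfer."""
    pending = []
    for line in log_text.splitlines():
        notify = NOTIFY_RE.search(line)
        if notify:
            zone = notify.group(1)
            if zone not in pending:
                pending.append(zone)
            continue
        transfer = TRANSFER_RE.search(line)
        if transfer and transfer.group(1) in pending:
            # A slave picked up the zone after the notify went out.
            pending.remove(transfer.group(1))
    return pending


sample_log = """\
Jan 19 02:33:47 localhost named[7665]: zone example.com/IN: loaded serial 2004011825
Jan 19 02:33:47 localhost named[7665]: running
Jan 19 02:33:47 localhost named[7665]: zone example.com/IN: sending notifies (serial 2004011825)
"""

print(zones_without_transfer(sample_log))
```

A zone that stays on this list for more than a few seconds after a restart is a good candidate for the dig check shown above.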
With the explosion of domain name registrars across the world, the simple whois command isn't as immediately helpful as it used to be. For general use, whois is used with this syntax:
# whois domain-name
as in
# whois wiley.com
Domain Name: WILEY.COM
Registrar: REGISTER.COM, INC.
Whois Server: whois.register.com
Referral URL: http://www.register.com
Name Server: JWS-EDCP.WILEY.COM
Name Server: NS1.WILEYPUB.COM
Status: ACTIVE
Updated Date: 21-nov-2003
Creation Date: 12-oct-1994
Expiration Date: 11-oct-2011
However, simple whois is reliable only for domain names in the .com, .net, and .edu TLDs. To get a more accurate report of domain ownership, issue whois against a specific whois server. The following code block shows the command issued with three widely used whois servers:
# whois domain-name@whois.internic.net
# whois domain-name@whois.register.com
# whois domain-name@whois.geektools.com
If you can't get the result you need from one of these servers, and you're looking for a site in a different TLD, find the whois server for that domain's registrar of record. If you query that whois server, you should get the information you seek.
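If you need to query a particular whois server from a script, the exchange itself is simple: open a TCP connection to port 43, send the domain name followed by a line break, and read until the server closes the connection. The Python sketch below does that for the domain-name@server form used above. The function names split_target and whois_query are made up for this example, and falling back to whois.internic.net when no server is given is an assumption of the sketch, not a rule of the whois command.

```python
import socket


def split_target(target, default_server="whois.internic.net"):
    """Split 'domain-name@server' into (domain, server)."""
    domain, _, server = target.partition("@")
    # Assumption: fall back to a default server when none is given.
    return domain, server or default_server


def whois_query(target, timeout=10):
    """Send one whois query and return the server's reply as text."""
    domain, server = split_target(target)
    with socket.create_connection((server, 43), timeout=timeout) as conn:
        conn.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Example call (requires network access):
# print(whois_query("wiley.com@whois.register.com"))
```

Whether a given server answers, and in what format, depends entirely on the registrar that runs it.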
If your DNS server is up and running, everything may seem to be fine. However, if you add a new alias or address record at a later point, you may find that it won't load, no matter how many times you reload the zone files. Everything may seem to be in order, but clearly there is a problem. Bryan Bailey, a Rackspace Linux support sysadmin and RHCE, suggests the following approach.
Zone files are quite prone to user error. Think about the unusual syntax of entries in this file. You must use this syntax exactly when you add a new record, or the record will not load. As in the previous example, the trailing dot is the most common zone file omission.
The zone file shown here contains a CNAME record that will not load because it has a missing dot:
$TTL 38400
foo.com. IN SOA ns.foo.com. hostmaster.foo.com. (
        2003123166
        10800
        3600
        604800
        38400 )
foo.com. IN NS ns.foo.com.
foo.com. IN A 192.168.0.1
www IN CNAME foo.com.
mail IN CNAME foo.com.
pop3 IN CNAME foo.com.
smtp IN CNAME foo.com.
ftp IN CNAME foo.com.
mysubdomain IN CNAME foo.com
While named will reload the zone file without error, the added entry will never resolve correctly. Because the target has no trailing dot, named appends the zone name to it, so mysubdomain.foo.com. ends up as an alias for the nonexistent name foo.com.foo.com.
To fix this problem, just open the file in a text editor and add the dot to the final entry. Save the file, exit, and restart the service. Try to make a habit of checking the last line in every zone record to ensure that the trailing dot is there. For some reason, it's the final line that always seems to be the culprit.
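Because this mistake is so mechanical, a few lines of Python can catch most cases before you restart named. The sketch below flags NS, CNAME, and PTR records whose target contains dots but does not end with one. It is a rough heuristic, not a full zone file parser: it assumes every record spells out the IN class, it ignores MX and SOA records, and the function name missing_trailing_dots is made up for this example.

```python
# Record types whose data field is a single domain name (a subset for this sketch).
NAME_TYPES = {"NS", "CNAME", "PTR"}


def missing_trailing_dots(zone_text):
    """Return (line_number, line) for records whose dotted target lacks a final dot."""
    problems = []
    for number, raw in enumerate(zone_text.splitlines(), start=1):
        line = raw.split(";", 1)[0]  # drop comments
        fields = line.split()
        # Look for the class "IN" followed by one of the record types we check.
        for i in range(len(fields) - 2):
            if fields[i].upper() == "IN" and fields[i + 1].upper() in NAME_TYPES:
                target = fields[i + 2]
                # A dotted name with no final dot gets the zone name appended.
                if "." in target and not target.endswith("."):
                    problems.append((number, raw.strip()))
                break
    return problems


sample = """\
$TTL 38400
foo.com. IN SOA ns.foo.com. hostmaster.foo.com. (
        2003123166
        10800
        3600
        604800
        38400 )
foo.com. IN NS ns.foo.com.
foo.com. IN A 192.168.0.1
www IN CNAME foo.com.
mysubdomain IN CNAME foo.com
"""

for number, text in missing_trailing_dots(sample):
    print(f"line {number}: {text}")
```

A relative target such as www is perfectly legal, which is why the check only complains about names that already contain a dot. Treat each hit as something to look at, not as proof of an error.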
Allen Rouse, a Rackspace Linux support sysadmin who does a lot of DNS troubleshooting, offers the following tip for easy automation.
Problems like the missing trailing dot, as well as malformed PTR records, bad SOAs, and many other zone file abnormalities and typos, can be a real pain to track down. Rackspace sysadmins often use a special tool to automatically scan for and detect these zone file problems after making a zone file change but before restarting the customer's name server (to keep it from crashing on a bad zone file).
Try the DNS administrator tool dlint (www.domtools.com/dns/dlint.shtml). It's worth its weight in gold for the busy DNS administrator.
Sometimes administrators can't solve a DNS problem simply because they don't know where to find the right tool. There are a number of useful DNS troubleshooting tools on the web. If you can't find the answer on one of these sites, you should at least be able to find links to other resources that might solve your problem:
Traceroute tools: www.traceroute.org
Various web tools: http://geektools.com/
InterNIC Whois: www.internic.net/whois.html
Graphical traceroute: www.visualroute.com/server.html