
The Internet is down. Or, is it? (PART 2)

Other Name Server Problems
Another common nslookup error you might run into is this:

$ nslookup web1
Server:      10.2.2.2
Address:     10.2.2.2#53

** server can't find web1: NXDOMAIN
Here my name server at 10.2.2.2 responded to me but told me it couldn't find the record for server web1. This error could mean that I don't have web1's proper domain name in my DNS search path. If you don't specify a host's fully qualified domain name (for instance, web1.mysite.com) but instead use the shorthand form of the hostname, your system will check /etc/resolv.conf for domains in your DNS search path. It then will add those domains one by one to the end of your hostname to see if it resolves. The DNS search path is the line in /etc/resolv.conf that starts with the word search:
search example.net example2.net
nameserver 10.2.2.2

In my case, when I search for web1's IP address, my system will first search for web1.example.net, and if that has no records, it will search for web1.example2.net. If you want to test whether this is the problem, simply run nslookup again but with the fully qualified domain name (such as web1.mysite.com). If it resolves, either make sure you always use the fully qualified domain name when you access that server, or add that domain to the search path in /etc/resolv.conf.
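To see which names the search path would produce for a short hostname, a small helper along these lines can list them. This is only a sketch: search_candidates is a name I made up, it reads only the search line, and it ignores resolver options such as ndots as well as the final attempt at the bare name.

```shell
# search_candidates is a hypothetical helper, not a standard tool.
# Usage: search_candidates HOST [RESOLV_CONF]
# Prints HOST with each domain from the "search" line appended, in order.
search_candidates() {
    host=$1
    conf=${2:-/etc/resolv.conf}
    # If the file has several "search" lines, only the last one is used.
    awk -v host="$host" '
        $1 == "search" { n = NF; for (i = 2; i <= NF; i++) d[i] = $i }
        END { for (i = 2; i <= n; i++) print host "." d[i] }
    ' "$conf"
}
```

With the resolv.conf shown above, running search_candidates web1 would print web1.example.net followed by web1.example2.net, which are the names you could then feed to nslookup one at a time.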
If you run nslookup against the fully qualified domain name and still get the same NXDOMAIN error, the problem is with the name server itself. Troubleshooting the full range of DNS server problems is beyond what I can reasonably fit in this column, but here are a few steps to get you started. If your DNS server is supposed to hold the record you are looking for, examine its zone records to make sure that hostname exists. If, on the other hand, you are looking up a name in a domain your server holds no records for (say, www.linuxjournal.com), it's possible your DNS server isn't allowing recursive queries from your host, or at all. You can test that by trying to resolve some other remote host on the Internet. If that doesn't resolve either, it's probably a recursion setting. If it does resolve, the problem might very well be with the original remote site's DNS server.
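One way to check the recursion setting more directly is to look at the flags in a dig response, since a server that offers recursion sets the "ra" (recursion available) flag in its reply header. Note that dig is a tool I'm swapping in here, not one used above, and recursion_available is a hypothetical helper name. Treat this as a rough sketch.

```shell
# recursion_available is a hypothetical helper, not a standard tool.
# It reads dig output on stdin and succeeds (exit status 0) only if the
# response header carries the "ra" flag.
recursion_available() {
    # The header line looks like:
    #   ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
    awk '
        /^;; flags:/ {
            sub(/^;; flags:/, "")   # drop the label
            sub(/;.*/, "")          # drop the counters after the flags
            for (i = 1; i <= NF; i++) if ($i == "ra") found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}
```

Against the name server from the earlier example, you might run: dig @10.2.2.2 www.linuxjournal.com | recursion_available && echo "recursion offered". If nothing is printed, the server either didn't answer or didn't advertise recursion to your host.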
Test General Internet Routing
If after all these tests you find that your DNS servers are working fine but you still can't access the remote server, the final step is to perform another traceroute like the one above, only directly against the remote server. For instance, if you wanted to test your route to www.linuxjournal.com, the traceroute might look like the following:
$ traceroute www.linuxjournal.com
traceroute to www.linuxjournal.com (76.74.252.198), 30 hops max, 60 byte packets
 1  10.1.1.1 (10.1.1.1)  1.016 ms  2.222 ms  2.308 ms
 2  75-101-46-1.dsl.static.sonic.net (75.101.46.1)  6.916 ms  7.389 ms  8.386 ms
 3  921.gig0-3.gw.sjc2.sonic.net (75.101.33.221)  11.265 ms  12.435 ms  13.050 ms
 4  108.ae0.gw.equinix-sj.sonic.net (64.142.0.73)  13.846 ms  15.233 ms  15.390 ms
 5  GIG2-0.sea-dis-2.peer1.net (206.81.80.38)  35.149 ms  36.272 ms  36.944 ms
 6  oc48.so-2-1-0.sea-coloc-dis-1.peer1.net (216.187.89.190)  37.340 ms  27.884 ms  27.266 ms
 7  10ge.ten1-2.sj-mkp16-dis-1.peer1.net (216.187.88.202)  28.421 ms  29.014 ms  29.688 ms
 8  10ge.ten1-2.sj-mkp2-dis-1.peer1.net (216.187.88.134)  30.903 ms  31.015 ms  31.804 ms
 9  10ge-ten1-3.la-600w-cor-1.peer1.net (216.187.88.130)  40.840 ms  41.279 ms  42.069 ms
10  10ge.ten1-1.la-600w-cor-2.peer1.net (216.187.88.146)  42.587 ms  43.710 ms  44.921 ms
11  10ge-ten1-2.dal-eqx-cor-1.peer1.net (216.187.124.122)  81.702 ms  82.959 ms  83.934 ms
12  10ge-ten1-1.dal-eqx-cor-2.peer1.net (216.187.124.134)  74.876 ms  72.454 ms  72.798 ms
13  10ge-ten1-3.sat-8500v-cor-2.peer1.net (216.187.124.178)  80.224 ms  81.872 ms  82.569 ms
14  216.187.124.110 (216.187.124.110)  83.499 ms  84.162 ms  85.048 ms
15  www.linuxjournal.com (76.74.252.198)  85.484 ms  86.461 ms  87.153 ms

In this example, I'm 15 hops (or routers) away from the www.linuxjournal.com server. This is an example of a successful trace, but if you ran the same command and the output trailed off into rows of asterisks without ever reaching your destination, and you couldn't ping www.linuxjournal.com directly either, the problem could be an Internet routing issue between you and the remote network. Unfortunately, it's probably something outside your control, but fortunately, these sorts of problems tend to resolve themselves pretty quickly, so just keep trying.
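If the trace is long, it can help to pull out just the hops where every probe timed out. The following is a minimal sketch that assumes the usual Linux traceroute layout shown above. silent_hops is a hypothetical helper name of my own.

```shell
# silent_hops is a hypothetical helper, not a standard tool.
# It reads traceroute output on stdin and prints the hop number of every
# line whose probes all came back as asterisks.
silent_hops() {
    awk '
        # Hop lines start with a number; the banner line does not.
        $1 ~ /^[0-9]+$/ && NF > 1 {
            silent = 1
            for (i = 2; i <= NF; i++) if ($i != "*") silent = 0
            if (silent) print $1
        }
    '
}
```

You would use it as: traceroute www.linuxjournal.com | silent_hops. A silent hop or two in the middle of a trace that still reaches the destination is usually harmless, since some routers simply don't reply. It's a run of silent hops continuing to the end of the output that points to a routing problem.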
If, on the other hand, your traceroute command was successful, but the remote site still didn't work, go back to the steps I discussed in my previous column on how to use telnet and nmap to test whether a remote port is open. It actually could be that the remote server is down (hey, it happens to the best of us) or that someone has configured a firewall to block you from that remote server.
I hope this series has kindled (or rekindled) your interest in troubleshooting under Linux. One of the things I love about Linux is how little it hides from you about how it works and how many troubleshooting tools it provides when things do go wrong. If this has piqued your interest, there are many more troubleshooting avenues for you to explore—from DNS servers like I mentioned above, to troubleshooting just about any type of service. Also, if you have any other great tools or techniques you use to track down these problems, drop me a line. I'm always on the lookout for tools to solve problems faster.
