Mon, Aug 18, 2008

DNS Hell

Posted to Technology category

I spent most of the weekend fighting what turned out to be some really annoying interactions between VMware, the DHCP client on Linux, and the DNS resolver. But I have finally achieved a state where all four of the 'machines' are linked by SMB and the two Linux 'machines' are linked by NFS.

I also spent an unpleasantly large part of yesterday evening, early this morning and later this morning (after sleeping) repairing a major deletion of installed packages on my host system (due to my clicking yes at the wrong time because I was tired). Fortunately, I did a full backup of the current /home and /etc over Saturday night/Sunday morning, including a capture of the installed package list, so recovery was straightforward, just slow.

Some of my attempted fixes had domino effects. At the moment, yrhel5 (the Red Hat image) is just running off its hosts file, not real DNS, but that is enough for it to connect to the other test images. The fact that it joined the domain before its DNS went down seems to be enough for Samba to communicate.

The original problem I was struggling with was that network configuration information in the environment hosting my VMware installation kept getting stepped on and reverting to bad values, usually at boot time, but also at other times. Because of this, the machine was reporting an incomplete value for its own ID, and this was preventing it from joining the Active Directory domain. I now have the IP addresses of my ISP's DNS servers memorized, because I have typed them so many times.

Setting the IP to static instead of DHCP didn't help: the resolv.conf and hosts data just got cleared instead of set to bad values. I am not entirely sure of all the places this is coming from, but I strongly suspect that VMware is treating the host system as a DHCP client even though it has a fixed IP address, and then the DHCP client software on ykchaua fills in the files with the (empty) data from the VMware DHCP servers.
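A quick way to tell whether resolv.conf has been wiped again is to look for surviving nameserver lines after each boot. Here is a minimal sketch in Python; the function names are my own invention for illustration, and any addresses you feed it are whatever your own file holds.

```python
from pathlib import Path


def nameservers(resolv_text):
    """Return the addresses found on 'nameserver' lines of resolv.conf text."""
    servers = []
    for line in resolv_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines (starting with '#' or ';').
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


def check_resolv_conf(path="/etc/resolv.conf"):
    """Print whether the file at `path` still lists any nameservers."""
    p = Path(path)
    text = p.read_text() if p.exists() else ""
    found = nameservers(text)
    if found:
        print("nameservers:", ", ".join(found))
    else:
        print("no nameservers in", path, "(file may have been cleared)")
    return found
```

Calling check_resolv_conf() from a boot-time script would at least show when the file gets clobbered, though not which program did it.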

I have installed two packages that seem to be helping with this. One is called resolvconf and is supposed to handle the resolv.conf file in a more structured way. I'm not sure at this point whether it is doing any good, but it does not seem to be making things any worse.

The second package, which does seem to be helping, is called dnsmasq. It sets up a small caching DNS server on a Linux system, using /etc/hosts as data, and you can tell it to ignore resolv.conf and use a different file to define the upstream DNS servers. It can also act as a DHCP server, and the DNS piece knows about (and can provide DNS mappings for) any machines that get their IP addresses from the DHCP server side. I'm not using the DHCP server piece at the moment: I'm trying very hard to get DHCP out of the picture as much as possible to get things stabilized.
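For the record, the dnsmasq setup described above boils down to roughly the configuration this Python sketch writes out. The resolv-file and listen-address options are, as far as I know, standard dnsmasq settings; the function name and the path /etc/resolv.dnsmasq are placeholders I made up, not anything the package requires.

```python
def render_dnsmasq_conf(upstream_file, listen_address="127.0.0.1"):
    """Build dnsmasq.conf text for a DNS-only cache with its own upstream file.

    upstream_file is a hypothetical path holding 'nameserver' lines for the
    real upstream DNS servers, kept apart from /etc/resolv.conf.
    """
    lines = [
        "# Read upstream DNS servers from this file instead of /etc/resolv.conf",
        f"resolv-file={upstream_file}",
        "# Answer queries only on this address",
        f"listen-address={listen_address}",
        "# /etc/hosts is read by default, so it needs no option here",
        "# No dhcp-range line, so the DHCP server side stays switched off",
    ]
    return "\n".join(lines) + "\n"


# Example: print a config that points at a separate upstream file.
print(render_dnsmasq_conf("/etc/resolv.dnsmasq"), end="")
```

The point of the separate upstream file is that whatever keeps rewriting resolv.conf has no reason to touch it.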

I am going to shut down everything tonight. It will be interesting to see what breaks when I boot back up in the morning. Once all the images are talking to each other again, I will load the Rational tools from the release areas I set up today and actually begin developing and testing the software I need to be working on.

In the meantime, I'm going to google for more information about dnsmasq. Maybe there are hints about using it with VMware.

Posted at: 9:47 pm MDT

