Windows Server 2008 R2 DNS Issues
I recently upgraded my home Windows Server 2008 Domain Controller to R2. The upgrade process itself wasn’t too much work but was a bit more than ‘next, next, finish’ because the AD schema needed to be updated and the installer required that WSUS be uninstalled first. But, those weren’t a big deal.
However, after the install, I got the strangest behavior. Visiting some websites like www.microsoft.com, www.bing.com, www.windowsupdate.com and a number of other Microsoft websites didn’t work. However, other websites worked perfectly. In fact, www.google.com still worked. It’s almost as if Microsoft decided they didn’t want to grow their search engine market share anymore and would start blocking their visitors. :)
What made it even more confusing was that the browser timed out and reported a DNS error, yet pinging the same DNS name worked.
(feel free to skip to the bottom for the fix if you don’t want to read the details)
I did some searching and didn’t find an answer (although now that I know which search terms to use, I see that others have run into this too). I tried all the basic troubleshooting methods to no avail.
I skimmed some R2 release notes I found and I saw that there were EDns (EDNS0) changes with R2 but it was pretty vague. EDns is a relatively new DNS protocol extension that is still coming of age. Later I realized that I was on to something here.
I realized that I would need to fire up Network Monitor to get the story. After running Network Monitor, an issue was immediately apparent as seen from the following screen shot snippet:
First, I wondered why my search for bing.com returned search.ms.com.edgesuite.net. The answer to that wasn’t hard to find. Those are the DNS names of the Akamai CDN which Microsoft uses for a lot of their sites. The real issue there is the “Response – Format error”.
I looked at the request and the results for a while and it seemed straightforward, so I did a network trace on a working server and found that R2 added some extra information. Notice the bottom line of the following image with the “AdditionalRecord: of type OPT on class Unknown DNSClass”. The network trace on the working server didn’t have that.
So, I knew at this point that R2 was adding something that the Akamai DNS servers didn’t like. I did a search for OPT and discovered that OPT is used in EDns. I found a registry setting called EnableEDNSProbes which disables EDNS when set to 0. After setting that and restarting the DNS Server service, everything worked perfectly. I set it back again and it stopped working, so I knew I had narrowed it down.
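To make the difference between the two traces concrete, here is a minimal Python sketch that builds the same query with and without the OPT record. It only illustrates the wire format as described in the EDNS0 specification, not what the Windows DNS Server actually runs. The function name build_query, the query ID and the 4096-byte payload size are assumptions chosen for the example.

```python
import struct

OPT_TYPE = 41  # record type number assigned to the EDNS0 OPT pseudo-record


def build_query(name: str, use_edns: bool,
                query_id: int = 0x1234, udp_size: int = 4096) -> bytes:
    """Build a DNS A-record query, optionally with an EDNS0 OPT record.

    query_id and udp_size are illustrative defaults, not values taken
    from the traces in this post.
    """
    arcount = 1 if use_edns else 0
    # Header: ID, flags (recursion desired), QDCOUNT=1, ANCOUNT=0,
    # NSCOUNT=0, ARCOUNT (1 only when the OPT record is appended).
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, arcount)

    # Question: length-prefixed labels, a zero byte for the root,
    # then QTYPE=A (1) and QCLASS=IN (1).
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)

    if not use_edns:
        return header + question

    # OPT pseudo-record: root name, TYPE=41, CLASS reused to carry the
    # UDP payload size, TTL reused for extended flags (all zero here),
    # and RDLENGTH=0 because no options are attached.
    opt = b"\x00" + struct.pack("!HHIH", OPT_TYPE, udp_size, 0, 0)
    return header + question + opt


plain = build_query("www.bing.com", use_edns=False)
edns = build_query("www.bing.com", use_edns=True)
print(len(edns) - len(plain))  # size of the extra OPT record in bytes
```

In an OPT record the CLASS field carries the requester's UDP payload size rather than a real class, which is presumably why Network Monitor labelled it "Unknown DNSClass".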
While searching for information on EDns, I discovered that some DNS servers will attempt an EDNS probe and, if it fails, try again with a plain query. That allows them to always work regardless of what the other DNS servers support. However, after testing I found that Microsoft DNS doesn’t do that. EDNS can either be ‘on’ or ‘off’. Bummer, I thought that was a good idea.
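The probe-then-retry behaviour described above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not how any particular DNS server implements it: the send callable is hypothetical (it stands in for whatever actually puts the query on the wire), and the sketch treats only a "Format error" response as the signal to retry, whereas a real resolver may also have to handle timeouts and other error codes.

```python
from typing import Callable, Tuple

FORMERR = 1  # DNS response code 1, "Format error"


def rcode(response: bytes) -> int:
    """Return the response code: the low 4 bits of the fourth header byte."""
    return response[3] & 0x0F


def resolve_with_fallback(send: Callable[[bool], bytes]) -> Tuple[bytes, bool]:
    """Try the query with EDNS first and retry without it on FORMERR.

    `send` is a hypothetical callable: send(True) issues the query with an
    OPT record, send(False) issues the plain query. Returns the response
    and a flag saying whether EDNS was used for it.
    """
    response = send(True)
    if rcode(response) == FORMERR:
        # The remote server rejected the OPT record, so ask again plainly.
        return send(False), False
    return response, True


def fake_legacy_server(use_edns: bool) -> bytes:
    """Stand-in for a server that rejects EDNS queries (header only)."""
    flags = 0x8181 if use_edns else 0x8180  # QR + RD + RA, RCODE 1 or 0
    return (0x1234).to_bytes(2, "big") + flags.to_bytes(2, "big") + bytes(8)


response, used_edns = resolve_with_fallback(fake_legacy_server)
print(used_edns, rcode(response))
```

Passing the transport in as a callable keeps the retry logic separate from the networking code, so the same function can be exercised against a fake server like the one above.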
Testing further I discovered that it’s not enabled by default on Windows Server 2008 RTM. I tried on another R2 server that wasn’t in production yet and confirmed that the issue appeared there too. So, the issue wasn’t that something changed with EDns, it’s simply that it was enabled in R2 for the first time.
It failed in the web browser but worked with a ping because the browser followed a redirect and failed on the redirected address, not the original one. The ping didn’t follow the redirect, so the failure never occurred.
It appears that the same issue occurred when Windows Server 2003 was released: http://support.microsoft.com/kb/832223. I don't remember that occurring or being a big deal, so I suspect that Microsoft changed the default in later service packs or hot fixes.
Conclusion
It appears that the Internet isn’t fully up to date and ready to use EDns quite yet. The solution for this is to disable EDns and wait another year or two until Akamai and other DNS servers catch up, or until Microsoft releases a hot fix to support the fallback option I mentioned above.
Note that this isn’t a problem for most Windows Server 2008 R2 member servers. It’s only a problem for DNS *servers* that do recursive lookups. In other words, likely only your domain controller will be affected, if that is where your DNS Server role lives.
Fix
To disable EDns, you can either use the command prompt or edit the registry.
The command-prompt change takes effect immediately, with no restart of DNS required. If you edit the registry instead, make sure to restart the DNS Server service afterward.
Command prompt:
No restart is needed. It takes effect immediately.
or Registry:
Restart the DNS Server service for it to take effect.
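As a rough sketch of both routes, the following Python script wraps the two approaches. Treat every specific in it as an assumption to verify on your own server: that dnscmd.exe is on the PATH and accepts /config /EnableEDnsProbes 0, that the DNS Server settings live under SYSTEM\CurrentControlSet\Services\DNS\Parameters, and that the service is named "dns". Only the EnableEDNSProbes value name comes from the troubleshooting above. The function names are made up for the example, and the script must run from an elevated prompt on the DNS server itself.

```python
import subprocess

# Assumption: dnscmd.exe (part of the DNS Server tools) is on the PATH.
DNSCMD_DISABLE_EDNS = ["dnscmd", "/config", "/EnableEDnsProbes", "0"]

# Assumption: the DNS Server service reads its parameters from this key
# under HKEY_LOCAL_MACHINE.
REG_KEY = r"SYSTEM\CurrentControlSet\Services\DNS\Parameters"
REG_VALUE = "EnableEDNSProbes"


def disable_edns_with_dnscmd() -> None:
    """Command-prompt route: no service restart needed."""
    subprocess.run(DNSCMD_DISABLE_EDNS, check=True)


def disable_edns_in_registry() -> None:
    """Registry route: write the DWORD, then restart the DNS Server service."""
    import winreg  # Windows-only module, imported here so the file loads anywhere

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, REG_KEY, 0,
                        winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, REG_VALUE, 0, winreg.REG_DWORD, 0)

    # Assumption: the DNS Server service is registered under the name "dns".
    subprocess.run(["net", "stop", "dns"], check=True)
    subprocess.run(["net", "start", "dns"], check=True)


if __name__ == "__main__":
    disable_edns_with_dnscmd()
```

To turn EDNS back on later, the same setting would be written with 1 instead of 0.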