IIS7 Search Engine Optimization Toolkit does not like HTML5 Doctype

The IIS7 Search Engine Optimization (SEO) Toolkit is an awesome tool for reviewing your site and flagging problems in the HTML that may keep search engines from indexing your website properly.  Check out Scott Guthrie’s post on the IIS7 Search Engine Optimization Toolkit, or download it from the tool’s home on the IIS website: IIS7 SEO Toolkit.

But I ran into an issue with the tool the other day where it was not returning the proper results.  I ran it against a website I was working on, and it crawled only one page and reported violations that were simply not correct.


The errors “The title is missing.”, “The description is missing.”, and “The <h1> tag is missing.” were incorrect, since I had all of those on the home page of the site I was testing.  It was also curious that the tool was not crawling any other URLs linked from that home page.

And the Content tab even showed all of these elements in the HTML that the tool had retrieved.


Eventually I tracked it down to my use of the HTML5 DOCTYPE declaration:

<!DOCTYPE html>

The HTML5 document type is not official yet, but I have gone ahead and started using it, as many developers recommend.  Apparently the IIS7 SEO Toolkit does not like this doctype yet.  As a temporary workaround I switched to another doctype:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

And the SEO Toolkit worked as expected.  Just a heads-up for any other developers who run into this when using the HTML5 DOCTYPE.

UPDATE (2009-12-07):

Dave Cox added a comment pointing out that the problem is not the HTML5 DOCTYPE per se but the whitespace that comes before it.  The combination of a leading line of whitespace and the HTML5 DOCTYPE causes the IIS7 SEO Toolkit not to read the content correctly.
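
If you want to check your own pages for this, here is a minimal sketch in Python.  It only tests whether the rendered HTML has whitespace in front of the DOCTYPE; it does not reproduce how the SEO Toolkit actually parses a page.  The function name is my own invention for illustration, not part of any tool.

```python
import re
from typing import Optional

# Matches any DOCTYPE declaration at the current position, case-insensitively.
_DOCTYPE = re.compile(r"<!DOCTYPE\b[^>]*>", re.IGNORECASE)


def whitespace_before_doctype(html: str) -> Optional[str]:
    """Return the whitespace that precedes the DOCTYPE, or None.

    None means either the DOCTYPE is the very first thing in the
    document, or the document does not start with a DOCTYPE at all.
    (Hypothetical helper, written for this post.)
    """
    stripped = html.lstrip()
    if not _DOCTYPE.match(stripped):
        return None  # no leading DOCTYPE to worry about
    prefix = html[: len(html) - len(stripped)]
    return prefix or None


# A blank line before the DOCTYPE, as a master page directive on its
# own line would typically leave behind in the rendered output.
page = "\r\n<!DOCTYPE html>\n<html><head><title>Home</title></head></html>"
leading = whitespace_before_doctype(page)
if leading is not None:
    print("Whitespace before DOCTYPE:", repr(leading))
```

To test a live page, you could fetch the response body with urllib.request, decode it, and pass the text to the same function.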



I am using ASP.NET MVC in this project, so the fix in my master page is to put the DOCTYPE declaration on the same line as the Master page directive.


