I first came across Search Engine Safe (SES) URLs a year or two ago when Erik Voldengen and Bert Dawson created the CF_sesConverter Custom Tag for ColdFusion applications (http://www.fusium.com/index.cfm?fuseaction=home.buildmaster&bodyFuseaction=ses.intro).
This tag accepted a modified query string that looked like a path to a static page, parsed the parameters out of it, and added them to the query string scope (the URL scope in ColdFusion). This made it easier for search engines to spider dynamic Web pages and return them in search results.
Here’s an example of a standard ASP.NET URL and its SES equivalent.
Standard URL:
http://www.opgirlslearntoride.com/home.aspx?tab=371A6304
SES URL:
http://www.opgirlslearntoride.com/home.aspx/tab/371A6304
I always thought it would be cool to be able to do this in ASP.NET. My first thought, however, was that the Request.QueryString collection is read-only, so you cannot add parameters to it at runtime.
After looking around the Web a bit, I discovered the RewritePath() method of the HttpContext class. This method rewrites the path of the request before page processing begins, and ASP.NET itself uses it to implement cookieless session state. Because the URL is rewritten before the Page methods and events run, this makes it possible to manipulate the query string at runtime.
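As a rough sketch of what a call to RewritePath() looks like, the snippet below rewrites a single hard-coded SES path from the Application_BeginRequest handler in Global.asax.cs. The class name Global and the hard-coded paths are assumptions made purely for illustration; the code needs a reference to System.Web and only runs inside an ASP.NET application.

```csharp
using System;
using System.Web;

// Illustrative code-behind for Global.asax (the class name is an assumption).
public class Global : HttpApplication
{
    // ASP.NET wires this handler to the BeginRequest event by name.
    protected void Application_BeginRequest(object sender, EventArgs e)
    {
        HttpContext context = HttpContext.Current;

        // Hard-coded mapping for illustration only: serve the SES form of
        // the example URL as if the standard form had been requested.
        if (String.Compare(context.Request.Path,
                "/home.aspx/tab/371A6304", true) == 0)
        {
            context.RewritePath("/home.aspx?tab=371A6304");
        }
    }
}
```

After the rewrite, the page sees the tab parameter in Request.QueryString as though the standard URL had been requested. A real implementation would derive the new path from the request instead of hard-coding it.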
Based on this method, an SES URL can be converted into a standard URL using Regular Expressions and some string splitting:
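// Requires: using System; using System.Text.RegularExpressions;
public static string FromSesUrl(string path)
{
    // Group 1 is the page ("/home.aspx"); group 2 is everything after it
    // ("/tab/371A6304"). The dot before "aspx" is escaped so that it
    // matches only a literal period.
    Match sesMatch = Regex.Match(
        path,
        "([\\/].[^\\/]*\\.aspx)(.*)",
        RegexOptions.IgnoreCase);
    if (!sesMatch.Success)
    {
        return path; // Not an .aspx request; leave it untouched.
    }
    string urlBase = sesMatch.Groups[1].Value;
    string sesUrlParams = sesMatch.Groups[2].Value;
    string urlQueryString = String.Empty;
    if (sesUrlParams.Trim().Length > 0)
    {
        sesUrlParams = sesUrlParams.Replace('\\', '/');
        // Element 0 is the empty string before the leading slash, so
        // names sit at odd indexes and values at the following even ones.
        string[] urlParams = sesUrlParams.Split('/');
        for (int idx = 1; idx < urlParams.Length; idx += 2)
        {
            if (urlParams[idx].Trim().Length > 0)
            {
                // Test the output rather than idx, so that a skipped empty
                // first segment cannot produce a leading "&".
                urlQueryString += (urlQueryString.Length == 0) ? "?" : "&";
                urlQueryString += urlParams[idx];
                urlQueryString += "=";
                urlQueryString += (idx + 1 < urlParams.Length) ?
                    urlParams[idx + 1] : String.Empty;
            }
        }
    }
    // Rebuild the path directly; passing the query string to Regex.Replace
    // as a replacement pattern would misinterpret any "$" it contains.
    return path.Substring(0, sesMatch.Index) + urlBase + urlQueryString;
}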
This method works well and is quite fast. With it, an ASP.NET application can automatically parse SES URLs and rewrite them to standard URLs.
However, this is only one piece of the puzzle. Although the application can now accept SES URLs, it does not make sense to go through it and replace every existing standard URL reference by hand. For this approach to be effective, there needs to be a way to convert existing standard URL references to SES URLs automatically. But how can this be done without rewriting source code?
Again, after looking around the Web, I discovered a pretty cool property of the HttpResponse class: the Filter property. The Stream assigned to this property wraps the HTTP entity body before it is transmitted to the client. Using this property, I created a custom filter that searches through the response body and replaces standard URL references with SES URLs:
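To make the two capture groups of the regular expression concrete, here is a small standalone sketch that only performs the split into page and parameter parts. The class and method names (SesPatternSketch, SplitSesPath) are made up for this illustration, and the pattern escapes the dot before "aspx" so that it matches only a literal period.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SesPatternSketch
{
    // Group 1: a slash, a file name and ".aspx". Group 2: whatever follows.
    private const string Pattern = @"([\/].[^\/]*\.aspx)(.*)";

    // Returns { page, parameters }; both are empty strings when the path
    // contains no .aspx segment.
    public static string[] SplitSesPath(string path)
    {
        Match m = Regex.Match(path, Pattern, RegexOptions.IgnoreCase);
        return new string[] { m.Groups[1].Value, m.Groups[2].Value };
    }
}
```

For "/home.aspx/tab/371A6304" this returns "/home.aspx" and "/tab/371A6304". For "/shop/item.aspx/id/42" the first element is only "/item.aspx", because the pattern captures just the last path segment before the parameters; any leading directories stay outside the match.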
// Requires: using System.Text; using System.Text.RegularExpressions;
public override void Write(byte[] buffer, int offset, int count)
{
    // Decode with the response's own encoding; Encoding.Default is the
    // server's ANSI code page and may corrupt non-ASCII (e.g. UTF-8) output.
    Encoding encoding = this.Context.Response.ContentEncoding;
    string sBuffer = encoding.GetString(buffer, offset, count);
    MatchCollection hrefMatches = Regex.Matches(
        sBuffer,
        SesRegexPattern.HrefPattern,
        RegexOptions.IgnoreCase);
    foreach (Match match in hrefMatches)
    {
        // The href value is taken from the second-to-last group.
        string originalHref = match.Groups[match.Groups.Count - 2].Value;
        string href = originalHref;
        if (Regex.IsMatch(href, SesRegexPattern.AspxPattern))
        {
            href = SesUrlUtil.ToSesUrl(href);
        }
        // Links that match HttpProtocolPattern are left as they were.
        if (!Regex.IsMatch(href, SesRegexPattern.HttpProtocolPattern))
        {
            if (!Regex.IsMatch(href, SesRegexPattern.AbsolutePathPattern))
            {
                // Relative link: prefix it with the current request's path,
                // since the browser would otherwise resolve it against the
                // SES URL.
                href = Regex.Match(
                    this.Context.Request.Path,
                    SesRegexPattern.CurrentPathPattern)
                    .Groups[1].Value + href;
            }
            sBuffer = sBuffer.Replace(
                match.Value,
                match.Value.Replace(originalHref, href));
        }
    }
    byte[] bufferNew = encoding.GetBytes(sBuffer);
    this.BaseStream.Write(bufferNew, 0, bufferNew.Length);
}
This method works like a charm, and is also quite fast.
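The filter relies on SesUrlUtil.ToSesUrl(), which is not shown in this post. As a minimal sketch of what such a conversion could look like (an assumption on my part here, not the code that ships in the module), the following turns each name=value pair of the query string into two path segments. The class name SesUrlSketch is hypothetical.

```csharp
using System;
using System.Text;

public static class SesUrlSketch
{
    // Hypothetical inverse of FromSesUrl:
    // "/home.aspx?tab=371A6304" becomes "/home.aspx/tab/371A6304".
    public static string ToSesUrl(string url)
    {
        int queryStart = url.IndexOf('?');
        if (queryStart < 0)
        {
            return url; // No query string, nothing to convert.
        }

        StringBuilder sesUrl = new StringBuilder(url.Substring(0, queryStart));

        // hrefs in HTML often separate parameters with "&amp;".
        string query = url.Substring(queryStart + 1).Replace("&amp;", "&");

        foreach (string pair in query.Split('&'))
        {
            int separator = pair.IndexOf('=');
            string name = (separator < 0) ? pair : pair.Substring(0, separator);
            string value = (separator < 0) ?
                String.Empty : pair.Substring(separator + 1);

            if (name.Length == 0)
            {
                continue; // Skip empty or nameless pairs such as "&&".
            }
            sesUrl.Append('/').Append(name).Append('/').Append(value);
        }
        return sesUrl.ToString();
    }
}
```

Note that this sketch does not handle values that themselves contain a slash; those would need to be encoded before they could survive the round trip back through FromSesUrl.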
Finally, in order to tie these pieces of functionality together, I created an HttpModule that handled the BeginRequest event of the HttpApplication class. This event handler first rewrites the SES URL to its standard URL and then adds the custom filter to the Response object of the current context.
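The overall wiring can be sketched roughly as follows. This is not the module's actual source: SesModuleSketch and PassThroughFilter are hypothetical names, the filter here simply forwards bytes where the real one would rewrite hrefs, and FromSesUrl is stubbed out as a placeholder for the conversion method shown earlier. The code needs a reference to System.Web and only runs inside an ASP.NET application.

```csharp
using System;
using System.IO;
using System.Web;

// Hypothetical stand-in for the custom filter: it forwards bytes unchanged.
public class PassThroughFilter : Stream
{
    private readonly Stream baseStream;

    public PassThroughFilter(Stream baseStream) { this.baseStream = baseStream; }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { baseStream.Flush(); }
    public override int Read(byte[] buffer, int offset, int count)
    {
        throw new NotSupportedException();
    }
    public override long Seek(long offset, SeekOrigin origin)
    {
        throw new NotSupportedException();
    }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count)
    {
        // The real filter would replace standard URLs with SES URLs here.
        baseStream.Write(buffer, offset, count);
    }
}

public class SesModuleSketch : IHttpModule
{
    public void Init(HttpApplication application)
    {
        application.BeginRequest += new EventHandler(OnBeginRequest);
    }

    public void Dispose() { }

    private void OnBeginRequest(object sender, EventArgs e)
    {
        HttpContext context = ((HttpApplication)sender).Context;

        // Step 1: rewrite an incoming SES URL to its standard form.
        string standardUrl = FromSesUrl(context.Request.Path);
        if (standardUrl != context.Request.Path)
        {
            context.RewritePath(standardUrl);
        }

        // Step 2: wrap the existing filter so the outgoing body can be processed.
        context.Response.Filter = new PassThroughFilter(context.Response.Filter);
    }

    // Placeholder: substitute the FromSesUrl conversion shown earlier.
    private static string FromSesUrl(string path) { return path; }
}
```

Wrapping the existing Response.Filter, rather than replacing it, keeps any filter that was installed earlier in the chain.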
It is also worth noting that although this module rewrites SES URLs, it allows standard URLs to pass through unaltered. Implementing SES for ASP.NET therefore takes just two steps:
1) Add Boardworks.Utilities.SearchEngineSafe.dll to the /bin directory of the ASP.NET application in which you want to enable SES URLs
2) Modify the application’s Web.config file to include the following inside the <system.web> element:
<httpModules>
    <add name="SesHttpModule"
         type="Boardworks.Utilities.SearchEngineSafe.SesHttpModule, Boardworks.Utilities.SearchEngineSafe"/>
</httpModules>
To see this module in action, check out the following site:
http://www.opgirlslearntoride.com
If you have any comments, or better ideas for SES URLs in ASP.NET, please drop me a line! Also, if you would like a copy of this project, please shoot me an email. If there are enough requests, I will add the Visual Studio .NET project to this post.
UPDATE:
There's a new version of this module, which includes a small fix to some errant debug code. NOTE: This link was previously broken, and has been fixed - sorry about that!
http://www.scottvanvliet.com/downloads/ASP_NET_SES_1_0_2.zip
As always, feedback on this would be greatly appreciated!