Sunday, April 04, 2010 7:18 AM
Tanzim Saqib
Improve Performance of char.IsWhiteSpace for ASCII inputs in .NET 3.5
IsNullOrWhiteSpace is a new method introduced on the string class in .NET 4.0. It is very useful in string-based processing, so I attempted to implement it in .NET 3.5 using char.IsWhiteSpace(). I found that char.IsWhiteSpace() carries a significant performance penalty, so I later replaced it with my own version.
On my machine, the following benchmark takes about 20.6074219 seconds with the char.IsWhiteSpace()-based implementation, whereas my replacement takes only 15.8271485 seconds, roughly a quarter less. In many scenarios, such as string parsers, a gain of this size makes a huge difference. I noticed it in string parsing while building a CMS framework recently.
static void Main(string[] args)
{
    var test1 = " ";      // whitespace only
    var test2 = " test "; // contains non-whitespace characters

    var start = DateTime.Now;
    for (var i = 0; i < 99999999; ++i)
    {
        // IsNullOrWhiteSpace here is the extension method under test
        var result1 = test1.IsNullOrWhiteSpace();
        var result2 = test2.IsNullOrWhiteSpace();
    }
    var diff = DateTime.Now.Subtract(start);

    Console.WriteLine(diff); // elapsed time
    Console.ReadLine();
}
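The benchmark calls an IsNullOrWhiteSpace extension method that is not listed here. A minimal sketch of what such a char.IsWhiteSpace()-based version might look like on .NET 3.5 follows. The class name BaselineExtensions is hypothetical and the body is an assumption, not necessarily the exact code that was measured.

```csharp
using System;

// Hypothetical class name; the measured .NET 3.5 implementation is not shown in the post.
public static class BaselineExtensions
{
    // Assumed baseline: returns true for null, empty, or whitespace-only
    // strings, delegating the per-character test to char.IsWhiteSpace.
    public static bool IsNullOrWhiteSpace(this string value)
    {
        if (value == null)
            return true;

        for (var i = 0; i < value.Length; ++i)
        {
            if (!char.IsWhiteSpace(value[i]))
                return false; // found a non-whitespace character
        }
        return true;
    }
}
```

With this in scope, " ".IsNullOrWhiteSpace() returns true and " test ".IsNullOrWhiteSpace() returns false, which matches the two inputs used in the benchmark.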
Not all applications require Unicode support. If your application expects only ASCII input, the following performs noticeably better than the built-in char.IsWhiteSpace() in .NET 3.5:
public static class Extensions
{
    public static bool IsNullOrWhiteSpaceASCII(this string value)
    {
        if (value != null)
        {
            var len = value.Length;
            for (var i = 0; i < len; ++i)
            {
                var c = value[i];

                // Inlined check instead of char.IsWhiteSpace: space, the
                // control characters '\t' through '\r', U+00A0 and U+0085.
                if (c == ' ' ||
                    (c >= '\t' && c <= '\r') ||
                    c == '\x00a0' ||
                    c == '\x0085')
                    continue;

                return false; // found a non-whitespace character
            }
        }
        return true; // null, empty, or whitespace only
    }
}
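The trade-off of the inlined check is that Unicode whitespace outside the listed characters is treated as ordinary text. The sketch below restates the same condition as a standalone helper so the difference can be tried in isolation. The class and method names are hypothetical.

```csharp
using System;

// Hypothetical helper class; names are illustrative only.
public static class AsciiWhiteSpaceDemo
{
    // Same character set as the inlined check: space, '\t' through '\r',
    // U+00A0 (no-break space) and U+0085 (next line).
    public static bool IsAsciiWhiteSpace(char c)
    {
        return c == ' '
            || (c >= '\t' && c <= '\r')
            || c == '\x00a0'
            || c == '\x0085';
    }
}
```

For example, AsciiWhiteSpaceDemo.IsAsciiWhiteSpace('\u2003') returns false for EM SPACE, while char.IsWhiteSpace('\u2003') returns true. Both agree on '\t' and ' '.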
Fortunately, .NET 4.0 introduces a built-in method, string.IsNullOrWhiteSpace, that does just that and is faster still: it completes the same benchmark in as little as 8 seconds.
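On .NET 4.0 and later, the built-in static method can be called directly, so no extension method is needed. A minimal sketch of one way it might be used; the wrapper class and method names are hypothetical:

```csharp
using System;

// Hypothetical wrapper; only string.IsNullOrWhiteSpace is part of the framework.
public static class BuiltInDemo
{
    // Returns the trimmed value, or the fallback when the input is
    // null, empty, or whitespace only.
    public static string TrimOrDefault(string value, string fallback)
    {
        return string.IsNullOrWhiteSpace(value) ? fallback : value.Trim();
    }
}
```

Here BuiltInDemo.TrimOrDefault(" test ", "n/a") returns "test", while a null or whitespace-only input returns the fallback.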