Performance: Different methods for testing string input for numeric values...

So there was a blog entry about the VB .NET IsNumeric function today.  The question was about a C# equivalent.  I have two things to say really.  First, if you really want the IsNumeric function from VB you can always grab it out of Microsoft.VisualBasic.dll.  Everything from VB .NET is in that library and you can use any of it that you want.  I made explicit use of their array redimensioning code a while back until I found out that I could write faster code in C#.  I had just assumed the VB team wrote performant methods to back all of their functions, but in some cases the performance is lacking.

So how many different ways can you test for a number?  Well, I've picked 6 methods.  4 are built on what the CLR provides.  The other 2 are hand-made algorithms meant for high speed ASCII only processing.  So here are the 6 methods and their functions:

  • IntParse - This method uses int.Parse and traps the exception.  This is the slowest method of all 6 because int.Parse throws on invalid input and we know that exceptions are costly.  In this case they are so costly that the method is far slower than the other routines: roughly 40x slower than the Regex and about 700x slower than the hand-coded routines in the timings posted in the comments.
  • HandCodeSwitch - This is a method that I created using my own performance knowledge.  For instance, I made use of a switch statement to mask out digits since I know this is extremely fast when working over character data.  I know that behind the scenes the switch statement causes a single subtraction and then either a jump into the digits portion of the case statements or a jump directly to the default clause, which exits the method.
  • HandCodeIf - Now, since the switch is doing a subtraction anyway, it might be faster to do an if statement.  I've ordered the if to check > '9' first, followed by < '0', since in most cases we'll see characters above 0x39 rather than characters below 0x30.  This performs a bit faster than the switch statement because we know the ordering of things.  However, the switch statement will be much faster if we add more characters that aren't sequentially ordered (perhaps the decimal point, or if we use a culture comparer to build things, any number of other *numeric* characters).
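  • IsNumber - This uses the CLR char.IsNumber method.  This checks for any Unicode numeric character, not just the digits 0 through 9, so characters such as fractions, superscripts, and Roman numerals also pass.  Have to be careful with this.
  • IsDigit - This is more like what the rest of the tests do.  This uses the CLR char.IsDigit and only lets decimal digit characters through (including non-ASCII decimal digits).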
  • RegexDigit - This uses a regular expression.  If anyone wants to come up with something faster than what I've created then so be it.  I'll insert it into the test.  As it stands, in the timings posted in the comments, the Regex is roughly 4x to 5x slower than the CLR char methods and about 17x slower than my hand-coded methods.
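
The difference between the character tests is easiest to see by running a few sample characters through each one.  The following is a small sketch and not part of the benchmark; the class name CharClassSketch, the helper IsAsciiDigit, and the sample characters are all picked here for illustration.

```csharp
using System;

public class CharClassSketch {
    // ASCII-only range check, the same comparison the hand-coded routines rely on.
    public static bool IsAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void Main() {
        // '7'      ASCII digit
        // '\u0663' ARABIC-INDIC DIGIT THREE (Unicode decimal digit)
        // '\u00BD' VULGAR FRACTION ONE HALF (numeric, but not a decimal digit)
        // '\u2167' ROMAN NUMERAL EIGHT (letter number)
        // 'A'      plain letter
        char[] samples = { '7', '\u0663', '\u00BD', '\u2167', 'A' };
        foreach (char c in samples) {
            Console.WriteLine("U+{0:X4} IsNumber={1} IsDigit={2} IsAsciiDigit={3}",
                (int)c, char.IsNumber(c), char.IsDigit(c), IsAsciiDigit(c));
        }
    }
}
```

Only the ASCII digit passes all three checks.  The Arabic-Indic digit passes IsNumber and IsDigit but not the ASCII check, and the fraction and Roman numeral pass IsNumber only, which is why IsNumber needs care.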

Okay, so the tests aren't 100% fair since the CLR versions handle Unicode, culture data, and all sorts of other things, but I'm trying to show you how to write the fastest integer routine in the West (as in US!).  The HandCodeIf and HandCodeSwitch win hands down.  If you need to create a method for a culture comparer that takes into account things like decimal points, group separators, or any number of other issues, I would highly recommend using the switch model, and dynamically generating the method using what I've written as a template.  If there is any interest I'll show you how you can write such a model in another entry.  It would be just as fast as the current HandCode'ed methods, only it would take into account the additional characters when determining if something was a number or not.  Below is the code.  Enjoy!

using System;
using System.Text.RegularExpressions;

public class IsNumeric {
    private static void Main(string[] args) {
        int iterations = 10000000;
        bool makeSureTheJitDoesNotOptimizeMeOut = false;
        string[] testStrings = new string[] { "1234M", "12345" };
       
        Console.WriteLine("IsNumber String 1: {0}", IsNumber(testStrings[0]));
        Console.WriteLine("IsNumber String 2: {0}", IsNumber(testStrings[1]));
        Console.WriteLine("IsDigit String 1: {0}", IsDigit(testStrings[0]));
        Console.WriteLine("IsDigit String 2: {0}", IsDigit(testStrings[1]));
        Console.WriteLine("HandCodeSwitch String 1: {0}", HandCodeSwitch(testStrings[0]));
        Console.WriteLine("HandCodeSwitch String 2: {0}", HandCodeSwitch(testStrings[1]));
        Console.WriteLine("HandCodeIf String 1: {0}", HandCodeIf(testStrings[0]));
        Console.WriteLine("HandCodeIf String 2: {0}", HandCodeIf(testStrings[1]));
        Console.WriteLine("RegexDigit String 1: {0}", RegexDigit(testStrings[0]));
        Console.WriteLine("RegexDigit String 2: {0}", RegexDigit(testStrings[1]));
        Console.WriteLine("IntParse String 1: {0}", IntParse(testStrings[0]));
        Console.WriteLine("IntParse String 2: {0}", IntParse(testStrings[1]));
       
        DateTime start, end;
        start = DateTime.Now;
        for(int i = 0; i < iterations; i++) {
            makeSureTheJitDoesNotOptimizeMeOut = IsNumber(testStrings[i%2]);
        }
        end = DateTime.Now;
        Console.WriteLine("IsNumber {0}", end - start);

        start = DateTime.Now;
        for(int i = 0; i < iterations; i++) {
            makeSureTheJitDoesNotOptimizeMeOut = IsDigit(testStrings[i%2]);
        }
        end = DateTime.Now;
        Console.WriteLine("IsDigit {0}", end - start);

        start = DateTime.Now;
        for(int i = 0; i < iterations; i++) {
            makeSureTheJitDoesNotOptimizeMeOut = HandCodeIf(testStrings[i%2]);
        }
        end = DateTime.Now;
        Console.WriteLine("HandCodeIf {0}", end - start);

        start = DateTime.Now;
        for(int i = 0; i < iterations; i++) {
            makeSureTheJitDoesNotOptimizeMeOut = HandCodeSwitch(testStrings[i%2]);
        }
        end = DateTime.Now;
        Console.WriteLine("HandCodeSwitch {0}", end - start);

        start = DateTime.Now;
        for(int i = 0; i < iterations; i++) {
            makeSureTheJitDoesNotOptimizeMeOut = RegexDigit(testStrings[i%2]);
        }
        end = DateTime.Now;
        Console.WriteLine("RegexDigit {0}", end - start);

        start = DateTime.Now;
        for(int i = 0; i < iterations; i++) {
            makeSureTheJitDoesNotOptimizeMeOut = IntParse(testStrings[i%2]);
        }
        end = DateTime.Now;
        Console.WriteLine("IntParse {0}", end - start);
        Console.WriteLine(makeSureTheJitDoesNotOptimizeMeOut);
    }
   
    public static bool IntParse(string test) {
        try {
            int.Parse(test);
            return true;
        } catch { return false; }
    }
   
    public static bool HandCodeSwitch(string test) {
        for(int i = 0; i < test.Length; i++) {
            switch(test[i]) {
                case '0':
                case '1':
                case '2':
                case '3':
                case '4':
                case '5':
                case '6':
                case '7':
                case '8':
                case '9':
                    continue;
                default:
                    return false;
            }
        }
       
        return true;
    }

    public static bool HandCodeIf(string test) {
        for(int i = 0; i < test.Length; i++) {
            if ( test[i] > '9' || test[i] < '0' ) {
                return false;
            }
        }
       
        return true;
    }
   
    public static bool IsNumber(string test) {
        for(int i = 0; i < test.Length; i++) {
            if ( !char.IsNumber(test[i]) ) {
                return false;
            }
        }
       
        return true;
    }
   
    public static bool IsDigit(string test) {
        for(int i = 0; i < test.Length; i++) {
            if ( !char.IsDigit(test[i]) ) {
                return false;
            }
        }
       
        return true;
    }
   
    private static Regex matchString = new Regex("^[0-9]+$", RegexOptions.Compiled);
    public static bool RegexDigit(string test) {
        return matchString.IsMatch(test);
    }
}
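
As a rough idea of what the switch model could look like once extra characters are taken into account, here is a minimal sketch.  It hard-codes '.' as the decimal point and ',' as the group separator, which is an assumption that only suits en-US style input; a real version would take those characters from the culture or be generated dynamically as described above.  The names SwitchDecimalSketch and IsDecimalText are hypothetical, and the sketch does not validate group sizes.

```csharp
using System;

public class SwitchDecimalSketch {
    // Digits, at most one '.', and ',' only before the '.' and after a digit.
    // Unlike HandCodeSwitch, this returns false when no digit was seen at all.
    public static bool IsDecimalText(string test) {
        bool seenDigit = false;
        bool seenPoint = false;
        for (int i = 0; i < test.Length; i++) {
            switch (test[i]) {
                case '0':
                case '1':
                case '2':
                case '3':
                case '4':
                case '5':
                case '6':
                case '7':
                case '8':
                case '9':
                    seenDigit = true;
                    continue;
                case ',':
                    // Assumed group separator: reject it after the decimal
                    // point or before any digit. Group sizes are not checked.
                    if (seenPoint || !seenDigit) {
                        return false;
                    }
                    continue;
                case '.':
                    // Assumed decimal point: only one is allowed.
                    if (seenPoint) {
                        return false;
                    }
                    seenPoint = true;
                    continue;
                default:
                    return false;
            }
        }
        return seenDigit;
    }

    public static void Main() {
        string[] samples = { "12345", "1,234.50", "1.2.3", "1234M" };
        foreach (string s in samples) {
            Console.WriteLine("{0}: {1}", s, IsDecimalText(s));
        }
    }
}
```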

Published Monday, March 29, 2004 2:16 AM by Justin Rogers

Comments

Monday, March 29, 2004 5:18 AM by Justin Rogers

# re: Performance: Different methods for testing string input for numeric values...

The sample program had a single naming issue. I fixed this, but only after there was a reported view on the source sample. If that someone has tried to run the code, sorry for the mix-up, the new code in the entry should run as planned.
Monday, March 29, 2004 5:38 AM by David Levine

# re: Performance: Different methods for testing string input for numeric values...

I haven't done any performance testing on this, but you should add double.TryParse() to your test suite - it avoids the problem that int.Parse() has of throwing an exception if it fails.
Monday, March 29, 2004 5:48 AM by Justin Rogers

# re: Performance: Different methods for testing string input for numeric values...

Adding performance output for everything except the double.TryParse()

IsNumber String 1: False
IsNumber String 2: True
IsDigit String 1: False
IsDigit String 2: True
HandCodeSwitch String 1: False
HandCodeSwitch String 2: True
HandCodeIf String 1: False
HandCodeIf String 2: True
RegexDigit String 1: False
RegexDigit String 2: True
IntParse String 1: False
IntParse String 2: True
IsNumber 00:00:03.2146224
IsDigit 00:00:02.4635424
HandCodeIf 00:00:00.7410656
HandCodeSwitch 00:00:00.7811232
RegexDigit 00:00:13.3391808
IntParse 00:08:32.7372800
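
The numbers above were taken with DateTime.Now, whose resolution is typically fairly coarse.  On .NET 2.0 and later, System.Diagnostics.Stopwatch is an alternative.  The following is only a sketch of how one of the loops might be timed that way: the class StopwatchSketch and the helper Measure are made-up names, and passing the routine as a Func<string, bool> (which needs .NET 3.5 or later) adds delegate call overhead that the original direct calls do not have, so the results would not be directly comparable with the figures above.

```csharp
using System;
using System.Diagnostics;

public class StopwatchSketch {
    // Copy of the HandCodeIf routine from the entry, so the sketch stands alone.
    public static bool HandCodeIf(string test) {
        for (int i = 0; i < test.Length; i++) {
            if (test[i] > '9' || test[i] < '0') {
                return false;
            }
        }
        return true;
    }

    // Runs check over the inputs in rotation and returns the elapsed time.
    // The last result is handed back so the calls are not dead code.
    public static TimeSpan Measure(Func<string, bool> check, string[] inputs,
                                   int iterations, out bool last) {
        last = check(inputs[0]); // warm-up call so JIT compilation is not timed
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) {
            last = check(inputs[i % inputs.Length]);
        }
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main() {
        string[] testStrings = new string[] { "1234M", "12345" };
        bool last;
        TimeSpan elapsed = Measure(HandCodeIf, testStrings, 10000000, out last);
        Console.WriteLine("HandCodeIf {0} (last result: {1})", elapsed, last);
    }
}
```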
Monday, March 29, 2004 5:50 AM by Justin Rogers

# re: Performance: Different methods for testing string input for numeric values...

Adding code and timing for TryParse on System.Double.

// Needs "using System.Globalization;" at the top of the file for NumberStyles.
double result;
start = DateTime.Now;
for(int i = 0; i < iterations; i++) {
    makeSureTheJitDoesNotOptimizeMeOut = double.TryParse(testStrings[i%2], NumberStyles.Number, null, out result);
}
end = DateTime.Now;
Console.WriteLine("DoubleTryParse {0}", end - start);


DoubleTryParse 00:00:13.2490512
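
One thing to keep in mind is that double.TryParse with NumberStyles.Number is a looser test than the digit-only routines, since it also accepts white space, a sign, group separators, and a decimal point.  For a closer match, a sketch using int.TryParse (available from .NET 2.0) with NumberStyles.None follows.  The class and method names are hypothetical, and CultureInfo.InvariantCulture is used here instead of the null provider above so the result does not depend on the machine's culture.  Unlike the hand-coded loops, int.TryParse also rejects the empty string and values that overflow an int.

```csharp
using System;
using System.Globalization;

public class TryParseSketch {
    // NumberStyles.None allows nothing but decimal digits.
    public static bool IsInt32Digits(string test) {
        int ignored;
        return int.TryParse(test, NumberStyles.None,
                            CultureInfo.InvariantCulture, out ignored);
    }

    // Same style of call as the DoubleTryParse timing, with a fixed culture.
    public static bool IsLooseNumber(string test) {
        double ignored;
        return double.TryParse(test, NumberStyles.Number,
                               CultureInfo.InvariantCulture, out ignored);
    }

    public static void Main() {
        string[] samples = { "12345", "1234M", "-12345", " 1,234.5 ", "99999999999" };
        foreach (string s in samples) {
            Console.WriteLine("\"{0}\" IsInt32Digits={1} IsLooseNumber={2}",
                s, IsInt32Digits(s), IsLooseNumber(s));
        }
    }
}
```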
Monday, March 29, 2004 7:10 AM by Justin Rogers

# re: Performance: Different methods for testing string input for numeric values...

Adding code and timing for Microsoft.VisualBasic.Information.IsNumeric.

// Needs "using Microsoft.VisualBasic;" and a reference to Microsoft.VisualBasic.dll.
start = DateTime.Now;
for(int i = 0; i < iterations; i++) {
    makeSureTheJitDoesNotOptimizeMeOut = Information.IsNumeric(testStrings[i%2]);
}
end = DateTime.Now;
Console.WriteLine("Information.IsNumeric {0}", end - start);

Information.IsNumeric 00:00:17.1746960
Friday, April 16, 2004 3:32 AM by Tommy

# re: Performance: Different methods for testing string input for numeric values...

for(int i=text.Length-1; i>=0; i--) is faster than for(int i=0; i<text.Length; i++) because it has to access the Length property only once. And in this case, the direction isn't really important.
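
For reference, the two loop shapes that read the Length property only once might look like the sketch below.  The class and method names are made up for illustration.  Whether either form is actually faster than the original loop depends on the JIT, which may already handle the repeated Length read cheaply, so this would need to be measured rather than assumed.

```csharp
using System;

public class LoopOrderSketch {
    // Walks the string from the end, so Length is read once in the initializer.
    public static bool HandCodeIfReverse(string test) {
        for (int i = test.Length - 1; i >= 0; i--) {
            if (test[i] > '9' || test[i] < '0') {
                return false;
            }
        }
        return true;
    }

    // Walks forward but copies Length into a local before the loop.
    public static bool HandCodeIfCachedLength(string test) {
        int length = test.Length;
        for (int i = 0; i < length; i++) {
            if (test[i] > '9' || test[i] < '0') {
                return false;
            }
        }
        return true;
    }

    public static void Main() {
        string[] testStrings = new string[] { "1234M", "12345" };
        foreach (string s in testStrings) {
            Console.WriteLine("{0}: reverse={1} cached={2}",
                s, HandCodeIfReverse(s), HandCodeIfCachedLength(s));
        }
    }
}
```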
Wednesday, May 23, 2007 2:36 PM by Brian

# re: Performance: Different methods for testing string input for numeric values...

I would be curious to know what the performance would look like if all input was valid. (i.e. int.Parse was valid at all times)

Saturday, October 13, 2007 4:24 AM by Rene

# re: Performance: Different methods for testing string input for numeric values...

What if the number is negative? (i.e. "-12345")

Is HandCodeIf still the fastest then, if you add a check for the first char (i.e. test[0] == '-')?
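
A sign check along those lines could be sketched as follows.  This is only an illustration with hypothetical names (SignedSketch, HandCodeIfSigned) and it has not been timed against the other routines.  It accepts a single leading '-' and, unlike the original HandCodeIf, returns false when there are no digits at all.

```csharp
using System;

public class SignedSketch {
    public static bool HandCodeIfSigned(string test) {
        int start = 0;
        // Skip one optional leading minus sign.
        if (test.Length > 0 && test[0] == '-') {
            start = 1;
        }
        // Reject "" and "-", which contain no digits.
        if (start == test.Length) {
            return false;
        }
        for (int i = start; i < test.Length; i++) {
            if (test[i] > '9' || test[i] < '0') {
                return false;
            }
        }
        return true;
    }

    public static void Main() {
        string[] samples = { "-12345", "12345", "12-3", "-", "1234M" };
        foreach (string s in samples) {
            Console.WriteLine("\"{0}\": {1}", s, HandCodeIfSigned(s));
        }
    }
}
```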
