BradVin's .Net Blog

Code, snippets, controls, utils, etc. Basically all things .net

ÜberUtils - Part 3 : Strings

ÜberUtils Series posts so far :

So every developer has (or should have) a utilities class for strings. It seems the built-in string class never has enough (well, for me in any case). So I hereby introduce my string utils class. It actually comprises 3 files:

  1. Strings.cs (the actual string utils)
  2. SafeConvert.cs (a class for doing common conversions)
  3. Extensions/Strings.cs (extension methods using the string utils)

Here is the class diagram of the Strings class :

 

As you can see, it has a nested class Regex which is also static. More on this later. Let's cover the string utility methods first (in 'logical' order):

  • IsEmpty - returns true if the object passed in is either null or has a length of zero (exactly like string.IsNullOrEmpty but can take an object as input)
  • IsNumeric - returns true if we are dealing with a numeric value. Uses the regular expression : @"^\-?\(?([0-9]{0,3}(\,?[0-9]{3})*(\.?[0-9]*))\)?$". This matches a positive or negative value with any precision and scale (whole number or decimal). It also allows for left-padded zeros, commas as group separators, or parentheses to indicate a negative number
  • IsEmail - returns true if an email. Uses the regular expression : @"([\w-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)"
  • Trim - exactly like "abc".Trim() but adds checking for nulls
  • CutWhitespace - collapses each run of whitespace inside a string to a single space, as well as trims it
    • eg. Strings.CutWhitespace(" 12  34   5 6  7   ") == "12 34 5 6 7"
  • CutEnd - chops the last n chars off the end of a string
    • eg. Strings.CutEnd("1234567890", 3) == "1234567"
  • CutStart - chops the first n chars off the beginning of a string
    • eg. Strings.CutStart("1234567890", 3) == "4567890"
  • Start - returns the first n chars of a string
    • eg. Strings.Start("1234567890", 3) == "123"
  • End - returns the last n chars of a string
    • eg. Strings.End("1234567890", 3) == "890"
  • GetOccurences - returns an array of strings that are found within another string based on a regular expression
    • eg. Strings.GetOccurences("say day bay toy", "[sdbt]ay") == new string[] {"say" , "day" , "bay"}
    • eg. Strings.GetOccurences("123asdasd 1sk 555 sdkfjsdfkl999", "\\d+") == new string [] {"123" , "1" , "555" , "999"}
  • OccurenceCount - returns the count of strings found within another string based on a regular expression
    • eg. Strings.OccurenceCount("the cat sat on the mat", "at") == 3
    • eg. Strings.OccurenceCount("abcabc", "a") == 2
  • Combine - combines a string array using a delimiter (or none) (DEPRECATED - read update and comments)
    • eg. Strings.Combine(Strings.GetOccurences("123asdasd 1sk 555 sdkfjsdfkl999", "\\d+"), ",") == "123,1,555,999"
    • eg. Strings.Combine(new string[] { "a", "b", "c", "d" }, ";") == "a;b;c;d"
  • ToPaddedNumber - returns a zero-padded number (DEPRECATED - read update and comments)
    • eg. Strings.ToPaddedNumber("123", 5) == "00123"
  • XOR - performs a binary XOR operation on each char in the input string based on a key. Very simple form of encryption where XOR(XOR(input)) == input
    • eg. Strings.XOR(Strings.XOR("test", "key"), "key") == "test"
  • ToTitleCase - returns the title case of a string
    • eg. Strings.ToTitleCase("this is a title") == "This Is A Title"
  • ToFriendlyName - returns what I call a "friendly" version of a string. I use this mainly for converting a database field name into a user friendly name
    • eg. Strings.ToFriendlyName("IAmNotFriendly") == "I Am Not Friendly"
    • eg. Strings.ToFriendlyName("SomePrimaryKeyId") == "Some Primary Key"
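
To make the behaviour in the list above concrete, here is a minimal sketch of how a few of these helpers might be written. It is not the downloadable source: the class name StringsSketch is hypothetical, and the null handling (returning an empty string) and the way XOR cycles through the key are assumptions, chosen only so that the examples above hold.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the Strings class described above.
public static class StringsSketch
{
    // Collapse every run of whitespace to a single space, then trim the ends.
    public static string CutWhitespace(string input)
    {
        if (input == null) return string.Empty; // assumption: null becomes ""
        return Regex.Replace(input, @"\s+", " ").Trim();
    }

    // First n chars (the whole string when it is shorter than n).
    public static string Start(string input, int n)
    {
        if (string.IsNullOrEmpty(input) || n <= 0) return string.Empty;
        return n >= input.Length ? input : input.Substring(0, n);
    }

    // Last n chars (the whole string when it is shorter than n).
    public static string End(string input, int n)
    {
        if (string.IsNullOrEmpty(input) || n <= 0) return string.Empty;
        return n >= input.Length ? input : input.Substring(input.Length - n);
    }

    // Everything except the first n chars.
    public static string CutStart(string input, int n)
    {
        if (string.IsNullOrEmpty(input)) return string.Empty;
        if (n <= 0) return input;
        return n >= input.Length ? string.Empty : input.Substring(n);
    }

    // Everything except the last n chars.
    public static string CutEnd(string input, int n)
    {
        if (string.IsNullOrEmpty(input)) return string.Empty;
        if (n <= 0) return input;
        return n >= input.Length ? string.Empty : input.Substring(0, input.Length - n);
    }

    // All substrings of input that match the regular expression, in order.
    public static string[] GetOccurences(string input, string pattern)
    {
        return Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();
    }

    // Number of non-overlapping matches of the regular expression.
    public static int OccurenceCount(string input, string pattern)
    {
        return Regex.Matches(input, pattern).Count;
    }

    // XOR each char with the key char at the same position, cycling through
    // the key (assumption). Applying it twice with the same key restores the input.
    public static string XOR(string input, string key)
    {
        if (string.IsNullOrEmpty(input) || string.IsNullOrEmpty(key)) return input;
        var sb = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            sb.Append((char)(input[i] ^ key[i % key.Length]));
        }
        return sb.ToString();
    }
}
```

With these definitions, StringsSketch.CutEnd("1234567890", 3) gives "1234567" and StringsSketch.XOR(StringsSketch.XOR("test", "key"), "key") gives back "test", matching the examples in the list.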

Now onto the Regex class. The static Regex class just wraps regular expression functionality and contains a few commonly used expressions as constants. Here is the rundown :

  • IsExactMatch - returns true if a string is an exact match for a pattern
    • eg. Strings.Regex.IsExactMatch("test@google.com", Strings.Regex.REGEX_EMAIL) == true
  • Contains - returns true if a string contains a pattern
    • eg. Strings.Regex.Contains("here is my email : test@google.com", Strings.Regex.REGEX_EMAIL) == true
  • Replace - returns a string with a pattern replaced by another string
    • eg. Strings.Regex.Replace("1 23 a 456", @"\d+", "!") == "! ! a !"
  • GetMatch - returns the first match of pattern within a string
    • eg. Strings.Regex.GetMatch("Subject: Test Subject\r\n", @"Subject\s*\:\s*(?<SubjectReturn>.*)\r\n", "SubjectReturn") == "Test Subject"
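
The four wrappers above could be sketched roughly as follows. This is only an illustration built on System.Text.RegularExpressions: the class name RegexSketch is hypothetical, and wrapping the pattern in \A(?:...)\z to force a whole-string match is an assumption about how IsExactMatch might work, not something the post confirms.

```csharp
using System.Text.RegularExpressions;

// Hypothetical stand-in for the nested Strings.Regex class.
public static class RegexSketch
{
    // True only when the pattern matches the entire input. The non-capturing
    // group keeps any alternation inside the pattern from escaping the anchors.
    public static bool IsExactMatch(string input, string pattern)
    {
        return Regex.IsMatch(input, @"\A(?:" + pattern + @")\z");
    }

    // True when the pattern matches anywhere inside the input.
    public static bool Contains(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern);
    }

    // Replace every match of the pattern with the replacement string.
    public static string Replace(string input, string pattern, string replacement)
    {
        return Regex.Replace(input, pattern, replacement);
    }

    // Value of the named group in the first match, or "" when nothing matches
    // (the empty-string fallback is an assumption).
    public static string GetMatch(string input, string pattern, string groupName)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[groupName].Value : string.Empty;
    }
}
```

For instance, RegexSketch.Replace("1 23 a 456", @"\d+", "!") gives "! ! a !", and the GetMatch call from the list returns "Test Subject".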

Now onto the SafeConvert class. It contains the following methods :

  • ToBoolean - returns a boolean value from an object
  • ToInt - returns an integer value from an object
  • ToDecimal - returns a decimal value from an object
  • ToDouble - returns a double value from an object
  • ToHexString - returns a hexadecimal string representation of a byte array. This is used from Extensions\ByteArray.cs
  • ToStream - returns a System.IO.MemoryStream from a string
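
As a rough idea of what "safe" conversion means here, the sketch below never throws on bad input and returns a default value instead. Everything in it is an assumption made for illustration: the class name SafeConvertSketch, the defaults (0 and false), the invariant culture, lowercase hex output and UTF-8 encoding are not taken from the actual SafeConvert.cs.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

// Hypothetical stand-in for the SafeConvert class.
public static class SafeConvertSketch
{
    // Integer value of an object, or 0 when it is null or not a valid integer.
    public static int ToInt(object value)
    {
        if (value == null) return 0;
        int result;
        string text = Convert.ToString(value, CultureInfo.InvariantCulture);
        return int.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out result)
            ? result : 0;
    }

    // Decimal value of an object, or 0 when it cannot be parsed.
    public static decimal ToDecimal(object value)
    {
        if (value == null) return 0m;
        decimal result;
        string text = Convert.ToString(value, CultureInfo.InvariantCulture);
        return decimal.TryParse(text, NumberStyles.Number, CultureInfo.InvariantCulture, out result)
            ? result : 0m;
    }

    // Boolean value of an object; anything other than "true"/"false" gives false.
    public static bool ToBoolean(object value)
    {
        if (value == null) return false;
        bool result;
        return bool.TryParse(value.ToString(), out result) && result;
    }

    // Two lowercase hex digits per byte, e.g. { 0, 255, 16 } gives "00ff10".
    public static string ToHexString(byte[] bytes)
    {
        if (bytes == null) return string.Empty;
        var sb = new StringBuilder(bytes.Length * 2);
        foreach (byte b in bytes)
        {
            sb.Append(b.ToString("x2"));
        }
        return sb.ToString();
    }

    // MemoryStream over the UTF-8 bytes of the string, positioned at the start.
    public static MemoryStream ToStream(string value)
    {
        return new MemoryStream(Encoding.UTF8.GetBytes(value ?? string.Empty));
    }
}
```

The point of the design is that callers can write SafeConvertSketch.ToInt(someObject) without a try/catch; the trade-off is that a bad value and a genuine 0 become indistinguishable.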

So that's version 1 of the strings utilities. I say version 1 because I will no doubt add to this over the next couple of posts.

Oh yes, and again we have a whole bunch of new extension methods :

  • Start
  • End
  • CutStart
  • CutEnd
  • OccurenceCount
  • GetOccurences
  • ReplaceAll - similar to Replace, but uses a regular expression to do the replacement
  • Split - similar to Split(char c) but takes a string pattern to split using regular expressions
  • Combine (DEPRECATED - read update and comments)
  • Join - an extension method for string arrays wrapping the string.Join method
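
For readers new to extension methods, here is a small sketch of how a few entries in the list above could be exposed on string itself. The class name StringExtensionsSketch is hypothetical. The regex-based split is named SplitByPattern here rather than Split, because newer .NET versions have a built-in string.Split(string) instance overload, and instance methods always win over extension methods with the same signature.

```csharp
using System.Text.RegularExpressions;

// Hypothetical extension class; the real one lives in Extensions/Strings.cs.
public static class StringExtensionsSketch
{
    // Like string.Replace, but the first argument is a regular expression.
    public static string ReplaceAll(this string input, string pattern, string replacement)
    {
        return Regex.Replace(input, pattern, replacement);
    }

    // Split on every match of a regular expression.
    public static string[] SplitByPattern(this string input, string pattern)
    {
        return Regex.Split(input, pattern);
    }

    // First n chars (the whole string when it is shorter than n).
    public static string Start(this string input, int n)
    {
        if (string.IsNullOrEmpty(input) || n <= 0) return string.Empty;
        return n >= input.Length ? input : input.Substring(0, n);
    }

    // Number of non-overlapping matches of a regular expression.
    public static int OccurenceCount(this string input, string pattern)
    {
        return Regex.Matches(input, pattern).Count;
    }
}
```

Once the namespace containing the class is imported, the calls read as if they were members of string: "1 23 a 456".ReplaceAll(@"\d+", "!") or "1234567890".Start(3).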

Now I know some people might argue that this is extension method abuse, but look at how much more power my strings have :

 

... and anything that helps me code quicker and smarter is not abuse in my book - it's smart coding!

Download the source code and unit tests here

UPDATE - thanks to Dan's comments we found a bug in the email regular expression: it would not allow the top-level domain ".museum", so I changed the regex to
@"([\w-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,8}|[0-9]{1,8})(\]?)" (the changes are the two upper bounds in the final group, now {2,8} and {1,8})
Please note that email validation seems to be a touchy point for many developers, as can be seen over at haaked.com. I would suggest not using ANY email validation like this to restrict comments or purchases online, as you would be limiting your site's reach. Source code and unit tests have been updated.
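
The effect of the change can be checked with a few lines of code. In this sketch the two constants are the patterns quoted in this post, while the class name EmailPatternSketch, the helper IsWholeMatch, and the sample address curator@example.museum are hypothetical. The \A(?:...)\z anchoring is an assumption about how an exact match would be enforced.

```csharp
using System.Text.RegularExpressions;

public static class EmailPatternSketch
{
    // Pattern before the update: letters-only TLDs are limited to 2-4 chars.
    public const string Before =
        @"([\w-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)";

    // Pattern after the update: the upper bounds are raised to 8.
    public const string After =
        @"([\w-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,8}|[0-9]{1,8})(\]?)";

    // True only when the pattern matches the whole input.
    public static bool IsWholeMatch(string input, string pattern)
    {
        return Regex.IsMatch(input, @"\A(?:" + pattern + @")\z");
    }
}
```

IsWholeMatch("curator@example.museum", Before) is false because "museum" has six letters, while the same call with After is true. Note that an unanchored Regex.IsMatch accepts the address under either pattern, since "curator@example.muse" is itself a match, so the bug only shows up when the whole string has to match.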

UPDATE - thanks to Scott Hanselman for pointing out that ToPaddedNumber is redundant, as the string class has a PadLeft (as well as a PadRight) method - DOH! Source code and unit tests have been updated.
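
For reference, the built-in replacement looks like this. The wrapper class PaddingSketch and its ZeroPad method are hypothetical and exist only to give the call a home; string.PadLeft itself is part of the framework.

```csharp
public static class PaddingSketch
{
    // Left-pad with zeros up to totalWidth; longer strings are returned unchanged.
    public static string ZeroPad(string number, int totalWidth)
    {
        return (number ?? string.Empty).PadLeft(totalWidth, '0');
    }
}
```

So "123".PadLeft(5, '0') gives "00123", the same result as the old ToPaddedNumber example, and "123".PadRight(5, '0') gives "12300".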

UPDATE - thanks to Don and John for pointing out that my Combine method is redundant, as the string.Join method does the exact same thing - oops ;)
I then renamed my extension method Combine to Join and changed it to wrap the string.Join functionality. Again, source and tests have been updated.
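
A Join extension that wraps string.Join might look like the sketch below. The class name StringArrayExtensionsSketch and the treatment of a null array as empty are assumptions; only string.Join itself comes from the framework.

```csharp
public static class StringArrayExtensionsSketch
{
    // Extension over string[] that forwards to the built-in string.Join.
    // Note the argument order: string.Join takes the separator first.
    public static string Join(this string[] values, string separator)
    {
        return string.Join(separator, values ?? new string[0]);
    }
}
```

With this in scope, new string[] { "a", "b", "c", "d" }.Join(";") gives "a;b;c;d", the same result as the old Combine example.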

NOTE - I renamed the static extension classes so that you could include both the Utils and Utils.Extensions namespaces without getting the build error : 'Strings' is an ambiguous reference between 'Utils.Strings' and 'Utils.Extensions.Strings'. Please get the latest source.


Thanks for all the comments and feedback, and please keep it coming. Collaboration and a LOT of testing is the only way to produce robust, useful code!

Published Monday, October 22, 2007 8:52 PM by bradvin

Comments

# re: ÜberUtils - Part 3 : Strings@ Tuesday, October 23, 2007 4:05 AM

Nice Utils!

# re: ÜberUtils - Part 3 : Strings@ Tuesday, October 23, 2007 7:18 AM

Very nice utils indeed. And with .NET 3.5 these will be even more great, because then you won't need your own class anymore if you rewrite it correctly :)

by JV

# re: ÜberUtils - Part 3 : Strings@ Tuesday, October 23, 2007 9:31 AM

Nice work. Thanks for sharing!

by Avery

# re: ÜberUtils - Part 3 : Strings@ Tuesday, October 30, 2007 1:12 AM

Why is ToPaddedNumber there, rather than just using string.PadLeft? Just curious.

# re: ÜberUtils - Part 3 : Strings@ Tuesday, October 30, 2007 2:54 AM

you're absolutely right Scott - why is ToPaddedNumber there? I have taken it out and updated the article and code. Thanks for the feedback

by bradvin

# re: ÜberUtils - Part 3 : Strings@ Wednesday, October 31, 2007 10:29 AM

Just a quick note. Your String.Combine() is very similar to String.Join().

by Don McNamara

# re: ÜberUtils - Part 3 : Strings@ Wednesday, October 31, 2007 12:26 PM

What's the difference between your String.Combine and .NET's String.Join?  Except for the order of the parameters, they appear to be the same.

And I agree 100% about this being smart coding not abuse.  Thanks!

# re: ÜberUtils - Part 3 : Strings@ Wednesday, October 31, 2007 1:25 PM

Thanks John and Don - there is no need for the Combine method, as Join does the exact thing. I thought it was there somewhere ;)

by bradvin

# re: ÜberUtils - Part 3 : Strings@ Wednesday, October 31, 2007 4:23 PM

Great post - one very useful method that I have in my string helper class is a "Join" method that allows you to pass in an IEnumerable rather than only string[].  Here is the implementation -

// Needs: using System.Collections; using System.Text;

// Joins the items of any IEnumerable into one string.
public static string Join(string seperator, IEnumerable enumerable) {
    StringBuilder builder = new StringBuilder();
    Join(builder, seperator, enumerable);
    return builder.ToString();
}

// Appends the items to an existing StringBuilder, with the separator between items.
public static void Join(StringBuilder builder, string seperator, IEnumerable enumerable) {
    if (enumerable == null) return;
    bool isFirst = true;
    foreach (object s in enumerable) {
        if (!isFirst) {
            builder.Append(seperator);
        }
        isFirst = false;
        builder.Append(s);
    }
}

# re: ÜberUtils - Part 3 : Strings@ Sunday, October 26, 2008 9:13 PM

Thanks for the great share. It would have been better if you had included an instruction manual for those of us who are new.

I am looking forward to help in this regard.

by Baraka