Escaping/Unescaping XML Data

XML is everywhere, and when you create XML files you need to escape certain characters that will not parse correctly otherwise.  Until recently I handled this the way most other .NET programmers do: I wrote a function to do it.

    public static string EscapeXML(this string s)
    {
        if (string.IsNullOrEmpty(s)) return s;

        string returnString = s;
        // Escape '&' first so the entities added below are not escaped again.
        returnString = returnString.Replace("&", "&amp;");
        returnString = returnString.Replace("'", "&apos;");
        returnString = returnString.Replace("\"", "&quot;");
        returnString = returnString.Replace(">", "&gt;");
        returnString = returnString.Replace("<", "&lt;");

        return returnString;
    }
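One detail that is easy to get wrong in a hand-written escaper is the order of the Replace() calls.  The sketch below compares two orderings; the class and method names are made up for illustration and are not part of any framework.  If "&" is replaced last, the "&" inside entities that were just inserted gets escaped a second time.

```csharp
using System;

public static class EscapeOrderDemo
{
    // Hypothetical helper: replaces '&' before anything else.
    public static string EscapeAmpersandFirst(string s)
    {
        return s.Replace("&", "&amp;")
                .Replace("<", "&lt;")
                .Replace(">", "&gt;")
                .Replace("\"", "&quot;")
                .Replace("'", "&apos;");
    }

    // Hypothetical helper: replaces '&' last, which re-escapes
    // the '&' of every entity produced by the earlier calls.
    public static string EscapeAmpersandLast(string s)
    {
        return s.Replace("<", "&lt;")
                .Replace(">", "&gt;")
                .Replace("\"", "&quot;")
                .Replace("'", "&apos;")
                .Replace("&", "&amp;");
    }

    public static void Main()
    {
        Console.WriteLine(EscapeAmpersandFirst("a < b & c")); // a &lt; b &amp; c
        Console.WriteLine(EscapeAmpersandLast("a < b & c"));  // a &amp;lt; b &amp; c
    }
}
```

The second line of output is double-escaped, so a parser would read it back as the literal text "a &lt; b & c" instead of "a < b & c".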

Recently I came across a function hidden within the .NET Framework that does this in one line of code.  It is located in the System.Security namespace: the SecurityElement class has a static method called Escape() that takes a string as a parameter.  It replaces the same five characters as the function above, so that function simplifies to the code below.

    public static string EscapeXML(this string s)
    {
        if (string.IsNullOrEmpty(s)) return s;

        return SecurityElement.Escape(s);
    }

So now the only question is how to make this an even better solution.  The answer is extension methods.  You can create a static class called StringExtensions that contains the code below.  I personally have a PCRExtensions project where I keep all my extension methods, so I can use them in any project that needs them.

    using System.Security;

    namespace YourExtensions
    {
        public static class StringExtensions
        {
            public static string EscapeXML(this string s)
            {
                if (string.IsNullOrEmpty(s)) return s;

                // Call Escape() unconditionally.  Guarding it with
                // !SecurityElement.IsValidText(s) is not safe: that check only
                // looks for '<' and '>', so "Ben & Jerry's" would skip escaping.
                return SecurityElement.Escape(s);
            }

            public static string UnescapeXML(this string s)
            {
                if (string.IsNullOrEmpty(s)) return s;

                string returnString = s;
                returnString = returnString.Replace("&apos;", "'");
                returnString = returnString.Replace("&quot;", "\"");
                returnString = returnString.Replace("&gt;", ">");
                returnString = returnString.Replace("&lt;", "<");
                // Unescape "&amp;" last so "&amp;lt;" becomes "&lt;", not "<".
                returnString = returnString.Replace("&amp;", "&");

                return returnString;
            }
        }
    }
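It can be tempting to skip the Escape() call whenever SecurityElement.IsValidText(s) returns true.  The small sketch below suggests why that is risky: as far as I can tell from the framework source, IsValidText() only rejects '<' and '>', so a string containing '&' or a quote is reported as valid even though Escape() would still change it.  The class name IsValidTextDemo is just for illustration.

```csharp
using System;
using System.Security;

public static class IsValidTextDemo
{
    public static void Main()
    {
        string text = "Ben & Jerry's";

        // No '<' or '>' in the string, so it is reported as valid text...
        Console.WriteLine(SecurityElement.IsValidText(text));    // True

        // ...even though Escape() still has to rewrite '&' and the apostrophe.
        Console.WriteLine(SecurityElement.Escape(text));         // Ben &amp; Jerry&apos;s

        // A string containing '<' is reported as invalid.
        Console.WriteLine(SecurityElement.IsValidText("a < b")); // False
    }
}
```

Calling Escape() every time is the simpler choice here; it returns the input unchanged when there is nothing to escape.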

Now that you have the above extension methods written for string objects, you can call them just like any other built-in method.  The nice thing is that they even show up in IntelliSense, so you can pick the method from the list as if it were part of the framework.  This is what makes extension methods so powerful.

    string xmlUnescapedText = "Ben & Jerry's";
    xmlUnescapedText.EscapeXML();  // returns "Ben &amp; Jerry&apos;s"

    string xmlEscapedText = "Ben &amp; Jerry&apos;s";
    xmlEscapedText.UnescapeXML();  // returns "Ben & Jerry's"
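The order of the replacements matters when unescaping as well.  Here is a rough round-trip sketch, assuming the input text itself contains something that looks like an entity; the two helper methods are hypothetical and exist only to compare the orderings.  Unescaping "&amp;" last restores the original text, while unescaping it first decodes one level too many.

```csharp
using System;
using System.Security;

public static class UnescapeOrderDemo
{
    // Hypothetical helper: same replacements as UnescapeXML, "&amp;" last.
    public static string UnescapeAmpersandLast(string s)
    {
        return s.Replace("&apos;", "'")
                .Replace("&quot;", "\"")
                .Replace("&gt;", ">")
                .Replace("&lt;", "<")
                .Replace("&amp;", "&");
    }

    // Hypothetical helper: "&amp;" first, shown only for comparison.
    public static string UnescapeAmpersandFirst(string s)
    {
        return s.Replace("&amp;", "&")
                .Replace("&apos;", "'")
                .Replace("&quot;", "\"")
                .Replace("&gt;", ">")
                .Replace("&lt;", "<");
    }

    public static void Main()
    {
        string original = "Use &lt; for less-than";
        string escaped = SecurityElement.Escape(original);

        Console.WriteLine(escaped);                         // Use &amp;lt; for less-than
        Console.WriteLine(UnescapeAmpersandLast(escaped));  // Use &lt; for less-than
        Console.WriteLine(UnescapeAmpersandFirst(escaped)); // Use < for less-than
    }
}
```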

 

Published Sunday, August 16, 2009 11:08 PM by smehaffie

Comments

# re: Escaping/Unescaping XML Data

Monday, August 17, 2009 4:16 PM by Marco D

That was exactly what I was looking for. Very nice tool, Shawn.

# re: Escaping/Unescaping XML Data

Wednesday, July 21, 2010 8:15 AM by nnnn

The blog is great. Thank you. But this condition does not work well:

!SecurityElement.IsValidText(s)

If I have a string that contains "&", this function reports it as valid, but it is not valid for XML.

# re: Escaping/Unescaping XML Data

Thursday, July 22, 2010 1:57 AM by smehaffie

nnnn - Not sure what you mean. In the code I posted, "Ben & Jerry's" was treated as invalid and returned with the & escaped.  For that to happen, SecurityElement.IsValidText("Ben & Jerry's") would have had to return false.  In what version of .NET did you notice this behavior?

I will double-check the code to see if something has changed, and if so I will update the post.  Thanks for the heads up.

# re: Escaping/Unescaping XML Data

Friday, December 03, 2010 6:50 PM by Jon

     if (string.IsNullOrEmpty(s) || !s.Contains('&')) return s;

would improve the unescape
