Escaping/Unescaping XML Data

XML is everywhere, and when you create XML files you need to escape certain characters that will not parse correctly otherwise.  Until recently I did this the way most other .NET programmers do: I wrote a function for it.

    public static string EscapeXML(this string s)
    {
        if (string.IsNullOrEmpty(s)) return s;

        string returnString = s;
        // Escape '&' first so the entities produced below are not escaped again.
        returnString = returnString.Replace("&", "&amp;");
        returnString = returnString.Replace("'", "&apos;");
        returnString = returnString.Replace("\"", "&quot;");
        returnString = returnString.Replace(">", "&gt;");
        returnString = returnString.Replace("<", "&lt;");

        return returnString;
    }
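
The order of the Replace calls matters in a hand-rolled escaper like this one: the ampersand has to be handled before the other characters, otherwise the `&` inside a freshly written entity gets escaped a second time. The sketch below shows the difference with two hypothetical helpers (`AmpersandFirst` and `AmpersandLast` are names made up for this illustration, not part of any library).

```csharp
using System;

public static class EscapeOrderDemo
{
    // Hypothetical helper: ampersand replaced first, the safe order.
    public static string AmpersandFirst(string s)
    {
        return s.Replace("&", "&amp;").Replace("<", "&lt;");
    }

    // Hypothetical helper: ampersand replaced last, which re-escapes
    // the '&' that the earlier replacement just produced.
    public static string AmpersandLast(string s)
    {
        return s.Replace("<", "&lt;").Replace("&", "&amp;");
    }

    public static void Main()
    {
        Console.WriteLine(AmpersandFirst("a < b"));  // a &lt; b
        Console.WriteLine(AmpersandLast("a < b"));   // a &amp;lt; b
    }
}
```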

Recently I came across a function hidden within the .NET Framework that does this in one line of code.  It is located in the System.Security namespace: the SecurityElement class has a static method called Escape() that takes a string as a parameter.  This basically does what the above function does, so the function above simplifies to the code below.

    public static string EscapeXML(this string s)
    {
        if (string.IsNullOrEmpty(s)) return s;

        return SecurityElement.Escape(s);
    }
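
To see what Escape() does with each reserved character, here is a small probe program (the class name `SecurityElementProbe` is just a placeholder for this illustration). It also calls SecurityElement.IsValidText(), which can look like a natural guard to put in front of Escape(); as far as I can tell it only objects to angle brackets, so it is not a reliable "does this need escaping?" test.

```csharp
using System;
using System.Security;

public static class SecurityElementProbe
{
    public static void Main()
    {
        // Escape rewrites all five reserved characters.
        Console.WriteLine(SecurityElement.Escape("< > \" ' &"));
        // &lt; &gt; &quot; &apos; &amp;

        // IsValidText only rejects '<' and '>', so an ampersand passes.
        Console.WriteLine(SecurityElement.IsValidText("Ben & Jerry's")); // True
        Console.WriteLine(SecurityElement.IsValidText("a < b"));         // False
    }
}
```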

So now the only question is how to make this an even better solution.  The answer is extension methods.  You can create a class called StringExtensions that contains the code below.  I personally have a PCRExtensions project where I keep all my extension methods, so that I can use them in any project that needs them.

    using System.Security;

    namespace YourExtensions
    {
        public static class StringExtensions
        {
            public static string EscapeXML(this string s)
            {
                if (string.IsNullOrEmpty(s)) return s;

                // Escape() leaves text with nothing to escape unchanged.
                // IsValidText() only rejects '<' and '>', so guarding on it would
                // skip strings that contain '&', quotes, or apostrophes.
                return SecurityElement.Escape(s);
            }

            public static string UnescapeXML(this string s)
            {
                if (string.IsNullOrEmpty(s)) return s;

                string returnString = s;
                returnString = returnString.Replace("&apos;", "'");
                returnString = returnString.Replace("&quot;", "\"");
                returnString = returnString.Replace("&gt;", ">");
                returnString = returnString.Replace("&lt;", "<");
                // Unescape '&amp;' last so "&amp;lt;" becomes "&lt;", not "<".
                returnString = returnString.Replace("&amp;", "&");

                return returnString;
            }
        }
    }

Now that you have the above extension methods written for string objects, you can call them just like any other built-in method.  The nice thing is that they even show up in IntelliSense, so you can pick them from the list as if they were part of the framework.  This is what makes extension methods so powerful.

    string xmlUnescapedText = "Ben & Jerry's";
    xmlUnescapedText.EscapeXML();  // returns "Ben &amp; Jerry&apos;s"

    string xmlEscapedText = "Ben &amp; Jerry&apos;s";
    xmlEscapedText.UnescapeXML();  // returns "Ben & Jerry's"
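
One limitation of a Replace-based UnescapeXML is that it only knows the five named entities, so numeric character references such as `&#38;` pass through untouched. A possible alternative, sketched here under the assumption that System.Xml.Linq is available, is to wrap the text in a throwaway element and let the XML parser resolve the entities. `XmlTextDecoder` and `UnescapeWithParser` are names invented for this sketch. Note that this approach throws an XmlException on input that is not well-formed (for example a bare `&`), and the parser normalizes line endings.

```csharp
using System;
using System.Xml.Linq;

public static class XmlTextDecoder
{
    // Hypothetical alternative to UnescapeXML: wrap the escaped text in a
    // throwaway element and let the XML parser resolve the entities.
    public static string UnescapeWithParser(string s)
    {
        if (string.IsNullOrEmpty(s)) return s;

        // PreserveWhitespace keeps whitespace-only input from being dropped.
        return XElement.Parse("<x>" + s + "</x>", LoadOptions.PreserveWhitespace).Value;
    }

    public static void Main()
    {
        Console.WriteLine(UnescapeWithParser("Ben &amp; Jerry&apos;s")); // Ben & Jerry's
        Console.WriteLine(UnescapeWithParser("Ben &#38; Jerry&#x27;s")); // Ben & Jerry's
    }
}
```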



  • That was exactly what I was looking for. Very nice tool, Shawn.

  • The blog is great. Thank you. But this condition is not right.
    If I have a string that contains "&", this function treats it as valid, but it is not valid for XML.

  • nnn - Not sure what you mean. In the code I posted, it saw "Ben & Jerry's" as invalid and returned it with the & escaped. For it to do that, SecurityElement.IsValidText("Ben & Jerry's") would have had to return false. In what version of .NET did you notice this behavior?

    I will double-check the code to see if something has changed, and if so I will update the post. Thanks for the heads up.

  • if (string.IsNullOrEmpty(s) || !s.Contains('&')) return s;
    would improve the unescape
