Suresh Behera

The Microsoft .Net Junkies


Character filtering or validation for XML documents

I was having a hard time filtering form input from an ASP.NET page. The page accepts text as input and sends it to the database in XML format.
Everything is fine as long as the text does not include any of the following characters. It goes for a toss when you use these characters:

Character Name            Entity Reference   Character Reference   Numeric Reference
Ampersand                 &amp;              &                     &#38;
Left angle bracket        &lt;               <                     &#60;
Right angle bracket       &gt;               >                     &#62;
Straight quotation mark   &quot;             "                     &#34;
Apostrophe                &apos;             '                     &#39;

Microsoft has one support article, but it does not give sufficient help for this problem:
How to locate and replace special characters in an XML file with Visual C# .NET
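One option that avoids hand-written replacements is to let the framework's XML writer escape the text while it builds the document. The following is only a minimal sketch under that assumption; the class name XmlWriterDemo and the method WrapInElement are made up for illustration and are not from the support article.

```csharp
using System.Text;
using System.Xml;

public static class XmlWriterDemo
{
    // Hypothetical helper: writes <elementName>text</elementName> and lets
    // XmlWriter escape any markup characters found in the text.
    public static string WrapInElement(string elementName, string text)
    {
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        var sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            // WriteElementString escapes & and < in the text content.
            writer.WriteElementString(elementName, text);
        }
        // The writer is disposed (and flushed) before the buffer is read.
        return sb.ToString();
    }
}
```

For example, XmlWriterDemo.WrapInElement("comment", "a < b & c") returns <comment>a &lt; b &amp; c</comment>.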

I tried a few different ways to work around this, and you could follow any of these:
1. Simply filter the text using the C# or VB.NET Replace method, like this:

// & (replace this one first, otherwise the & in the entities added below gets escaped a second time)

newString = newString.Replace("&","&amp;");

// <

newString = newString.Replace("<","&lt;");

// >

newString = newString.Replace(">","&gt;");

// Double quote "
// The VB.NET way does not work here: C# has no class called ControlChars,
// and Chr() is a VB function (the quote is Chr(34); Chr(32) is a space).
// Replace also accepts either two characters or two strings, not one character and one string.
// In C#, escape the quote inside a string literal instead:

newString = newString.Replace("\"","&quot;");

// Single quote '

newString = newString.Replace("'","&apos;");
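The chained Replace calls can also be folded into a single pass over the text, which removes any dependence on the order in which the characters are replaced. This is only a sketch; the class name XmlEscaper is hypothetical and not part of the framework.

```csharp
using System.Text;

public static class XmlEscaper
{
    // Hypothetical helper: escapes the five XML special characters in one pass.
    // Each input character is examined exactly once, so an ampersand that
    // belongs to an entity we just appended is never escaped again.
    public static string Escape(string text)
    {
        if (text == null) return null;

        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            switch (c)
            {
                case '&':  sb.Append("&amp;");  break;
                case '<':  sb.Append("&lt;");   break;
                case '>':  sb.Append("&gt;");   break;
                case '"':  sb.Append("&quot;"); break;
                case '\'': sb.Append("&apos;"); break;
                default:   sb.Append(c);        break;
            }
        }
        return sb.ToString();
    }
}
```

For example, XmlEscaper.Escape("a < b & \"c\"") returns a &lt; b &amp; &quot;c&quot;. The framework also ships System.Security.SecurityElement.Escape, which replaces the same five characters and may be enough if no custom handling is needed.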

2. You can use a regular expression to filter all your special characters. The input might look something like this: @"a1/}{]yryr23dsdhds%$#yytr^&uut887611oiuif():><?jfhgg";

3. You can write a stored procedure to replace all these special characters.

A regular expression was a good choice, but building the expression was eating up a lot of time. I am not that good at regular expressions, and I did not want to spend much time on such a simple problem. If somebody could write the expression, it would be great and helpful for others.
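For the regular-expression route, one possible shape is a character class that matches the five special characters, combined with a match evaluator that picks the entity for each match. This is a sketch rather than a tested production expression, and the class name XmlRegexEscaper is hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class XmlRegexEscaper
{
    // Matches any one of the five XML special characters: & < > " '
    private static readonly Regex Special = new Regex("[&<>\"']");

    // Hypothetical helper: replaces each matched character with its entity.
    public static string Escape(string text)
    {
        return Special.Replace(text, m =>
        {
            switch (m.Value)
            {
                case "&":  return "&amp;";
                case "<":  return "&lt;";
                case ">":  return "&gt;";
                case "\"": return "&quot;";
                case "'":  return "&apos;";
                default:   return m.Value; // not reachable with this pattern
            }
        });
    }
}
```

For example, XmlRegexEscaper.Escape("<a href='x'>T&C</a>") returns &lt;a href=&apos;x&apos;&gt;T&amp;C&lt;/a&gt;. Because the regex scans the original text once, the replacement entities are never rescanned.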

How To Locate and Replace Special Characters in an XML Document with Visual Basic

 

Some extra reading
XML Syntax and Parsing Concepts
Manipulating Strings in C#

 

Happy Coding

 

Suresh Behera

Posted: Feb 24 2006, 04:24 PM by Suresh Behera | with 7 comment(s)

Comments

Stuart Ballard said:

How about just System.Web.HttpUtility.HtmlEncode(str)?
# February 24, 2006 4:40 PM

Hema said:

Hi Suresh,
Did you happen to get through this problem? How did you fix it? I am in the same boat. If possible, please share your thoughts on this.

Cheers,
Hema
# March 9, 2006 8:11 AM

annamalai said:

nice

# August 25, 2006 7:03 AM

Aru said:

Are there any other ways to replace the special characters, other than the five mentioned above?

# October 5, 2006 1:25 AM

Aru said:

Please tell me how to replace other characters like #, $, %, ^, *, @, ! and so on.

# October 5, 2006 1:26 AM

jon@gobesthome.com said:

Thanks for saving me from a night-out. I had to write a regular expression that allows a defined set of characters and was not able to find out how to put a " in the regular expression. Most characters can be placed directly inside a validation expression, but not a ". For the benefit of others, I am putting my validation expression here as well:

ValidationExpression="^[0-9a-zA-Z.,!@#$_%/&*:+=><;\{\}\[\]()\s\-&quot;]

# October 20, 2007 8:36 PM

Rob Gonzalez said:

What is the stored procedure solution?

Does anyone have one?

# November 28, 2007 10:08 AM