Character filtering or validation for XML documents.
I was having a hard time filtering form input from an ASP.NET
page. The page accepts text as input and sends it to the database
in XML format.
Everything is fine as long as the text does not
include any of the following characters. It goes for a toss
when you use these characters:
| Character Name | Entity Reference | Character Reference | Numeric Reference |
| Ampersand | &amp; | & | &#38; |
| Left angle bracket | &lt; | < | &#60; |
| Right angle bracket | &gt; | > | &#62; |
| Straight quotation mark | &quot; | " | &#34; |
| Apostrophe | &apos; | ' | &#39; |
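To see why these characters matter, here is a minimal sketch that checks whether a piece of XML parses. The class name `XmlBreakageDemo`, the method name `IsWellFormed`, and the `comment` element used below are made up for illustration; only `XmlDocument.LoadXml` and `XmlException` come from the .NET framework.

```csharp
using System;
using System.Xml;

public static class XmlBreakageDemo
{
    // Returns true when the given text parses as well-formed XML.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            // LoadXml throws XmlException on malformed input,
            // for example a bare & or < inside element text.
            new XmlDocument().LoadXml(xml);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

With this helper, `IsWellFormed("<comment>Tom & Jerry</comment>")` returns false because of the bare ampersand, while `IsWellFormed("<comment>Tom &amp; Jerry</comment>")` returns true.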
Microsoft has one support article, but it does not give
sufficient help for this problem:
How to locate and replace special characters in an XML
file with Visual C# .NET
I tried different ways to work around this, and any of the
following could be used:
1. Simply filter the text using the C# or VB.NET Replace method,
like this:
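// & (replace this first, otherwise the & inside the entities produced below gets escaped again)
newString = newString.Replace("&", "&amp;");
// >
newString = newString.Replace(">", "&gt;");
// <
newString = newString.Replace("<", "&lt;");
// Double quote "
// My first attempt did not work:
// newString = newString.Replace(ControlChars.Quote, "&quot;");
// ControlChars belongs to VB.NET (Microsoft.VisualBasic); C# has no such class built in.
// My second attempt did not work either:
// newString = newString.Replace(CHR(32), "&quot;");
// C# has no CHR function (and 32 is the code for a space; the double quote is 34). Also, the Replace
// method accepts either two characters or two strings, not one character and one string.
// An escaped string literal works:
newString = newString.Replace("\"", "&quot;");
// Single quote '
newString = newString.Replace("'", "&apos;");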
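The Replace approach can be wrapped into one reusable method. This is a minimal sketch, not a complete validator: the class name `XmlText` and the method name `Escape` are hypothetical, and it assumes the input has not been escaped already. The order of the calls matters, since the ampersand has to be handled before the other characters.

```csharp
using System;

public static class XmlText
{
    // Hypothetical helper: replaces the five XML special characters
    // with their entity references.
    public static string Escape(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        return input
            .Replace("&", "&amp;")    // first, so the & in later entities is left alone
            .Replace("<", "&lt;")
            .Replace(">", "&gt;")
            .Replace("\"", "&quot;")  // string overload, so an escaped literal is needed
            .Replace("'", "&apos;");
    }
}
```

If the ampersand were replaced last instead, a `<` in the input would first become `&lt;` and then `&amp;lt;`, which is not what the database should receive.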
2. You can use a regular expression to filter all your special characters. The input text might be something like this: @"a1/}{]yryr23dsdhds%$#yytr^&uut887611oiuif():><?jfhgg";
3. You can write a stored procedure to replace all these special characters.
A regular expression was a good choice, but building the expression was eating up a lot of time. I am not that good at regular expressions, and I did not want to spend much time on such a simple problem. If somebody could write the expression, it would be great and helpful for others.
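One possible expression is a character class that matches the five special characters, combined with a match evaluator that picks the entity for each match. This is only a sketch under that assumption: the class name `XmlRegexEscaper` and the method name `Escape` are hypothetical, while `Regex.Replace` with a `MatchEvaluator` is the standard .NET call.

```csharp
using System;
using System.Text.RegularExpressions;

public static class XmlRegexEscaper
{
    // Matches any one of the five XML special characters: & < > " '
    private static readonly Regex Special = new Regex("[&<>\"']");

    // Replaces each matched character with its entity reference in a single pass.
    public static string Escape(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        return Special.Replace(input, match =>
        {
            switch (match.Value)
            {
                case "&": return "&amp;";
                case "<": return "&lt;";
                case ">": return "&gt;";
                case "\"": return "&quot;";
                default: return "&apos;"; // the only remaining match is '
            }
        });
    }
}
```

Because the text is scanned once, the ordering problem of chained Replace calls does not come up. For the sample text in option 2, the result is `a1/}{]yryr23dsdhds%$#yytr^&amp;uut887611oiuif():&gt;&lt;?jfhgg`.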
How To Locate and Replace Special Characters in an XML
Document with Visual Basic
Some extra reading
XML Syntax and Parsing Concepts
Manipulating Strings in C#
Happy Coding
Suresh Behera