Stripping HTML Tags

This is my first post on this blog. I hope you find it useful.
A common problem for web developers is stripping HTML tags,
for example for search operations or when sending plain-text newsletter emails.
In this post I do that with regular expressions.

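using System.Text.RegularExpressions;

private string StripHtmlTags(string html)
{
    // Remove anything that looks like a tag; (.|\n) lets a tag span several lines
    string plainText = Regex.Replace(html, @"<(.|\n)*?>", string.Empty);

    // Turn tabs into spaces so they are collapsed below
    plainText = plainText.Replace("\t", " ");

    // Drop Windows line breaks (text on adjacent lines is joined with no space)
    plainText = plainText.Replace("\r\n", string.Empty);

    // Collapse runs of two or more spaces into one
    plainText = Regex.Replace(plainText, "  +", " ");

    return plainText;
}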

The function is pretty simple but very useful in most situations.
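
To try the function outside a page class, here is a minimal sketch that wraps the same four steps in a small console program. The class names HtmlText and StripDemo and the sample input are only assumptions for illustration; they are not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlText
{
    // Same steps as the function above, made static so it compiles on its own.
    public static string StripHtmlTags(string html)
    {
        // Remove tags, including tags that span several lines
        string plainText = Regex.Replace(html, @"<(.|\n)*?>", string.Empty);
        // Tabs become spaces, Windows line breaks are dropped
        plainText = plainText.Replace("\t", " ");
        plainText = plainText.Replace("\r\n", string.Empty);
        // Collapse runs of two or more spaces into one
        return Regex.Replace(plainText, "  +", " ");
    }
}

public static class StripDemo
{
    public static void Main()
    {
        // Hypothetical sample input: two paragraphs separated by a Windows line break
        string html = "<p>Hello,\t<b>world</b>!</p>\r\n<p>Second   line</p>";

        // prints: Hello, world!Second line
        Console.WriteLine(HtmlText.StripHtmlTags(html));
    }
}
```

Note how the two paragraphs end up glued together, because the line break is replaced with an empty string. If that is not what you want, replacing "\r\n" with a single space instead is one possible tweak.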

Have Fun!
Published Sunday, June 29, 2008 12:43 PM by mlife

Comments

# .Net Convertir Html a Plain Text usando expresiones regulares

Pingback from  .Net Convertir Html a Plain Text usando expresiones regulares

# re: Convert HTML To Plain Text By RegularExpressions

Friday, April 03, 2009 9:25 AM by Chano Zamora

good post and useful!

# re: Convert HTML To Plain Text By RegularExpressions

Wednesday, September 16, 2009 2:23 AM by sreekar

Thank You

# re: Convert HTML To Plain Text By RegularExpressions

Wednesday, September 23, 2009 4:18 AM by miteshsura

hello,

So simple yet so useful. I was going to write my own method for this, so it saved me some time.

# re: Convert HTML To Plain Text By RegularExpressions

Thursday, November 19, 2009 4:45 AM by tont

I wouldn't call this 'converting to text'. A more suitable term would be 'stripping html tags'.

# re: Convert HTML To Plain Text By RegularExpressions

Wednesday, December 02, 2009 12:58 PM by bonnie

Good as far as it goes.  However, it does not strip out encoded characters in the HTML such as "&nbsp;" or "&quot;"
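
One possible way to handle the encoded characters mentioned in this comment is to decode them after the tags are removed. The sketch below is not from the original post: it assumes System.Net.WebUtility.HtmlDecode is available (.NET Framework 4.0 or later), and the names HtmlEntities, StripTagsAndDecode and EntityDemo are hypothetical.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlEntities
{
    // Hypothetical helper (not from the post): strip tags first, then decode entities.
    public static string StripTagsAndDecode(string html)
    {
        // Same tag-stripping regex as in the post
        string withoutTags = Regex.Replace(html, @"<(.|\n)*?>", string.Empty);

        // HtmlDecode turns &quot; into " and &nbsp; into the character U+00A0
        string decoded = WebUtility.HtmlDecode(withoutTags);

        // Assumption: a non-breaking space should be an ordinary space in plain text
        return decoded.Replace('\u00A0', ' ');
    }
}

public static class EntityDemo
{
    public static void Main()
    {
        string html = "<p>Tom&nbsp;&amp;&nbsp;Jerry say &quot;hi&quot;</p>";

        // prints: Tom & Jerry say "hi"
        Console.WriteLine(HtmlEntities.StripTagsAndDecode(html));
    }
}
```

Decoding is done after stripping on purpose: text such as "&lt;b&gt;" is meant to be shown as literal angle brackets, and decoding it first would let the regex remove it as if it were a tag.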
