Jeff and .NET

The .NET musings of Jeff Putz

The CPU-eating RegEx

The following Regex takes upwards of five seconds to get through a couple of paragraphs of text:

text = Regex.Replace(text, @"(.*)((?<!(\A|<blockquote>|</blockquote>|</p>))(<blockquote>|<p>))", "$1</p>$2", RegexOptions.IgnoreCase | RegexOptions.Compiled);

I'm at a loss to explain why, because I've never had an expression choke like that. I theorize that it might have something to do with the "or" alternations in the groups, but I don't know for sure. Or perhaps something is causing it to rescan the entire text over and over; I'm just not sure what.
Posted: Jan 22 2005, 12:40 PM by Jeff | with 1 comment(s)

Comments

Jeff said:

Never mind... with some guidance from a friend, I realized that the first capturing group was unnecessary and was causing the greedy engine to keep looking at the entire text over and over. The following still passes all of my tests and takes about 0.005 seconds:

text = Regex.Replace(text, @"((?<!(\A|<blockquote>|</blockquote>|</p>))(<blockquote>|<p>))", "</p>$1", RegexOptions.IgnoreCase | RegexOptions.Compiled);
# January 22, 2005 1:21 PM
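
For anyone who wants to reproduce the difference, here is a minimal timing sketch rather than a definitive benchmark. The two patterns are copied from the post and the comment above; the class name, the helper names, and the sample text are hypothetical. The likely mechanism is that the leading (.*) consumes the rest of the line and then backtracks one character at a time looking for a tag, and when no tag follows, the engine retries from the next starting position, so the work grows roughly with the square of the line length. Actual timings depend on the runtime and the input, so none are quoted here.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class RegexTimingSketch
{
    // Pattern from the original post: note the leading greedy (.*) group.
    public const string SlowPattern =
        @"(.*)((?<!(\A|<blockquote>|</blockquote>|</p>))(<blockquote>|<p>))";

    // Pattern from the follow-up comment: same lookbehind, no leading group.
    public const string FastPattern =
        @"((?<!(\A|<blockquote>|</blockquote>|</p>))(<blockquote>|<p>))";

    private const RegexOptions Options =
        RegexOptions.IgnoreCase | RegexOptions.Compiled;

    public static string ReplaceSlow(string text)
    {
        return Regex.Replace(text, SlowPattern, "$1</p>$2", Options);
    }

    public static string ReplaceFast(string text)
    {
        return Regex.Replace(text, FastPattern, "</p>$1", Options);
    }

    // Hypothetical sample: one long paragraph on a single line, opened by <p>
    // and followed by a second <p> that should receive a closing tag.
    public static string BuildSample(int words)
    {
        var sb = new StringBuilder("<p>");
        for (int i = 0; i < words; i++)
        {
            sb.Append("word ");
        }
        sb.Append("<p>second paragraph with no more tags ");
        for (int i = 0; i < words; i++)
        {
            sb.Append("word ");
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string sample = BuildSample(1000);

        var sw = Stopwatch.StartNew();
        string slow = ReplaceSlow(sample);
        sw.Stop();
        Console.WriteLine("With (.*):    {0} ms", sw.ElapsedMilliseconds);

        sw.Restart();
        string fast = ReplaceFast(sample);
        sw.Stop();
        Console.WriteLine("Without (.*): {0} ms", sw.ElapsedMilliseconds);

        // On this sample both versions should produce the same text.
        Console.WriteLine("Same output: {0}", slow == fast);
    }
}
```

The long run of text after the last tag is what matters in the sample: that is where the first pattern has nothing to match and has to give up separately at every starting position.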