The CPU-eating RegEx

The following regex takes upwards of five seconds to get through a couple of paragraphs of text:

text = Regex.Replace(text, @"(.*)((?<!(\A|<blockquote>|</blockquote>|</p>))(<blockquote>|<p>))", "$1</p>$2", RegexOptions.IgnoreCase | RegexOptions.Compiled);

I'm at a loss to explain why, because I've never had an expression choke like that. I theorize that it might have something to do with the "or" alternations in the groups, but I don't know for sure. Or perhaps something is causing it to scan the entire text over and over; I'm just not sure what.

1 Comment

  • Never mind... with some guidance from a friend, I realized that the first capturing group, (.*), was unnecessary and was causing the greedy engine to keep rescanning the text over and over. The following still passes all of my tests and takes about 0.005 seconds:



    text = Regex.Replace(text, @"((?<!(\A|<blockquote>|</blockquote>|</p>))(<blockquote>|<p>))", "</p>$1", RegexOptions.IgnoreCase | RegexOptions.Compiled);
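
For anyone who wants to see the difference for themselves, here is a rough sketch that runs both patterns over the same input and times them. The class name `ParagraphCloser`, the method names `CloseSlow` and `CloseFast`, and the sample string are made up for illustration; only the two patterns and replacement strings come from the post. Actual timings will depend on the input and the runtime, so treat the numbers as indicative only.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

// Hypothetical helper class; not part of the original code.
public static class ParagraphCloser
{
    private const RegexOptions Options =
        RegexOptions.IgnoreCase | RegexOptions.Compiled;

    // Original pattern: the leading greedy (.*) runs to the end of the line
    // and then backtracks, and this repeats from each starting position.
    private const string SlowPattern =
        @"(.*)((?<!(\A|<blockquote>|</blockquote>|</p>))(<blockquote>|<p>))";

    // Revised pattern: same lookbehind and tag alternation, no leading group.
    private const string FastPattern =
        @"((?<!(\A|<blockquote>|</blockquote>|</p>))(<blockquote>|<p>))";

    public static string CloseSlow(string text) =>
        Regex.Replace(text, SlowPattern, "$1</p>$2", Options);

    public static string CloseFast(string text) =>
        Regex.Replace(text, FastPattern, "</p>$1", Options);

    public static void Main()
    {
        // Assumed sample: one tag to fix, then a long run with no tags,
        // which is where the repeated backtracking adds up.
        string sample = "<p>one<p>two" + new string('x', 5000);

        var watch = Stopwatch.StartNew();
        string slow = CloseSlow(sample);
        watch.Stop();
        Console.WriteLine($"with (.*):    {watch.ElapsedMilliseconds} ms");

        watch.Restart();
        string fast = CloseFast(sample);
        watch.Stop();
        Console.WriteLine($"without (.*): {watch.ElapsedMilliseconds} ms");

        Console.WriteLine($"same output:  {slow == fast}");
    }
}
```

One caveat: the two patterns are not strictly equivalent. When a single line contains several unclosed tags, the greedy version matches up to the last one and only inserts a `</p>` there, while the shorter version inserts one before each tag. It is worth checking that against your own test cases.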
