Matching Nested Constructs

Correction: I've fixed my pattern so that it works :-) When I tested it with my new RegexSnippets testing tool (see below), I discovered that I hadn't escaped the symbols :-)

Although I'd read about the new balancing construct matching syntax in .NET regex, I'd never tried it out - mostly because I'd never found a good need. Thankfully, last week, a post appeared on the ASPAlliance regex list questioning an example from Jeffrey Friedl's book, Mastering Regular Expressions (2nd edition).

      System.String
        pattern = 
          @"\(
              (?>
                  [^()]+
                |
                  \( (?<DEPTH>)
                |
                  \) (?<-DEPTH>)
              )*
              (?(DEPTH)(?!))
            \)";

Basically, each time the expression finds a "(" it pushes an empty capture onto a group named "DEPTH". Then, when it finds a ")", the (?<-DEPTH>) construct pops the most recent capture off that group again. The test at the end is to determine that the "DEPTH" group is empty, which means every "(" has been closed!
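Here is a minimal sketch of how the pattern can be exercised. It assumes the group after the opening paren is meant to read (?<DEPTH>), and that the pattern is run with RegexOptions.IgnorePatternWhitespace, since it is laid out across several lines. The class name BalancedParens and the helper FirstBalanced are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalancedParens
{
    // The pattern from above, assuming the stripped group name is DEPTH.
    // The layout only works with RegexOptions.IgnorePatternWhitespace.
    private const string Pattern = @"
        \(
          (?>
              [^()]+              # anything that is not a paren
            |
              \( (?<DEPTH>)       # opening paren: push an empty capture onto DEPTH
            |
              \) (?<-DEPTH>)      # closing paren: pop one capture off DEPTH
          )*
          (?(DEPTH)(?!))          # fail if DEPTH still holds a capture
        \)";

    // Hypothetical helper: returns the first balanced span, or null if there is none.
    public static string FirstBalanced(string input)
    {
        Match m = Regex.Match(input, Pattern, RegexOptions.IgnorePatternWhitespace);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        // Prints: (b (c) d)
        Console.WriteLine(FirstBalanced("a (b (c) d) e"));
    }
}
```

Note that an opening paren that is never closed is simply skipped: the engine moves on and tries the next "(" as a starting point.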

I tried the example myself, and came to the conclusion that either A) it was an incorrect example, or B) the feature doesn't work as documented. Personally, I'm guessing that this part of the sample, (?(DEPTH)(?!)), is incorrect and that it should be something along the lines of (?'DEPTH'(?!)), but I still couldn't get that to work as I would have expected. If anyone reading this does work out how that line works, please let me know, as it would be a rather powerful tool to have. Later that night - and several hours into my testing - still no conclusive results!
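For what it's worth, (?(DEPTH)(?!)) can be read as a .NET conditional: (?(name)yes) tries the "yes" branch only if the group called name currently holds a capture, and (?!) is an empty negative lookahead that always fails. So the line amounts to "fail here if DEPTH is not empty". The sketch below compares the same pattern with and without that check; the class and method names are hypothetical, and the pattern is written on one line so that no regex options are needed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ConditionalCheck
{
    // Shared front part: opening paren, then text / push / pop, repeated.
    private const string Body = @"\((?>[^()]+|\((?<DEPTH>)|\)(?<-DEPTH>))*";

    // With the conditional: the match must end with DEPTH empty.
    public static string WithCheck(string input)
    {
        return Regex.Match(input, Body + @"(?(DEPTH)(?!))\)").Value;
    }

    // Without the conditional: nothing stops a match that leaves a paren open.
    public static string WithoutCheck(string input)
    {
        return Regex.Match(input, Body + @"\)").Value;
    }

    public static void Main()
    {
        string input = "((a)";                    // one paren too many on the left
        Console.WriteLine(WithCheck(input));      // (a)
        Console.WriteLine(WithoutCheck(input));   // ((a)
    }
}
```

Without the check, the engine backtracks over the pop and happily accepts the unbalanced "((a)". With the check, that attempt fails and the match starts one character later.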

Unperturbed, I descended further and further into the murky depths of MSDN in my quest to unlock the secrets of the new semantics. The best document that I came across was "Grouping Constructs": certainly not a fountain of information, but it was a start. Here's a sample that I managed to get working while I was doing my testing:

using System ;
using System.Text.RegularExpressions ;
namespace RegexSnippets.Tests
{
    public class Foo
    {
        public static void Main()
        {
            string source = "before (nope (yes (here) okay) after" ;
            string pattern = @"
                (
                      ((?'FOO'\()         #push each opening paren onto the 'FOO' stack
                        [^()]*            #then match everything that is not a paren ( OR )
                      )+                  #repeatedly
                      ((?'BAR-FOO'\))     #on a closing paren, pop 'FOO' and capture the text in between into 'BAR'
                        [^()]*            #and consume the trailing chars
                      )+                  #repeatedly
                )+" ;

            Match match = Regex.Match(
                source,
                pattern,
                RegexOptions.IgnorePatternWhitespace|RegexOptions.ExplicitCapture ) ;

            // display every capture stored in the BAR group
            if( match.Success ) {
                foreach( Capture capture in match.Groups["BAR"].Captures ) {
                    Console.WriteLine("found item: " + capture.Value) ;
                }
            }
            Console.ReadLine() ;
        }
    }
}
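
As a follow-up sketch, the same FOO/BAR pattern can be wrapped in a small helper that also reports how many opening parens were never closed. The idea is that each closing paren pops one capture off FOO, so whatever is left on FOO after the match belongs to an unclosed "(". The names NestedCaptures and InnerGroups are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NestedCaptures
{
    // Same pattern as the sample above, minus the comments.
    private const string Pattern = @"
        (
          ((?'FOO'\() [^()]*)+
          ((?'BAR-FOO'\)) [^()]*)+
        )+";

    // Hypothetical helper: returns the text found inside each closed pair of
    // parens, and reports how many opening parens are still left on FOO.
    public static List<string> InnerGroups(string input, out int unclosed)
    {
        Match m = Regex.Match(
            input,
            Pattern,
            RegexOptions.IgnorePatternWhitespace | RegexOptions.ExplicitCapture);

        var found = new List<string>();
        foreach (Capture c in m.Groups["BAR"].Captures)
        {
            found.Add(c.Value);
        }
        unclosed = m.Groups["FOO"].Captures.Count;
        return found;
    }

    public static void Main()
    {
        int unclosed;
        List<string> items = InnerGroups("before (nope (yes (here) okay) after", out unclosed);
        foreach (string item in items)
        {
            Console.WriteLine("found item: " + item);
        }
        Console.WriteLine("unclosed parens: " + unclosed);
    }
}
```

The innermost pair is reported first, because its closing paren is the first one the engine meets.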
