Visual C++ in Short: Regular Expressions

ATL includes a lightweight regular expression implementation. Although originally part of Visual C++, it is now included with the ATL Server download.

The CAtlRegExp class template implements the parser and matching engine. Its single template argument specifies the character traits, such as CAtlRECharTraitsW for Unicode and CAtlRECharTraitsA for ANSI. If you omit the argument, the default traits class is chosen based on whether or not _UNICODE is defined.
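The idea of a build-dependent default can be sketched without ATL. The snippet below is only an illustration of the technique, not ATL's actual source: NarrowTraits, WideTraits, DefaultTraits and Matcher are hypothetical stand-ins for the traits classes and for CAtlRegExp.

```cpp
#include <type_traits>

// Hypothetical stand-ins for CAtlRECharTraitsA and CAtlRECharTraitsW.
struct NarrowTraits { using Char = char; };
struct WideTraits   { using Char = wchar_t; };

// The build configuration picks the default traits once, in one place.
#ifdef _UNICODE
using DefaultTraits = WideTraits;
#else
using DefaultTraits = NarrowTraits;
#endif

// Hypothetical stand-in for CAtlRegExp: the traits parameter has a default,
// so Matcher<> follows the build settings while Matcher<WideTraits> is
// always wide, whatever the build defines.
template <typename Traits = DefaultTraits>
struct Matcher
{
    using Char = typename Traits::Char;
};

// Naming the traits explicitly pins the character type.
static_assert(std::is_same<Matcher<WideTraits>::Char, wchar_t>::value,
              "explicit wide traits give wchar_t");
```

Naming the traits explicitly, as in CAtlRegExp<CAtlRECharTraitsW>, is presumably the safer choice when the surrounding code hard-codes wide string literals.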

The CAtlREMatchContext class template provides an array of match groups for a successful match of a regular expression. It has the same default template argument as CAtlRegExp.

In the example below, CAtlRegExp’s Parse method converts the regular expression into an instruction stream, which the Match method then uses to match the input string efficiently. Each match group is described by a start and an end pointer into the input string, so no copy of the matched characters is made unless you need one.

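The regular expression grammar is defined at the top of the atlrx.h header file. It differs from the more common Perl-style syntax: for example, braces rather than parentheses delimit match groups, and \a is an abbreviation for any alphanumeric character.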

#include <atlrx.h>
#include <atlstr.h>

#define ASSERT ATLASSERT

int main()
{
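    // The literals below are wide strings, so this assumes a build in which
    // _UNICODE is defined and CAtlRegExp<> defaults to the Unicode traits.
    CAtlRegExp<> regex;
    // Each {...} is a match group: year, month, day and post name.
    const REParseError status = regex.Parse(L"^/blog/{\\d\\d\\d\\d}/{\\d\\d?}/{\\d\\d?}/{\\a+}$");
    ASSERT(REPARSE_ERROR_OK == status);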

    CAtlREMatchContext<> match;

    if (regex.Match(L"/blog/2008/7/16/SomePost", &match))
    {
        ASSERT(4 == match.m_uNumGroups);

        PCWSTR start = 0;
        PCWSTR end = 0;

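        // GetMatch points start/end at the group's characters inside the input.
        // _wtoi stops at the first non-digit, so numbers can be parsed in place.
        match.GetMatch(0, &start, &end);
        const UINT year = _wtoi(start);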

        match.GetMatch(1, &start, &end);
        const UINT month = _wtoi(start);

        match.GetMatch(2, &start, &end);
        const UINT day = _wtoi(start);

        match.GetMatch(3, &start, &end);
        const CString name(start, end - start);

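        // %u matches the unsigned values; GetString() passes a plain
        // character pointer rather than a CString object to the varargs call.
        wprintf(L"Year: %u\n", year);
        wprintf(L"Month: %u\n", month);
        wprintf(L"Day: %u\n", day);
        wprintf(L"Name: %s\n", name.GetString());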
    }
}
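The example relies on _wtoi stopping at the first non-digit character. If you would rather stay strictly inside the matched range, something along these lines might do. This is a minimal sketch in portable C++, not part of ATL: MatchRange, ToUInt and ToString are hypothetical names, and overflow is not checked.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type: a half-open character range [start, end),
// the same shape as the two pointers that GetMatch fills in.
struct MatchRange
{
    const wchar_t* start;
    const wchar_t* end;
};

// Parse an unsigned decimal number using only the characters in the range.
// Returns false if the range is empty or contains a non-digit.
// Overflow is not checked in this sketch.
bool ToUInt(MatchRange range, unsigned& value)
{
    assert(range.start <= range.end);
    if (range.start == range.end) return false;

    unsigned result = 0;
    for (const wchar_t* p = range.start; p != range.end; ++p)
    {
        if (*p < L'0' || *p > L'9') return false;
        result = result * 10 + static_cast<unsigned>(*p - L'0');
    }

    value = result;
    return true;
}

// Copy the range into an owning string, only when a copy is really needed.
std::wstring ToString(MatchRange range)
{
    assert(range.start <= range.end);
    return std::wstring(range.start, range.end);
}
```

With the start and end pointers from GetMatch stored in a MatchRange, ToUInt could replace the _wtoi calls and ToString the CString construction.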


8 Comments

  • Why not install Feature Pack for Visual C++ 2008 and be done with it?

    It does ship with a TR1 implementation which means basic_regex is available.

  • Tanveer Badar: The TR1 implementation is fantastic. I don’t however use exceptions in most of my native code projects and that is why I prefer ATL classes over STL classes. ATL doesn’t force you to adopt exceptions whereas the STL does.

  • Arguably, for new projects, using the C++ TR1 regex implementation, which will be part of C++0x, is more future-proof.

    Arno

  • Arno Schoedl: absolutely. As I said, provided you’re using exceptions as part of the error handling strategy for your project it makes perfectly good sense to use the Standard C++ Library and the TR1 additions. A lot of developers however still use C++ without exceptions and for them a lightweight ATL alternative comes in handy.

  • I have to use ATL regexps for one of our work projects and the regexp syntactic differences are often a major annoyance. It's not too hard to get used to the syntax, but when you also do regexps in .NET (which uses a more universal style) it gets a little weird at times. So unless you really have to, I'd say stay away from ATL regexps.

  • Nish: have you compared the .NET and TR1 grammars?

  • Kenny: Nope, not yet. Why? Don't tell me they are quite unlike each other too!!!

  • Nish: Like night and day. Night and day. :) Seriously I haven’t played much with the TR1 impl so I can’t really judge yet. The only regex I’ve used extensively is the .NET one. I just like the ATL impl when I’m in a tight corner and can’t or don’t want to have any baggage just to parse a string in some interesting way. I have heard from Stephan that the TR1 impl is really kick ass though.
