Is RegExLib full of "it"?

Today I read a comment from Randal L. Schwartz on Jeffrey Schoolcraft's regex blog that I felt I needed to respond to. As I started writing my reply, I realized that this is probably news that should be publicly visible, so I'm posting it to my blog and cross-referencing the original comment. First, here is Randal's comment:


Yup. I continue to downvote and negative-comment nearly every entry at "regex lib".

Do not validate email addresses with a regex (unless it's the full regex, as you point out).

Do not parse HTML with a regex. HTML is surprisingly complex.

Do not validate a date with a regex. All these regex I see that try to compute the number of days of february based on the year number just have me going "WTF!".

These are NOT regex tasks. These are dedicated tool tasks.

And yet, "regexlib" is full of them. And full of "it", if you know what I mean.


Randal...

I hear your pain. As the lead developer of RegExLib I also see the problems that you mention, and at present we haven't provided a good enough toolset for newbies to help themselves properly. Should newbies be randomly using regexes that they find on the site? Dunno. That's for another argument.

We implemented the rating and comment system in the middle of last year to give some indication of the value of individual patterns, so I'm extremely grateful that diligent members of the community such as yourself are helping out by casting your votes. We also implemented an RSS feed for the comments so that comments such as yours are given public visibility - http://www.regexlib.com/RssComments.aspx

It's a hard battle to win as RegExLib continues to grow; as of today it contains nearly 1000 expressions. There's good news though. Over the past couple of months a lot of effort has gone into solving these problems, and as a result users of the site will see a vastly improved set of tools to help deal with some of the problems that you've mentioned.

To give you a quick example, one of the new features will give users a shortcut to finding useful AND ACCURATE expressions by offering a box which says: "Enter N examples of what you want to match and N examples of what you don't, and we'll provide you with a list of patterns that match your requirements". This will help to remove the hit-and-miss element of a NOOB scanning through 1000 patterns to find the veritable needle in the haystack.
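To make the idea concrete, here is a minimal sketch of how example-based filtering could work. It is not the RegExLib implementation: the `PATTERNS` dictionary, its three entries, and the `find_patterns` function are all hypothetical names made up for illustration, and the sketch assumes a pattern has to match each example in full rather than just somewhere inside it.

```python
import re

# Hypothetical stand-in for the pattern library; these entries are illustrative only.
PATTERNS = {
    "us_zip": r"\d{5}(-\d{4})?",
    "digits": r"\d+",
    "hex_color": r"#[0-9a-fA-F]{6}",
}


def find_patterns(patterns, should_match, should_not_match):
    """Return the names of patterns that fully match every positive
    example and fully match none of the negative examples."""
    results = []
    for name, source in patterns.items():
        try:
            compiled = re.compile(source)
        except re.error:
            # Skip patterns that do not compile in this regex flavour.
            continue
        accepts_all = all(compiled.fullmatch(s) for s in should_match)
        rejects_all = not any(compiled.fullmatch(s) for s in should_not_match)
        if accepts_all and rejects_all:
            results.append(name)
    return results


# Two examples that should match and two that should not.
print(find_patterns(PATTERNS, ["12345", "12345-6789"], ["1234", "abcde"]))
# → ['us_zip']
```

The negative examples matter as much as the positive ones: without them, an overly loose pattern would pass just as easily as an accurate one.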

The tools that allow users to manage their expressions are also being improved, so hopefully pattern authors will be more responsive in adjusting their patterns based on the feedback they receive.

I hope that, once you see the new features for yourself, you will agree with me that RegExLib has become a much more valuable resource than it is today.

3 Comments

  • I have found regexlib.com to be a good resource. I understand that the regexes are user submitted and may not be the most accurate for some tasks, but once you realize that you can modify the regex as you need to. I am new to Regular Expressions and regexlib.com helps me understand what exactly is going on. I am sure I will someday be at the stage where I don't need this resource, but until then it really facilitates understanding of regexes. I also use "RegexDesigner.NET" which is a pretty useful tool as well.

  • The problem with it being a resource is that it'll also be viewed as an authority. There's far too much "cargo-culting" going on in the regex world (and the shared source world in general), where someone will write something kinda flakey, but then it spreads like a bad meme. We definitely have a few of those in the Perl community, and I work very hard to try to stamp those out.

    regexlib.com is a vast repository of cargo cult junk, with a few rare gems to justify it otherwise.

  • It's amazing in support of me to have a web site, which is beneficial for my know-how. thanks admin
