Chad Osgood's Blog

Chad Osgood's Old, Expired Blog

RegularExpressionValidator woes and the semantics of dot

There was a somewhat lengthy discussion on the Win Tech Off Topic list regarding how to use a regular expression to limit the length of a string.  I had recommended the following expression:

^.{0,n}$

Where 'n' is the upper bound; 500 in his case.  Most regular expression implementations allow you to alter the semantics of the "match anything except newline" metacharacter '.' in the above expression so that it also matches newlines.  In .NET, one simply passes RegexOptions.Singleline as the options.  Most other languages that actually have regex literals in the language itself (e.g. Perl, JavaScript, ...) allow you to specify such modifiers after the closing regex literal delimiter:

/^.{0,n}$/s

The 's' modifier in the above expression is equivalent to using RegexOptions.Singleline in .NET.  This modifier causes the expression to match up to 'n' characters even when the input spans multiple lines.
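To make the difference concrete, here is a minimal sketch in JavaScript.  It assumes an engine that supports the 's' (dotAll) flag, which only arrived with ES2018; older engines reject the flag outright.  The limit of 10 and the sample strings are illustrative, not taken from the discussion.

```javascript
// Sketch only: assumes an ES2018+ engine where the 's' (dotAll) flag exists.
// The limit (10) and the sample strings are illustrative.
const withoutS = /^.{0,10}$/;
const withS = /^.{0,10}$/s;

const multiLine = "abc\ndef"; // 7 characters, counting the newline

console.log(withoutS.test(multiLine));  // false: '.' stops at the newline
console.log(withS.test(multiLine));     // true: '.' now matches the newline too
console.log(withS.test("abcdefghijk")); // false: 11 characters exceeds the limit
```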

I didn't realize at the beginning of the discussion that he was using a RegularExpressionValidator control in ASP.NET, and that causes an issue with my recommended expression.  If said validator is used with the EnableClientScript attribute set to true, the validation code obviously has to be rendered as client-side JavaScript.  This means that one can't apply RegexOptions, and it also means you can't use the modifiers in the regex literal as noted above for JavaScript, because the expression is taken directly from the ValidationExpression attribute of the control, which is a string literal.  The relevant code in the generated WebUIValidation.js file looks like the following:

var rx = new RegExp(val.validationexpression);

JavaScript does have an overloaded RegExp constructor that accepts modifiers, but unfortunately it appears to support only the 'g', 'i', and 'm' modifiers (global, ignore case, and multiline, respectively), none of which changes what '.' matches.
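Here is a rough sketch of what that means in practice.  The function name matchesExpression and the sample strings are my own and purely hypothetical; only the new RegExp(expression) call mirrors the line from the generated script, and the real validator does more than this.

```javascript
// Hypothetical stand-in for the validator's check. Only the
// `new RegExp(expression)` call mirrors the generated script.
function matchesExpression(expression, value) {
  const rx = new RegExp(expression); // the pattern is a plain string; no flags are passed
  return rx.test(value);
}

const lengthLimit = "^.{0,500}$";

console.log(matchesExpression(lengthLimit, "one line"));       // true
console.log(matchesExpression(lengthLimit, "line 1\nline 2")); // false: '.' does not match '\n'
```

So a multi-line value is rejected no matter how short it is, which is not what a length check should do.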

Brad Wilson suggested creating a new validator, which is certainly an option; however, Chris Frazier pointed us to the LengthValidator controls.  It certainly solves the problem, but for those who are curious, you could use the following expression when you haven't the option to change the semantics of the '.' metacharacter:

^(.|\s){0,n}$

An inefficient expression indeed, but sometimes you haven't an option (pun intended).  A character class might be more efficient here (e.g. [.\s]) instead of using alternation, but the '.' metacharacter is no longer a metacharacter in the context of a character class; there it matches only a literal period.  Confusing enough?
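A quick sketch of the two forms side by side, built from strings the same way the validator would receive them.  The limit of 10 and the sample inputs are illustrative assumptions.

```javascript
// Sketch: alternation versus a character class containing '.'.
// The limit (10) and the inputs are illustrative; the post uses 500.
const limit = 10;
const alternation = new RegExp("^(.|\\s){0," + limit + "}$");
const dotInClass = new RegExp("^[.\\s]{0," + limit + "}$");

const sample = "abc\ndef"; // 7 characters, counting the newline

console.log(alternation.test(sample));  // true: '.' covers the letters, \s covers the newline
console.log(dotInClass.test(sample));   // false: 'a' is neither a literal '.' nor whitespace
console.log(dotInClass.test(". .\n.")); // true: only literal periods and whitespace
```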

This post was rather anticlimactic and longer than intended, especially considering this was in response to a mailing list post and not something I needed to do personally.  Hopefully someone can benefit from the above information.

Comments

Darren Neimke said:

Nice one Chad ;-) Thanks for the heads-up!
# May 7, 2003 12:01 AM

Randal Schwartz said:

You can also use

/^[\s\S]{0,n}$/

which says "if it's whitespace, or not whitespace". That's everything!
# May 7, 2003 9:03 AM

T Jackson said:

Thanks for the article.

And thanks for the alternative, Randal; however, I tried your expression and it failed to apply the character limit correctly under certain circumstances.

For example, if I use the expression you provided, /^[\s\S]{0,500}$/, on 503 characters that include 3 carriage return/line feed pairs, the expression validates this as acceptable. It seems not to count the carriage returns when applying the character limit.

Just thought I would share that with you.

# April 8, 2008 6:45 PM

T Jackson said:

As a followup to my post,

The issue I described above seems to relate to the difference between how a newline is handled client-side in a browser and how ASP.NET/Windows handles the newline character(s) on the server.

Client-side in a browser, the newline sequence seems to consist of only one character.

However, server-side the newline is counted as two characters (I assume CR and LF separately).

This difference has caused a few problems for me, and I am still in the process of finding a decent solution.

# April 8, 2008 8:15 PM

Using the RegularExpressionValidator for validating length of text | hamang.net said:

Pingback from  Using the RegularExpressionValidator for validating length of text | hamang.net

# September 17, 2008 10:10 AM

WombatEd said:

Thanks for the suggestion.  It was a lifesaver.

BTW, some of your links are out of date:

-  LengthValidator

-  (under "Blogs I Read") Scott Guthrie, Don Box, (I didn't check any others)

# April 19, 2009 11:50 AM

Using RegularExpressionValidator to validate the length of a String | Stephen Lacy said:

Pingback from  Using RegularExpressionValidator to validate the length of a String | Stephen Lacy

# February 15, 2010 5:48 AM

Chris said:

The final expression in the original post has a typo -- the left { should be closed with a }, not a )... here's the correct version: ^(.|\s){0,n}$

Not hard to figure this out, but thought it worth pointing out...

Thanks for this post -- it solved my problem validating text with newlines.

# April 8, 2011 11:21 AM