Philip Rieck

Phil in .net

Don't lose this regex

I really wish I knew where I found this... but after only an hour of digging in my old code, I found the regex to deal with CSV files (that is, handle both quoted and non-quoted values, commas in quoted values, etc).

I know I didn't write the regex pattern. I also know I don't want to lose it and have to try to recreate it.

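// The raw pattern:
// ,(?=([^"]*"[^"]*")*(?![^"]*"))
// It matches a comma only when an even number of double quotes follows it,
// that is, a comma that sits outside any quoted value.

Regex rex = new Regex(",(?=([^\"]*\"[^\"]*\")*(?![^\"]*\"))");
string[] values = rex.Split(csvLine);
foreach (string v in values)
{
   ...
}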
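
One thing to watch for with Split: in .NET, Regex.Split adds any text captured by a capturing group to the returned array, and the pattern above has one capturing group. Here is a minimal sketch of the same pattern with that group made non-capturing, so only the field values come back. The class name, the SplitLine helper and the sample line are made up for illustration, and this is just one way to do it, not the original author's version.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CsvSplitSketch
{
    // Same pattern as above, except the repeated group is written as (?: ... ),
    // a non-capturing group, so Regex.Split has no captured text to add.
    private static readonly Regex Separator =
        new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");

    // Hypothetical helper: splits one CSV line on commas outside quoted values.
    public static string[] SplitLine(string csvLine)
    {
        return Separator.Split(csvLine);
    }

    public static void Main()
    {
        // Made-up sample: a quoted value containing a comma, then an empty value.
        string line = "1,\"Smith, John\",,\"x\"";
        foreach (string v in SplitLine(line))
        {
            Console.WriteLine("[" + v + "]");
        }
        // Prints:
        // [1]
        // ["Smith, John"]
        // []
        // ["x"]
    }
}
```

The values keep their surrounding quotes, and an empty value comes back as an empty string, so any trimming or unquoting still has to happen inside the loop.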

If you have an attribution for me, please let me know. I'd like to give credit to the regex author.

[update] - Yes, it was from here: http://radio.weblogs.com/0117167/2003/02/18.html#a132 (that's from Early and Adopter). Thanks, Darren Neimke!

[cross-posted from Philip Rocks, Philip Rieck's new-ish real blog - http://philiprieck.com/blog/]

Comments

anonymous said:

it was me.
# January 16, 2004 2:40 PM

Darren Neimke said:

# January 16, 2004 5:34 PM

Plater said:

This regex fails for the single line:

558851,"VPI - VIRGINIA PHYSICIANS, INC",4900 Cox Rd,GLEN ALLEN,VA,23060-6295,9010852

"VPI - VIRGINIA PHYSICIANS, INC" shows up in the results twice.

# April 28, 2008 3:47 PM

Christian said:

It doesn't work.

The closest to CSV matching is the following, but I haven't figured out how to deal with empty values yet, such as:

ID,NAME

,"TEST"

will only return "TEST"

"[^"\\]*(?:\\.[^"\\]*)*"|[^,]+

# June 4, 2011 8:11 AM
