This blog has moved http://www.sharplogic.com/blogs/ed

This blog has moved here<!--meta http-equiv="refresh" content="0;url=http://www.sharplogic.com/blogs/ed"-->

The One-Stop Shop For All Your CSV Parsing Needs

I was looking through the referrers page to see where my hits come from, and I noticed two very interesting things (or at least they're interesting to me). First, 90%+ of my hits come from "(none)", which I assume means that they're from aggregators that consume RSS feeds without leaving a http_referrer calling card. The second, and perhaps more interesting stat, is that I am regularly getting hits from people who search google for tips on parsing CSV formats. I figured they must get annoyed when they click through hoping for a juicy code tidbit or some other fun fact, so I decided to try to help out a bit.

Here is a C# function for parsing a CSV file into a DataSet. It should work after a quick copy/paste, but if not, let me know and I'll update it.

/// <summary>

/// Parses a file in the comma separated value format (CSV) and returns the data as a DataSet.

/// </summary>

/// <param name="filePath">The path to the CSV file</param>

/// <returns>A DataSet with the parsed data from "filePath"</returns>

DataSet ParseCSVFileIntoDataSet(string filePath)

{

      // Make a DataSet and add a blank table to it so we can

      // store all the data we're going to get

      DataSet ds = new DataSet("MyDataSet");

      DataTable table = ds.Tables.Add("MyTable");

 

      // Create an instance of StreamReader to read from a file.

      // The using statement also closes the StreamReader.

      // We'll also let it throw a FileNotFoundException if necessary.

      using (StreamReader sr = new StreamReader(filePath))

      {

 

            // If your CSV contains column info as the first row use this first section.

            // If not, use the section below where columns are hardcoded.

#if YOUR_CSV_CONTAINS_COLUMN_NAMES

            string columnLine = sr.ReadLine();

 

            // string.Split will parse each string between the token into separate

            // strings in an array.

            //

            // For example, "FirstColumnName,SecondColumnName,ThirdColumnName"

            // if split on commas will become an array of three strings:

            // "FirstColumnName", "SecondColumnName", "ThirdColumnName".

            //

            // Just be careful that the names can't have commas in them already.

            string[] columns = columnLine.Split(',');

#else

            string[] columns = { "FirstColumnName", "SecondColumnName", "ThirdColumnName" };

#endif

 

            // Make sure we have some columns.

            if (columns.Length == 0)

            {

                  throw new ApplicationException("There are no columns available to parse");

            }

 

            // Iterate through each column name and add to the table.

            foreach (string column in columns)

            {

                  table.Columns.Add(column);

            }

 

            // Iterate through the remainder of the file and parse each row.

            string nextLine;

            while ((nextLine = sr.ReadLine()) != null)

            {

                  // Split the cells in this row.

                  string[] cells = nextLine.Split(',');

 

                  // Make sure we have the right number of cells to match the columns.

                  if (cells.Length != columns.Length)

                  {

                        throw new ApplicationException(

                              string.Format("Error in \"{0}\": Data row {1} has {2} cells when it should have {3}",

                              filePath, table.Rows.Count + 1, cells.Length, columns.Length));

                  }

 

                  // Add the new row with the data.

                  DataRow row = table.Rows.Add(cells);

            }

      }

 

      // Return the filled DataSet

      return ds;

}

Posted: Jul 30 2003, 08:05 PM by EdKaim | with 2 comment(s)
Filed under:

Comments

Mark Brown said:

Ed, I was probally one of those Googlers. Here is what I ended up with: <a href="http://www.dotnetjunkies.com/weblog/markbrown/posts/570.aspx">http://www.dotnetjunkies.com/weblog/markbrown/posts/570.aspx</a>
# July 31, 2003 12:14 AM

Slavomir Furman said:

Also Dino Esposito in his book "Applied XML Programming for Microsoft .NET" has implemented CSV Reader, IIRC.
# August 1, 2003 10:04 AM
Leave a Comment

(required) 

(required) 

(optional)

(required)