ASP.NET Hosting

Manipulating CSV files

It doesn't seem to be well known that you can easily access data in CSV files (Comma-Separated Values) using ADO.NET components. For example, you can use the OleDbConnection, OleDbCommand, and OleDbDataAdapter objects to fill a DataSet from a CSV file. You can also update the data. Also notice that it is possible to use Visual Studio's Server Explorer to visualize or edit CSV files.

All you need is to use the Jet engine thanks to a connection string like the following one:

"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\TxtFilesFolder\;Extended Properties='text;HDR=Yes;FMT=Delimited'"

Note: "HDR=Yes" indicates that the first row contains columnnames, not data (source: connectionstrings.com).

Once you have a connection to a directory, you can refer to files as tables as follows:

"SELECT * FROM data.txt"

Hint: Think about JOINs if you need them...

So, why not use ADO.NET's power to access CSV files instead of coding this all by yourself?

Update: this post seems very popular, so I guess it's useful that I add two links about the Schema.ini file for more information:

Update: interesting comment by David:

Recap: Reading columns with mixed data types (that is, with data that could appear to a stupid OleDB driver to be of two types), you must use registry modifications AND extended properties on the connection string.

For Excel, the connection string is:

string.Format(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};" + @" Extended Properties=""{1}""", pathName, "Excel 8.0;HDR=YES;IMEX=1");

In the registry, set:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel\ImportMixedTypes = Text

and

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel\TypeGuessRows = 0

For CSV, the connection string is:

string.Format(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};" + @" Extended Properties=""{1}""", path, "Text;HDR=YES;FMT=Delimited;IMEX=1");

In the registry, set:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Text\ImportMixedTypes = "Text"

and

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Text\MaxScanRows = "0"

85 Comments

  • We still get data via CSV and this is what I've been doing. God bless connectionstrings.com

  • Hadn't seen your post Don!

    You beat me on this one ;-)

  • A couple of drawbacks on reading CSV files...

    1. The CSV file has to be on a disk that has Read/Write capabilities (i.e. not on a CD)

    2. Trying to read >100,000 records kinda locks up the process (I have waited 10 minutes and it still was not finished). Under 100,000 records is pretty quick.



    Eric

  • Hi Fabrice,



    Were you kidding, or you found a way to use joins between CSV files/tables? I just can't make joins work :-(



    This short sample will be really useful for quick demos ;-) Thx!

  • Joins work for me.

    Sample SQL:



    SELECT A.FieldA1, B.BieldB1

    FROM ([B#txt] B INNER JOIN [A#txt] A ON B.FieldB2 = A.FieldA2)



    Christophe, did you try without using the JOIN syntax, simply with '='? Maybe it would work better, dunno.

  • Stoooopid me! I had a "syntax error" (c)(tm) in my second CSV file ;-) Both JOIN and "=" syntax do work. Thx!

  • Voila, avec un article c'est tout de suite plus clair :-)

    To be noted that since that post, I've used this technique for inserts as well and it works fine.

  • I'm getting an OleDbException when trying to use this, the message source stating "Could not find installable ISAM."



    has anyone got this working on W2k3 ?

  • Can one connect to CSV files on remote folders?

    Please let me know

  • I am having trouble getting LIKE to work (e.g., "... WHERE field LIKE 'ABC*'") when trying to access data in a CSV. Have you run across this?

  • how its possible in Linux?

  • I've got reading down can somebody help me with updating the csv file.

  • When I tried to update, I got "Updating data in a linked table is not supported by this ISAM." When I did some searches, I found Microsoft saying, in a similar situation, that "ODBC drivers do not allow for the SQL statements UPDATE and DELETE."



    Has anyone seen it work? It seems that CSVs can be read like a database, but not updated....

  • Oran, you can insert for sure, but maybe you cannot update or delete. I don't remember.

    A quick test would be to try updating a CSV table with Visual Studio's Server Explorer. I think it worked last time I tried.

  • This works fine as long as my file name is simple...but once it has character like '-' it fails. How do I get around this problem?

  • I've got around the problem....I used MSDASQL.1 instead of Microsoft.Jet.OLEDB.4

    and ADODB connection class instead of OLEDBConnection.

  • Thanks so very much ... had to go through a lot of websites to get a CSV connection string, and yours was the first one that worked ;-)

  • Hi,



    I'm getting unspecified error in the 2nd line when opening the oledbconnection



    conn = new OleDbConnection(strConnectString);

    conn.Open();



    Connection string is like

    "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\TxtFilesFolder\;Extended Properties='text;"

    i've installed the latest mdac 2.8, updated the service pack to 8 for the Jet driver. Changed the permission in the folder. still i'm getting the same error.



    Can some tell me how to fix the issue...





  • Can somebody help me with this problem.

    I have an CSV file with multiple columns. The strange part is that the dataset is filled with only the first column ?



    I tried it with a excel file and all data was read from it into the dataset with the right columns ?



    Thanks,






  • I'm have a problem where the driver is converting a small decimal to scientific notation. The actual value in the column is .000001 and its showing up as -1e-6



    Is there anyway to force the driver to import as text? I've tried it both with csv and excel files.

  • Eric, did you try to play with the Schema.ini file to type the column as text?

  • I found the problem. I was converting everything to string. Once the datatable was filled, the driver made the column of type Decimal. If i pulled the value out as decimal, i am fine...........

  • Hi, just wondering if this works with the ADO, in stead of the ADO.NET. Has anyone tried that with the ADO object? Thanks.

  • I found that using this driver filenames like test.1083952525531.txt don't work. Anyone else find this?



    I end up getting the following exception when i run my "select * from test.1083952525531.txt" command



    .Data.OleDb.OleDbException: The Microsoft Jet database engine could not find the object 'test.1083952525531.txt'. Make sure the object exists and that you spell its name and the path name correctly.



    when i change the file and select statement to testT1083952525531.txt.. it works..



    strange

  • I have a problem converting the CSV file into the DataSet.

    The problem is that if the first few records of a column in the CSV file are numeric, the datatype of the column in the DataSet automatically gets set as Int.

    So lets say after 10 records of numeric, if the eleventh record is alpha numeric, that does not get filled in the data set at all.

  • "if the first few records of a column in the CSV file are numeric"



    All your records should have the same columns and column types. The type is not supposed to change depending on the record. This is a requirement for every database, not just CSV files.

  • I used Oledb provider to convert the CSV file to DataSet. I did not define the columns & their types in the DataSet as there are hundreds of columns. As you correctly pointed out, the data type of all the records in the CSV file is text.

    But as they get filled in the dataSet, the columntype gets defined as Int just because the first few records in the CSV file contain no alphanumeric characters(though they are text).

    For example, the first 10 records' data look something like this. 1000,2000,3000,and so on and the eleventh record will have something like 3000-AV.

    Hope I am clear in explaining the problem.

  • I get it. I think you have to use the Schema.ini file to specify that you want to access the columns as plain text.

  • Hi Is there a way to define the schema when importing Excel files instead of CSV files?

  • If you publish a web page to the intranet you cannot specify the drive letter. ie (c:/TextFolder). Does anyone know how to get round this?

  • Let me clarify. The text data to be imported is on the web. How do you specify the path ?

  • I have had trouble Deleting and Updating records in a CSV file. I can Insert OK. When I delete I get the exception "Deleting data in a linked table is not supported by this ISAM." For updates I get "Updating data in a linked table is not supported by this ISAM."

  • "select * from file[1].csv" this query is not working. anyone tell me the solution for this.

  • Does anyone have a good solution to DJ's problem mentioned earlier in this post? I have run into the same problem, but I don't like the answer he was given. He specifically mentioned that he had hundreds of columns to work with which makes the schema.ini solution quite a bit of work. Any other ideas?

  • has anyone seen setting HDR=NO yet the first row of the CSV is still treated as column headers?



  • This was extremely helpful and saved me lots and lots of headache.
    10x :-)

    P.S. everything works properly like in the first example….

  • Great way to read a CSV file! However, I ran into a small problem. The text engine always ignores the leading zeros in the CSV file. For example, 001234 will be read as 1234. How do I force it to leave the leading zeros?

    Thanks for your help!

  • I found a rather simple workaround solution that may help someone. Using the Registry Editor to set:

    ImportMixedTypes=Text

    for this key:

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Text

    The default setting for ImportMixedTypes is Majority Type.

  • if the csv file has an erroneous quote or some other bit of data that the driver can't resolve I dont get any data returned for the rest of the line, fair enough but does anyone know if I can find out when this occurs (i.e. does the driver set some property or error status to indicate that it couldnt resolve the line, I cant find anything but would have thought it would be exposed... :-(

  • Maybe libraries like FileHelpers can help?
    See http://filehelpers.sourceforge.net

  • Thanks Fabrice, I will have a look! ;-)

  • Has anyone had any issues with text case sensitivity? My data files contain related records, however are different case or mixed case. (E.g. PETER, peter, Peter).

    Unfortunately, my SQL query is returning different results when I specify WHERE Name='PETER' vs. WHERE Name='peter'. Is there anyway to configure the connection to ingore case and return all records regardless of case?

  • Great post. I have been able to pull a csv wholesale into a datatable. However I have not been able to select specific columns by name (I use HDR=YES). For instance I have headings 'phonenumber', 'cost', 'date' and some other columns, but only want to select those three. How do I do that by name, independent of the other they are in the file? Thanks.

  • M Kumar wrote:
    "This works fine as long as my file name is simple...but once it has character like '-' it fails. How do I get around this problem?"

    The same way you get around any special characters in table names. Enclose the table name (file name in this case) in []'s like this:
    "SELECT * FROM [Some-file.txt]"

  • Fantastic post. As others have said, this has saved me hours of time and several hundred bucks potentially purchasing a third party component to parse csv files!

    Thanks again!

  • I am trying to fill a daaset from a .csv file. The connection string I am using:"Provider=Microsoft.Jet.OLEDB.4.0; Data Source=C:\\myFolder\\Processed\\; Extended Properties=text;";
    Select command I am using :objAdapter.SelectCommand = new OleDbCommand("SELECT * FROM " + FileName + ".csv", objConn); and after this objAdapter.Fill(objDataset, inputFileName);.
    After filling DataSet contains same no of columns as in .csv file but headers contain absurd values and also all the data row columns are blank.
    I tried adding HRD=Yes in connection string but that also didn't solve my problem.

    Can somebody please help me out as soon as possible. Looking forward for help

  • My problem solved. This was because i was creating .csv in excel directly. When i changed the format and created the file using notepad and saved with extension .csv it works fine :)....

  • Great post! But I only get 1 column per row, consisting of one string (example "A,,,1,62,,080") instead of each column getting its own value. Do you have any idea to what may cause this?

  • Hello,

    I would want to know if today, there a best way to load data from a CSV ?

    Otherwise, I need to load a part of a CSV file in a DataTable, and I don't know how to do.

    In my CVS file, I have in first one line for information,then one empty line, then n lines (n is never the same), then an empty line, then the table to load in the DataTable. But in the table, I don't want the first 4 columns.

    After I have an empty line and another table I need to load.

    So, How to do to do a "SELECT * FROM" to get only the first table or only the second table, without others lines which are not table ?

    Thank you for your help.

  • When you've got a funny filename, enclose the filename in squre brackets:
    "select * from [My.Funny_File.Name.csv]"

  • hi every one
    I've got this error when trying to open oledb connection in a asp.net webpage.: "Could not find installable ISAM."
    It's my code:

    OleDbConnection cn = new OleDbConnection();
    string csvLocation = Server.MapPath("ipcountry.csv");
    string cnSr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + csvLocation + ";Extended Properties='text';HDR='Yes';FMT='Delimited';";
    cn.ConnectionString = cnSr;

    string olestr;
    olestr = "select * from ipcountry.txt";
    OleDbCommand cmd = new OleDbCommand(olestr, cn);

    DataSet ds = new DataSet();
    OleDbDataAdapter Objadr = new OleDbDataAdapter(cmd);

    cn.Open();

    Objadr.Fill(ds, "T");
    cn.Close();

  • I am using this kind of connection with a SQLBulkCopy and running into problems. The select query seems to convert SSN's like "003-15-1984" to a DATETIME which is causing me huge headaches. I can't seem to use CONVERT(VARCHAR, SSN) in the select statement. Anyone have any ideas?

    Thanks in advance

  • To Maryam,

    I think I ran into your problem as well... The believe the way you have the extended properties written is wrong. Try changing from...
    Extended Properties='text';HDR='Yes';FMT='Delimited'

    To...
    Extended Properties='text;HDR=Yes;FMT=Delimited'

    You have the extended properties broken up by apostrophes. Hope this works for you.

    Jarrod

  • To Jarrod
    Thanks alot. I will try it.but my programming OS is Windows 2003 Server. is it possible that oledb drivers on this OS lead to this problem?

  • To Jarrod
    I changed the Connection string as follow:

    string cnStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + csvLocation + ";Extended Properties='text;HDR=Yes;FMT=Delimited';OLEDB:Max Buffer Size=256;";

    But Unfortunately still I get the Same "Could not find installable ISAM." Error

  • I've fixed the broken links

  • Is there a limit to the file name length?

    I have some wierd file names that I need to work with and it seems that either the file name length or the naming convention is causing an exception of: "Syntax error in FROM clause."

  • I'm getting the csv to import just fine; however, my schema.ini isn't picking up the date format.

    The date appears as 2007-08-04 11:23:12.230.
    In the schema.ini file, I've included "DateTimeFormat=yyyy-mm-dd hh:nn:ss.fff" and Col3=RecDate date width 23

    I've tried a dozen different ways to format the hours/minutes/seconds. Regardless, it's returning a blank field. Any help will be appreciated because I don't want to use a stream reader and parse all those fields. Thanks!

  • Here is the updated link from DonXML: http://weblogs.asp.net/donxml/pages/24908.aspx

  • Can you please give the detailed solution to solve the DJ problem previuosly

  • hi,

    1- I got a task to read a sales.csv file and transfer all data to sql server 2005 table caled 'sales'

    2- then read all records to datagrid with DELETE & UPDATE facility.

    Does anybody knows the correct cod as Im new in .net.

    I'll appricated an early reply.

    Thanks in advance to all guys.

    Jan

  • Hi folks,
    I have a problem during to read a file .csv:
    My file is like :
    field1;field2;field3
    dog;ccff45;456
    cat;45556;3445
    bird;3333;dedede

    The result is complete but i don't see the field2 for the first row, like:
    field1;field2;field3
    dog;;456
    cat;45556;3445
    bird;3333;dedede

    Somebody can Help Me?

  • Recap: Reading columns with mixed data types (that is, with data that could appear to a stupid OleDB driver to be of two types), you must use registry modifications AND extended properties on the connection string.

    For Excel, the connection string is:

    string.Format(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};" + @" Extended Properties=""{1}""", pathName, "Excel 8.0;HDR=YES;IMEX=1");

    In the registry, set:
    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel\ImportMixedTypes = Text
    and
    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel\TypeGuessRows = 0

    For CSV, the connection string is:
    string.Format(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};" + @" Extended Properties=""{1}""", path, "Text;HDR=YES;FMT=Delimited;IMEX=1");

    In the registry, set:
    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Text\ImportMixedTypes = "Text"
    and
    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Text\MaxScanRows = "0"

    David

  • Thanks David. This should help a lot of people.

  • Thats great, but what is the default path for the datasource. My csv file is in source control, so I need a relative path. DeploymentItem Attribute deploys the file from SC to cwd [Out], but Jet/ADO cannot find it.

  • Hi guys, could any of you please help me?

    For something like:
    Dim strSQL As String = "SELECT * FROM [" & getFileName & "]"

    I have a connection string like:
    ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0; Data Source=" & getParentFolderName & "; Extended Properties=""Text;HDR=Yes;FMT=CSVDelimited"""

    But when i want to use a join Select like:
    SELECT * FROM ([B#txt] B JOIN [A#txt] A

    how do you build the connection string?

  • If you don't know what the Windows registry is, you should first learn about it before making any changes to it. If you do something wrong with it, you can damage your Windows or software settings.

  • thks I will try something diferent

  • I am running into a limit on field lengths. Using ADODB in a COM object environment, fields that are greater than 512 bytes cannot be read.

    Changing the DefinedSize property of the Field object does not appear to solve the problem.

    Is there an inherent limitation on the size of fields that you can read from CSV files?

    Thanks in advance for any help.

  • I am using this to imports CSV file data into datatable in one of my website. I am facing strange problem. With the same csv file sometime the code works fine, whereas sometime it throws error : unspecified error. I have tried all the solutions (giving rights to folder etc..) but still no luck! I am using OleDBconnection ADO.NET Here is the exception detail. The errors come at dataadapter.fill line

    Unspecified error
    Source : Microsoft JET Database Engine

    Exception Details: System.Data.OleDb.OleDbException

    Inner Exception: No Innner exception could be found.

  • @Kiran - maybe what you need is to google

  • How can i user select to get top 3 or 4 rows from CSV file

  • I've encountered the same issue as with Moran Ben-David. Even when the file name is enclosed in brackets, I still get the OLEDB exception:

    The Microsoft Jet database engine could not find the object 'test.1083952525531.txt'. Make sure the object exists and that you spell its name and the path name correctly.

    I've read elsewhere that this is known issue with the text driver, but haven't seen any KB articles on it.

    Anyone have any leads?

  • Hi,
    As i gothrough the thread, i found that i need to either create schema.ini file or change the registry key to set MAXSCANROWS AS 0. I'm writing the client application, where few users doesn't have ADMIN rights. But to change the registry OR to create schema.ini file i need to ADMIN access. How to resolve this prob? I've asked in various forums and i didnt get any response.

    Will anyone pls help me on this.

    Thanks

  • Hi all ,
    any one able to run this code on IIS6 64bit machine.
    I am having MOSS webapplication.Which runs on 64bit configured IIS.Oledb4.0 will supprot only 32bit Webapp.Can any one help me out for this?

  • Hi,
    Anyone got solution for Moran Ben-David as OJ's problem?I've also encountered same issue,even the file name is enclosed in brackets,I get the excepton:

    "The Microsoft Jet database engine could not find the object 'test.1083952525531.txt'. Make sure the object exists and that you spell its name and the path name correctly."

  • Keep extended properties in " to avoid can not find installable ISAM
    strConnectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + Server.MapPath("") + "\\" + ";Extended Properties='text;HDR=Yes'";

  • Hi,

    I am facing the problem while reading the CSV file for ' OR ".

    Sometimes the data contains “ or ' such as '1122', “222222” The program can't be import if they are showing.

    I think this is due to nature of excel

    Can you please help me out asap.

    Thank You.

    /Ashlesh

  • Hellow Everyone,

    I am getting trouble while accessing fields of the CSV file by names. My CSV file has field names which contails space like "Customer Name","Serial No" etc. How can i filter the records based on the perticular Customer name or serial no.

    Kindly help me. It is very urgent.

    ---
    Thanks ,
    Subhankar

  • Subhankar, did you try with square brackets? Maybe it works. e.g. [Customer Name]

  • Well.. I have a similar problem. I just want the database in a human-readable format like CSV (so it can be committed to a VCS and eventually even merged). The approach I found is the following:

    1. Create SQLite database in temporary file
    2. Import data from CSV to SQLite (using sqlite3)
    3. Connect to the SQLite DB
    3. Do all your stuff
    4. Close connection
    5. Export to CSV (again using sqlite3)
    6. Delete temporary database

    so you need to read/write lots of data on application startup and shutdown.

  • I'm trying the below::
    >>>
    string ConnectionString = "Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" + dataPath + ";";
    string CmdText = "select * from [" + fileName_ + "]";

    OdbcConnection Con = new OdbcConnection(ConnectionString);

    OdbcDataAdapter adptr = new OdbcDataAdapter(CmdText, Con);

    DataTable dt = new DataTable();

    adptr.Fill(dt);
    <<<

    Problem is the mixed type issue.
    The registry solution doesn' work here.
    Do you guys think I should rather use JET driver rather than Ms Text Driver??

  • Another thing I noticed is that on 64-bit windows, the registry key was

    HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Text

    Notice the Wow6432Node

  • Thanks for the article; it provided me immense help while getting the ISAM error in connecting to the CSV File. Actually I was stuck up with the Extended Property.

  • Regarding case sensitivity. If you want to do case INsensitive search use

    LCASE(ColumnName) = 'something_in_low_case'

    or

    LCASE(ColumnName) = LCASE(OtherColumnNameOrExpression)

Comments have been disabled for this content.