Doug Reilly's Weblog

Embedded Reporting of the Information Age...

DataSet or DataReader?

My thoughts here...

 

Comments

Charles Chen said:

"As with many developer decisions, there is no clear-cut choice between using a DataSet or a DataReader."

Hmmm....in my own thinking, it seems clear that DataSets+DataAdapters are a better choice for smaller projects where:
* The development time can be considerably shortened by avoiding creating business object classes and using the nice relaitonal/table-like features of the DataSet
* The underlying tables are very simple, which again ties in with the RAD aspect of DataSets since you can rely on auto-generated Update-, Delete-, and InsertCommands.

On the other hand, when working on larger object oriented systems, a DataReader will generally always perform better than a DataSet. Using try-catch-finally/using (C# only) (like a any good developer), you can always ensure that if there is an error, you close the underlying connection. You also get more flexibility in how your object relationships are represented, but at the cost of having to build those relationships manually.

So it seems pretty clear-cut to me. DataSets are great for RAD and if your application deals with the underlying data in a table-like manner (typed-DataSets abstracts this somewhat into a more object oriented manner). If you want the best performance and most flexibility, DataReaders+custom business objects are the way to go.
# June 10, 2005 8:37 AM

Atul Thakor said:

A number of things should also be noted here:

With the SQLDataReader you cannot return 'return values' until after the data reader has been closed.

Personally I try to avoid DataSets at all costs as I consider them the work of the devil! Use a DataTable instead as it is more lightweight, and pretty flexible without the XML stuff, or the schema.

Additionally if you really must use a DataSet you can get a little speed increase with the following command:
ds.Enforceconstraints= false
More on this here:
http://dotnetjunkies.com/WebLog/Principal/archive/2005/04/13/65546.aspx

With DatSets and Winforms I'd have to say it depends how much data you plan to manipulate. At the end of the day if you working with say over 250,000 rows it’s a real bad idea.
The dataset will use memory and then the datagrid/listview will also use memory when displaying. Your better off using a SQLDataReader to fill a listview. The SQLDataReader will only store in memory the row being processed, while the dataset stores all the data for the duration of the process.

Also when you run the ExecuteReader ensure that you use
Cmd.ExecuteReader(CommandBehavior.CloseConnection)
This way when you close the SQLDataReader you will also close the data connection automatically.

And finally another tip when using the SQLDataReader try and use the _dr.GetString(0) , _dr.GetInt32(0) methods as you’ll find once again a nice burst of speed.

Regards,
AT
# June 10, 2005 8:47 AM

Charles Chen said:

"One of the features of the DataSet that makes it flexible is that it can be easily serialized into XML"

"DataSets can also easily be passed between layers in an application, even if those layers are not in the same process space. For instance, an application server can get the data from a database and then send off the DataSet across process space, or even across a wire to a separate server doing presentation services."

One thing that I've found that many developers don't know (and I don't know if this is the case with your or not, so forgive me if you are aware) is that the schemas used to generate typed-DataSets can also be used to auto-generate entity-/skeleton-classes using XSD.exe or a more feature rich utility like XSDObjectGen.

What's nice is that these generated classes are automatically marked up with XML serialization attributes. Yes, calling GetXml() is convenient, but using XmlSerializer is not that much more difficult if you really need to serialize your objects to XML.

One argument for *un*-typed DataSets is that the receiver doesn't need to reference the source assembly but at the cost of a highly error prone architecture since then you're relying on the underlying runtime table/column names to remain the same.

"One possible disadvantage of using a DataReader is that the connection must be open while you are accessing the data."

Again, typically, one would iterate through the DataReader and copy the data into business objects/entities no? At the core, this is the same operation that a DataSet would have to use to populate each row, no?

I reiterate that my feeling is that on large object oriented projects, unless you are really using all of the powerful features afforded by DataSets (which most people *don't*), then the DataSet represents a huge overhead in terms of memory, performance, and usabilty (with respect to untyped-DataSets).

# June 10, 2005 8:52 AM

Douglas Reilly said:

Charles,

I think what I meant as to there being no "clear cut" choice between DataSets and DataReaders is that the decision is more complex than, say, "Use DataSets in Windows Forms Applications" or "Use DataReaders for Web Forms applications." Rather, the need to cache, the need to transport results across process boundaries, etc. can move you from one decision to another.

I concur that the overhead of a DataSet is huge if you are not going to use all the features of the DataSet [and frankly, using a DataSet for updates, etc., makes me twitch a bit ;-)].
# June 10, 2005 9:55 AM

TrackBack said:

# June 10, 2005 3:10 PM

TrackBack said:

# June 13, 2005 4:36 AM

WheresTheBeef said:

Why the example code in C#? You going to the dark side??
# June 13, 2005 12:18 PM

Douglas Reilly said:

C# is what I normally use, actually. This code, in particular, seems pretty readily transalatable, as it is mostly calls methods of the ADO.NET objects...
# June 13, 2005 12:22 PM

Binoj Antony said:

You said

"Personally I try to avoid DataSets at all costs as I consider them the work of the devil! Use a DataTable instead as it is more lightweight, and pretty flexible without the XML stuff, or the schema."

Well, a DataTable contains the schema information my friend, only thing u miss when using DataTable instead of DataSet is the relations et all

I think with the hardware getting cheaper every day, it makes sense to use dataset, the problem with datareader is that the connection has to be persisted till u close it explicitly (commandbehaviour.closeconnection rarely works)
"Late connect and early release" is the force that drives data connections..
# June 17, 2005 1:11 PM

Douglas Reilly said:

To clarify, *I* did not recommend a DataTable as an alternative to the DataSet, that was another comment by a reader of the blog. I think in ADO.NET 2.0, a DataTable is more attractive because because it can be serialized and in some sense it might be easier for newbees to understand (there is not a Tables collection to deal with, just the single data table).
# June 17, 2005 5:28 PM

Scott Bellware said:

Of course, you could just dump the whole nasty mess of data access training wheel objects and simply use custom entity objects and a comprehensive data access framework like NHibernate.

That we're still having this particular discussion at this time in our history is like listening to a couple of modern-day physicians discussing the merits of blood letting and scull hole boring as potential treatments for impotence.
# June 23, 2005 1:05 PM

Douglas Reilly said:

For anyone who does not know NHibernate (I did not) there is a page on it here:

http://wiki.nhibernate.org/display/NH/Home

I am not yet convinced that tools like NHibernate will actually take over the world. Using my own custom classes, I have built a reasonable framework for data access (which under it all uses DataReaders for the most part).
# June 23, 2005 1:15 PM

twh said:

This article "SQL Server Connection Pooling Myths" (http://www.sql-server-performance.com/sk_connection_pooling_myths.asp) says:

"[...] opening n forward-only, server-side, recordsets will, in fact, spawn n connections. By design, SQL Server can only open one forward-only cursor per connection at one time."

Does this mean DataReaders defeat connection pooling?
# June 23, 2005 2:49 PM

Douglas Reilly said:

Not at all. It only means that each DataReader requires a seperate connection. My opening using and closing DataReaders and their underlying connections ASAP, you can take advantage of connection pooling.

ADO.NET 2.0 allows multiple active result sets (MARS) and so will allow multiple datareaders on a single connection.
# June 23, 2005 3:32 PM

Jason Melvin said:

I have one important note to all who plan on implementing the DataReader with the CommandBehavior.CloseConnection parameter. Unfortunately I don't have the URL to the article, but in my search to stabilize a system with huge connection pool and connection count problems I came across an article that discussed connection pooling and how it functions in .net and when it came to the discussion of that particular parameter there is a misconception that it will always close the connection for you. Unfortunately in ASP.Net that parameter only works when the SqlDataReader is consumed by a "complex" web object such as the DataGrid, and *WILL NOT* be closed automatically if you simply iterrate through the records and then let the object fall out of scope.

Being a developer who has many years of experience in classic ADO and in web development I have the great habit of always being overly zealous of closing and *destroying* my DB objects, but this seems to be something that is quite a common misconception and I have seen the mistake in code in the site that I am currently working on. Just a thought...
# July 6, 2005 2:54 PM

Douglas Reilly said:

I always explicitly close the DataReader in a Finally block, no matter where the DataReader is used.
# July 6, 2005 2:56 PM
Leave a Comment

(required) 

(required) 

(optional)

(required)