Ken Robertson's Blog

Ramblings of a .NET developer

SQL Server autogenerated PKs or big composite keys?

Have been working a lot on the database design for my Senior Project, since our requirements specification are due very soon (and its being very stressful lately).

Noticed a terrible flaw in my database design though.  In going through and estimating the growth of the tables and calculating the size, I found that one table would have an annual growth of 32,000 records per year.  This is somewhat unavoidable.  We estimate 800 users (20 campuses, 20 courses per campus, 2 users per course) and an average of 20 questions on the survey (1 surveys are dynamic, so can't have all the answers in one record, since surveys slightly vary from campus to campus and can always be changed).

The big problem was that in our original design, the size per record was 101 bytes.  That comes to about 3mb of space a year, in a table that is always being searched and constantly updated (more surveys filled out and statistical reports run on current and previous data).

Found we could easily reduce the record size to only 16 bytes by using an autogenerated index in the main table it linked to.  This other table had a big composite primary key made up of the userid (int), campus name (varchar 50), course code (varchar 10), and semester title (varchar 25).  Changed it to an auto increment int id, and back down to a more decent growth size.

Now the question, in most situations, do you use composite keys or auto-generated keys?  I had asked this to one of my instructors and he said he only uses auto-generated keys in rows with a lot of inserts/updates and where unique data might be harder to come by (you don't want to use an email address as a userid if you allows users to change their email address, or if data could somehow reoccur).  After doing the growth estimates on our tables, it looks like a couple of them could use auto-generated ids.

Posted: Dec 03 2003, 02:09 PM by qgyen | with 14 comment(s)
Filed under:

Comments

Ben said:

I've been making databases for a while now and using autogenerated IDs makes it a lot easier to do updates as well as paging though records when you need to.

The HR type DB that is my current project uses a lot of related tables, and using the IDs generated for me takes some of the work off the code and lets the DB take care of it.
# December 3, 2003 5:36 PM

Weston Binford said:

I have always used system-generated keys. It will improve performance especially with foreign key relationships. Your foreign key indexes will be very small relative to the size of one using a composite key.

Just my $0.02.

-Weston
# December 3, 2003 5:46 PM

Mike Gunderloy said:

I end up using autognerated keys, but mostly because it's convenient when I'm writing code. I can't believe that a 3MB table is going to represent any serious performance problem unless you're running SQL Server on horridly underpowered hardware. Changing because of key size strikes me as pretty premature optimization.
# December 3, 2003 5:53 PM

Greg Duncan said:

I used to use auto ID's, but have since moved to using GUID (Unique Identifiers) as my primary key's.

Yes, they are MUCH bigger than INT's, but too many times I've been called on combine/consolidate databases, to maintain many databases based on the same schema, etc etc.

GUID's are nice in that you can either use NewID() to generate a new guid as the default, or create a static GUID (like used in a lookup/ref table which is preloaded with data.)

But to put in my 2 bucks, I like synthetic keys over composite keys. Be them ID's or GUID's synthetic keys are well worth the "extra" bytes required to store them.

Also 32K rows a year is really pretty small for SQL. 32M a year is something else, but I've built highly scalable solutions with tables in the millions of rows...

SQL Server just rocks.... :)




# December 3, 2003 7:38 PM

Scott Mitchell said:

What I don't like is when people have a many-to-many join table, and the two tables have PKs, and they add those as FKs to the join table, but then don't make the two FKs a composite PK, but rather introduce a synthetic, auto-generated PK for the join table.

As far as an auto-generated PK for a table that is not joining together multiple tables, I don't see the harm in that, and do it often myself.
# December 3, 2003 8:21 PM

chadb said:

I've been using GUID's for keys for a few years now on both Oracle and SQL Server. It is by far the best approach I've used. You can have the database auto-generate when you don't need to know the keys beforehand - but the best part is being able to generate a new object graph without having to make round-trips to the database. Just Guid.NewGuid() and create your related entities and persist away!!
# December 3, 2003 9:12 PM

Paul Gielens said:

Using guids as pk.
# December 4, 2003 5:13 AM

Wallym said:

The two things that you are talking about are application generated PKs and database generated PKs. In EVERY database design I do, I use database generated PKs. This could be ints or GUIDs. I will NEVER use an application generated PK. Too many problems. The goal is to solve problems, not be as "cool" as you think you are.

Wally
# December 4, 2003 7:32 AM

Travis Laborde said:

I absolutely agree with what I'm seeing others write here. You have enough work to do already. Let the database create your PKs.

The argument for composite keys usually center around "knowing your data." But in my experience, "knowing your data" really means "knowing your application." And the one business rule that never changes is "business rules always change."
# December 4, 2003 8:02 AM

Darrell said:

Non-composite keys also make things easier on the development end, and I have never encountered a situation where I really wished I *had* used composite keys when I had not. Usually development time is at a premium, so I go with developers' ease-of-use.
# December 4, 2003 8:12 AM

Ken Robertson said:

Thanks for all the responses! Had switched over to mostly auto-generated PKs. Havn't really lookied into GUIDs, think I might.
# December 4, 2003 2:43 PM

David Roberts said:

Coming in late to this...

1. Idenitity columns can make two way replication a nightmare.
2. "Smart Keys", ala domain keys as primary keys, derived keys from business logic, etc.. violate normalization rules.
3. Keys generated by functions ala, newid() don't support persistence mechanisms.

With all that said I have a few choices:
1. Automate generating sequentially or identity keys in a persistence layer.
2. Generate High-Low keys using on the application layer but data driven.
3. Generate Guids in persistence or application layer (easiest).

I'm sure i'm forgetting a few other options...
# March 12, 2004 5:12 PM

Server autog said:

enerated
# April 23, 2004 6:02 AM

Eloise said:

Hello. If your daily life seems poor, do not blame it; blame yourself, tell yourself that you are not poet enough to call forth its riches. Help me! Help to find sites on the: Eyelash extensions college station tx. I found only this - <a href="www.cis-cmc.eu/.../EyelashExtensions">eyelash extensions indianapolis rouge lash</a>. Lumigan is a hair olive edge and should dramatically be confused for touch as it know the anesthesia inside your reviews, eyelash extensions. Eyelash extensions, coverage of understanding i cost to position always chartered extensions on my eyelash eyelashes. Best regards :mad:, Eloise from Honduras.

# March 24, 2010 8:35 PM
Leave a Comment

(required) 

(required) 

(optional)

(required)