I realize this question is very likely to have been asked before, but I've searched around a little among questions on StackOverflow, and I didn't really find an answer to mine, so here goes. If you find a duplicate, please link to it.
For some reason I prefer to use Guids (uniqueidentifier in MsSql) for my primary key fields, but I really don't know why this would be better. In many of the tutorials I've walked myself through lately, an automatically incremented int has been used. I can see pros and cons with both:
- A Guid is always of the same size and length, and there is no reason to worry about running out of them, whereas there is a limit to how many records you could have before you'd run out of numbers that fit in an int.
- int is (at least in C#) a nullable type, which opens up a couple of shortcuts when querying for data.
- An int is easier to read.
- I bet you could come up with at least a couple more things here.
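To put rough numbers on the size and range points above, here is a small sketch. Python is used purely for illustration, and it assumes a signed 32-bit int column and a standard 128-bit GUID/UUID:

```python
import uuid

# Assumption: the int key is a signed 32-bit integer (4 bytes).
INT32_MAX = 2**31 - 1
INT32_SIZE_BYTES = 4

# A GUID/UUID is always 128 bits (16 bytes), whatever its value.
guid = uuid.uuid4()
GUID_SIZE_BYTES = len(guid.bytes)

print(f"max 32-bit int key: {INT32_MAX:,}")
print(f"GUID size: {GUID_SIZE_BYTES} bytes, int size: {INT32_SIZE_BYTES} bytes")
print(f"GUID text form: {guid} ({len(str(guid))} characters)")
```

So a 32-bit int tops out at a little over 2.1 billion rows, while a GUID costs four times the storage per key and is much longer to read or type.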
So, as simple as the title says it: What is the recommended data type for ID (primary key) columns in a database?
EDIT: After receiving a couple of short answers, I must also add this follow-up question. Without it, your answer is neither compelling nor educational... ;) Why do you think so, and what are the cons of the other option that make you not choose that instead?
Popular databases have allowed larger autoincrement fields for years now, so running out of values is much less of an issue.
As for what to use, it's always a choice. One is not clearly better than the other; they have different characteristics, and each is good in different scenarios. I have used both over time, and for the next schema I work with I'll consider both.
Pros for GUID:
Pros for autoincrement:
I asked a similar question which has a few answers that might help. Replication seems to be the biggest advantage of using GUIDs.
Reasons not to use an auto-incrementing number for a primary key
Follow Cletus's advice, with the additional caveat that it largely depends on what you're storing. Never, ever, use a GUID. GUIDs have a whole bundle of downsides, and only one or two upsides.
If you use a long (a signed 64-bit integer), you could create 1000 keys a second and not run out of primary keys for roughly 292 million years.
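A quick check of that arithmetic, assuming a signed 64-bit key and a constant 1000 inserts per second (both are assumptions for illustration), comes out to roughly 292 million years:

```python
# Assumptions: signed 64-bit key, constant insert rate, average year of 365.25 days.
MAX_KEYS = 2**63 - 1
INSERTS_PER_SECOND = 1000
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

years_until_exhausted = MAX_KEYS / INSERTS_PER_SECOND / SECONDS_PER_YEAR
print(f"{years_until_exhausted / 1e6:.0f} million years")
```

Even at ten times that insert rate, the key space would still last tens of millions of years.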
Others have already mentioned some of the advantages of using an integer type instead of a UUID/GUID. One of the big advantages is the speed and compactness of the indexes.
In an application I was recently involved in, where I did the database design, I needed UUIDs but didn't want to give up the advantages of using longs for primary keys, so I had an "allIds" table that mapped every primary key in the system to a UUID. All my primary keys were generated from a single sequence, so they were all unique across all tables.
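A minimal sketch of that mapping-table idea, using Python's built-in sqlite3 so it can be run anywhere. The table and column names (all_ids, users, orders) and the new_id helper are hypothetical, and the autoincrementing column stands in for the single sequence; the original schema may well have looked different:

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per entity in the whole system; its id acts as the shared sequence.
    CREATE TABLE all_ids (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,
        uuid TEXT NOT NULL UNIQUE
    );
    CREATE TABLE users  (id INTEGER PRIMARY KEY REFERENCES all_ids(id), name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY REFERENCES all_ids(id), item TEXT);
""")

def new_id(conn):
    """Reserve the next system-wide integer key and pair it with a fresh UUID."""
    cur = conn.execute("INSERT INTO all_ids (uuid) VALUES (?)", (str(uuid.uuid4()),))
    return cur.lastrowid

user_id = new_id(conn)
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (user_id, "alice"))
order_id = new_id(conn)
conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, "book"))

# Translate an internal integer key to its external UUID.
(user_uuid,) = conn.execute(
    "SELECT uuid FROM all_ids WHERE id = ?", (user_id,)
).fetchone()
print(user_id, order_id, user_uuid)
```

The tables keep compact integer keys for joins and indexes, and the UUID is only looked up when a globally unique identifier is actually needed.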
If the database is distributed, where you could get records from other databases, the primary key needs to be unique within a table across all the databases. GUID solves this issue, albeit at the cost of space. A combination of autoincrement and namespace would be a good tradeoff.
It would be nice if databases could provide built-in support for autoincrements with "prefixes". So in one database, I get IDs like X1, X2, X3 ... and so on, whereas in the other database it could be Y1, Y2, Y3 ... and so on.
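At the application level, the prefix idea could be sketched roughly like this. PrefixedIdGenerator is a hypothetical helper, not a feature of any particular database, and a real implementation would need to persist the counter:

```python
import itertools

class PrefixedIdGenerator:
    """Hypothetical per-database generator: a fixed namespace prefix plus a local counter."""

    def __init__(self, prefix):
        self.prefix = prefix
        self._counter = itertools.count(1)  # in-memory only; a real one would be persisted

    def next_id(self):
        return f"{self.prefix}{next(self._counter)}"

db_x = PrefixedIdGenerator("X")
db_y = PrefixedIdGenerator("Y")

ids_x = [db_x.next_id() for _ in range(3)]
ids_y = [db_y.next_id() for _ in range(3)]
print(ids_x, ids_y)  # ['X1', 'X2', 'X3'] ['Y1', 'Y2', 'Y3']
```

As long as every database gets a distinct prefix, IDs generated independently never collide when the records are merged.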
A big disadvantage of using GUID keys is that it is difficult to perform "ad-hoc" queries by hand. Sometimes it is very useful that you can do this:
SELECT * FROM User WHERE UserID = 452245
With GUID keys this can become very annoying.
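To make the difference concrete, here is a small sketch using Python's built-in sqlite3. The tables users_int and users_guid are hypothetical, and the queries are built as plain strings only to mimic what you would type by hand:

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users_int (user_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE users_guid (user_id TEXT PRIMARY KEY, name TEXT)")

conn.execute("INSERT INTO users_int VALUES (452245, 'alice')")
guid_key = str(uuid.uuid4())
conn.execute("INSERT INTO users_guid VALUES (?, 'alice')", (guid_key,))

# What you would type by hand in each case (string-built for demonstration only;
# application code should use parameters instead).
int_query = "SELECT name FROM users_int WHERE user_id = 452245"
guid_query = f"SELECT name FROM users_guid WHERE user_id = '{guid_key}'"

int_rows = conn.execute(int_query).fetchall()
guid_rows = conn.execute(guid_query).fetchall()

print(int_query)
print(guid_query)
```

Both queries return the same row, but the GUID version requires copying a 36-character value exactly, which is easy to get wrong when typing.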
I would recommend 64-bit integers.