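I am creating a new database for a web site using SQL Server 2005 (possibly SQL Server 2008 in the near future). As an application developer, I've seen many databases use an integer (or bigint, etc.) for the ID field of a table that will be used for relationships. But lately I've also seen databases that use the unique identifier (GUID) for an ID field.
My question is whether one has an advantage over the other. Will integer fields be faster for querying, joining, and so on?
UPDATE: To be clear, this is for a primary key in the tables.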
GUIDs are problematic as clustered keys because of their high randomness. This issue was addressed by Paul Randal in the last TechNet Magazine Q&A column, in answer to this question: "I'd like to use a GUID as the clustered index key, but the others are arguing that it can lead to performance issues with indexes. Is this true and, if so, can you explain why?"
Now bear in mind that the discussion is specifically about clustered indexes. You say you want to use the column as 'ID', but it is unclear whether you mean it as the clustered key or just the primary key. Typically the two overlap, so I'll assume you want to use it as the clustered index. The reasons why that is a poor choice are explained in the article I mentioned above.
For nonclustered indexes GUIDs still have some issues, but not nearly as big as when they are the leftmost clustered key of the table. Again, the randomness of GUIDs introduces page splits and fragmentation, but only at the nonclustered index level (a much smaller problem).
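To make the randomness point concrete, here is a rough sketch in Python. It is a toy model, not SQL Server's storage engine: it only counts how many inserts land somewhere before the current end of the key order, which is the situation that can force a page split. The function name and the sample size are mine, and plain byte ordering is assumed (SQL Server actually compares uniqueidentifier values in its own byte order, which does not change the picture for random values).

```python
import bisect
import random
import uuid


def count_mid_inserts(keys):
    """Count inserts that land before the current end of the sorted key order.

    Toy model: an insert at the end simply appends to the last page, while an
    insert anywhere else has to squeeze into an existing page.
    """
    ordered = []
    mid = 0
    for key in keys:
        pos = bisect.bisect_left(ordered, key)
        if pos < len(ordered):
            mid += 1
        ordered.insert(pos, key)
    return mid


n = 2000  # hypothetical number of rows inserted

# Ever-increasing keys, like an IDENTITY column.
sequential_keys = list(range(n))

# Random 16-byte keys, standing in for NEWID()-style GUIDs (seeded for repeatability).
rng = random.Random(42)
random_keys = [uuid.UUID(int=rng.getrandbits(128)).bytes for _ in range(n)]

sequential_mid = count_mid_inserts(sequential_keys)
random_mid = count_mid_inserts(random_keys)

print("sequential keys, inserts not at the end:", sequential_mid)
print("random keys, inserts not at the end:", random_mid)
```

With increasing keys every insert goes to the end, while with random keys nearly every insert lands in the middle. That is the behavior sequential GUIDs are meant to avoid.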
There are many urban legends surrounding GUID usage that condemn them based on their size (16 bytes) compared to an int (4 bytes) and promise horrible performance doom if they are used. This is slightly exaggerated. A 16-byte key can still be a very performant key on a properly designed data model. While it is true that being 4 times as big as an int results in lower-density non-leaf pages in indexes, this is not a real concern for the vast majority of tables. The b-tree structure is a naturally well-balanced tree and the depth of tree traversal is seldom an issue, so seeking a value based on a GUID key as opposed to an INT key is similar in performance. A leaf-page traversal (i.e. a table scan) does not look at the non-leaf pages, and the impact of GUID size on leaf-page density is typically quite small, as the record itself is significantly larger than the extra 12 bytes introduced by the GUID. So I'd take the hearsay advice based on '16 bytes vs. 4' with a rather large grain of salt. Analyze case by case and decide if the size impact makes a real difference: how many other columns are in the table (i.e. how much impact the GUID size has on the leaf pages) and how many references are using it (i.e. how many other tables will grow because they need to store a larger foreign key).
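A back-of-the-envelope calculation shows why the size argument is weaker than it sounds. The numbers in this sketch are assumptions of mine, not measurements: roughly 8096 usable bytes per 8 KB page, a made-up 11 bytes of per-entry overhead in non-leaf pages, and a hypothetical 200-byte record. Treat it as an illustration of the reasoning, not as SQL Server's exact page layout.

```python
PAGE_BYTES = 8096     # assumed usable bytes in an 8 KB page
ENTRY_OVERHEAD = 11   # assumed per-entry overhead in non-leaf pages (illustrative)


def non_leaf_levels(leaf_pages, key_bytes):
    """Estimate how many non-leaf levels sit above the given number of leaf pages."""
    fanout = PAGE_BYTES // (key_bytes + ENTRY_OVERHEAD)
    levels = 0
    pages = leaf_pages
    while pages > 1:
        pages = -(-pages // fanout)  # ceiling division: pages needed one level up
        levels += 1
    return levels


def rows_per_leaf_page(record_bytes):
    """Estimate how many records of the given size fit in one leaf page."""
    return PAGE_BYTES // record_bytes


leaf_pages = 1_000_000  # hypothetical table size in leaf pages

int_levels = non_leaf_levels(leaf_pages, key_bytes=4)
guid_levels = non_leaf_levels(leaf_pages, key_bytes=16)
print("non-leaf levels with a 4-byte key:", int_levels)
print("non-leaf levels with a 16-byte key:", guid_levels)

# Hypothetical 200-byte record with an INT key; the GUID adds 12 bytes to it.
print("rows per leaf page, INT key:", rows_per_leaf_page(200))
print("rows per leaf page, GUID key:", rows_per_leaf_page(200 + 12))
```

Under these assumptions the seek depth is the same for both key sizes, and the leaf pages hold only slightly fewer rows with the GUID. A narrow table, or one referenced by many foreign keys, would shift the numbers, which is why the per-case analysis matters.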
I'm calling out all these details in a sort of makeshift defense of GUIDs because they have been getting a lot of bad press lately and some of it is undeserved. They have their merits and are indispensable in any distributed system (the moment you're talking data movement, be it via replication or sync framework or whatever). I've seen bad decisions made on the basis of the GUIDs' bad reputation, when they were shunned without proper consideration. But it is true that if you have to use a GUID as the clustered key, you should make sure you address the randomness issue: use sequential GUIDs when possible.
And finally, to answer your question: if you don't have a specific reason to use GUIDs, use INTs.
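A GUID is going to take up more space and be slower than an int, even if you use the NEWSEQUENTIALID() function. If you are going to do replication or use the sync framework, you pretty much have to use a GUID.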
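INTs are 4 bytes, BIGINTs are 8 bytes, and GUIDs are 16 bytes. The more space required to represent the data, the more resources are required to process it (disk space, memory, and so on). So (a) they are slower, but (b) this probably only matters if volume is an issue (millions of rows, or thousands of transactions in a very, very short time).
The advantage of GUIDs is that they are (pretty much) globally unique. Generate a GUID using the proper algorithm (and SQL Server XXXX will use the proper algorithm), and no two GUIDs will ever be alike, no matter how many computers you have generating them, no matter how frequently. (This does not apply after 72 years of usage; I forget the details.)
If you need to generate unique identifiers across multiple servers, GUIDs may be useful. If you need maximum performance and fewer than about 2 billion values, ints are probably fine. Lastly, and perhaps most importantly, if your data has natural keys, stick with them and forget the surrogate values.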
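As a quick sanity check on the "2 billion" figure and the 16-byte size, here is a small Python sketch. The helper name and the row counts are hypothetical, and it assumes a signed 32-bit INT counting upward from 1, so only the positive half of the range is used.

```python
import uuid

INT_MAX = 2**31 - 1     # largest value of a signed 32-bit INT
BIGINT_MAX = 2**63 - 1  # largest value of a signed 64-bit BIGINT


def fits_in_int(expected_rows):
    """Return True if an ID counting up from 1 stays within INT range."""
    return expected_rows <= INT_MAX


print("INT max:", INT_MAX)
print("2 billion rows fit in INT:", fits_in_int(2_000_000_000))
print("3 billion rows fit in INT:", fits_in_int(3_000_000_000))

# Two GUIDs generated independently, with no coordination between generators.
first = uuid.uuid4()
second = uuid.uuid4()
print("GUID size in bytes:", len(first.bytes))
print("GUIDs differ:", first != second)
```

The uuid4 calls stand in for GUIDs generated on separate servers: no shared counter is needed, which is the property that makes them attractive for merging and replication.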
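If you positively, absolutely have to have a unique ID, then use a GUID. This means that if you are ever going to merge, sync, or replicate, you should use a GUID.
For less robust things, an int should suffice, depending on how large the table will grow.
As in most cases, the proper answer is: it depends.
Use them for replication and the like, not as primary keys.
Kimberly L. Tripp article
Totally agree with JBrooks. I want to add that when your table is large and you use selects with JOINs, especially with derived tables, using GUIDs can significantly decrease performance.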