What alternatives exist to using guid as clustered

2019-05-29 11:31发布

问题:

I am currently working on a project that entails using a SQL Azure database to store data for a forthcoming application. One of the goals of the project is to be able to take advantage of Federations (sharding) in SQL Azure. Another stated goal of the project is to be able to run this application on local hardware should a client opt for this scenario.

One of the "obstacles" I am confronted with is the lack of support for IDENTITY in Federations.

http://blogs.msdn.com/b/cbiyikoglu/archive/2011/06/20/id-generation-in-federations-identity-sequences-and-guids-uniqueidentifier.aspx

While I understand why IDENTITY is not supported in Azure, I seem to have a mental roadblock of accepting that using a GUID as a clustered index is a good idea.

Clustered and nonclustered indexes performance

I've executed the sample tests in the first link above and have confirmed that there is little difference in performance on Azure when inserting records into a table with a guid as the clustered index vs. inserting the same amount of records into a table with an int identity field serving as the clustered index.

However, as I also need to support an on-premise installation, I think it is a safe statement to say that performance locally will suffer when using guids as the clustered index instead of using int identity.

In addition to performance related concerns, I also am concerned about the use of a 16 byte wide guid as the clustered index vs. using a 4 byte wide integer as the clustered index. Sure, disk space is relatively cheap, but this still adds up fairly quickly (and perhaps unnecessarily so).

I realize that I will ultimately have to make a trade-off based on needing to support both of these stated project goals, but I'm looking to make the most informed decision I can make.

Aside from using a middle tier id generator, what alternatives are there to using a Guid as the clustered index (and \ or primary key) on Azure while still working with Federations?

Alternatively, are my concerns regarding using a Guid as the clustered Index off-base (which I admit they very well could be)? If so, why?

回答1:

You are right to bother about the 16 byte Guid vs 4 byte Int (I also do so). However take the bright side - Federations give you the ability to massively scale the DB layer.

Now to the question - as per Online Documentation, the Federation Distribution type can only be one of:

INT, BIGINT, UNIQUEIDENTIFIER or VARBINARY(n), where n can be a maximum of 900

So I see the UNIQUEIDENTITIFIER as really the only option for your app.

Having any kind of id generator at application level would introduce a single point of failure / single point of "crash" or a bottleneck, which obviously SQL Azure Federations is trying to cover. So I would not use any ID generation logic, rather then Guid.NewGuid() at application side, or NEWID() at DB side.

As for performance impacts of UNIQUEIDENTIFIER for on-premises solution, I can't talk, and this is a separate question.



回答2:

Apparently using GUID with SQL Azure will not incur the same kind of performance issues an on premise SQL Server installation will.

many people have experiences that suggest GUIDs (uniqueidentifiers) are bad candidates for clustering keys given they will not be ordered and cause page splits, causing higher latencies and fragmentation? No so on SQL Azure

http://blogs.msdn.com/b/cbiyikoglu/archive/2012/05/17/id-generation-in-federations-identity-sequences-and-guids-uniqueidentifier.aspx

So I would probably go with Guid's