Question:

I'm probably going to implement partitioning on a huge table (billions of rows).

Each table row holds a status reading for a particular device, and one row is inserted per device every minute. That gives 1440 (24 x 60) rows per day per device. Each device has a unique ID (DeviceID).

I thought about partitioning on DeviceID MOD {TheNumberOfPartitionsThatIWant}, and I think 250 partitions is a good compromise. With this strategy the devices are distributed evenly across the partitions, and a query for a particular device only needs to touch one partition rather than all 250.

The problem is that I need to add an extra column to my table, just to indicate which partition the row belongs to, so that I can create the table on a partition scheme using this column. It would be much better to supply (DeviceID MOD 250) to the partition scheme directly instead of storing a column for such a simple expression. Is there a workaround for that?

Answer 1:

You can partition on a persisted computed column based on a function (if it calls a user-defined function, that function must be schema-bound). Whilst that does work, the benefits are going to be limited, and I would want to see a scaled test of it. It also requires every query against the table to apply the same function in its WHERE clause; otherwise the engine cannot restrict the query to a single partition.
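As a rough sketch of what that could look like in T-SQL: every object name here (pfDeviceBucket, psDeviceBucket, dbo.DeviceStatus, Bucket, ReadingAt, Status) is hypothetical, DeviceID is assumed to be a non-negative int, and the example uses 4 buckets rather than 250 only to keep the boundary list short.

```sql
-- Hypothetical names; 4 buckets instead of 250 to keep the boundary list short.
-- RANGE LEFT with boundaries 0, 1, 2 yields 4 partitions holding
-- bucket values 0, 1, 2 and 3 respectively.
CREATE PARTITION FUNCTION pfDeviceBucket (int)
AS RANGE LEFT FOR VALUES (0, 1, 2);

-- For simplicity every partition is mapped to the PRIMARY filegroup.
CREATE PARTITION SCHEME psDeviceBucket
AS PARTITION pfDeviceBucket ALL TO ([PRIMARY]);

CREATE TABLE dbo.DeviceStatus
(
    DeviceID  int          NOT NULL,
    ReadingAt datetime2(0) NOT NULL,
    Status    int          NOT NULL,
    -- A computed column used for partitioning must be PERSISTED.
    Bucket AS (DeviceID % 4) PERSISTED NOT NULL,
    -- A unique index on a partitioned table has to include the partitioning column.
    CONSTRAINT PK_DeviceStatus
        PRIMARY KEY CLUSTERED (DeviceID, ReadingAt, Bucket)
) ON psDeviceBucket (Bucket);

-- The query repeats the bucket expression so that the optimizer
-- can restrict the search to one partition.
SELECT ReadingAt, Status
FROM dbo.DeviceStatus
WHERE DeviceID = 42
  AND Bucket = 42 % 4;
```

The extra column is still there, but it is derived automatically, so inserts never have to supply it. The cost is the one noted above: a query that filters on DeviceID alone gives the optimizer nothing to eliminate partitions with.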

The key point is made by dportas: partitioning is designed to make data aging a trivial operation. At some point the data in the system will need to be purged because it is too old or of no value. If the table is not partitioned by age, you are then reduced to long-running deletions to remove that data.
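To illustrate the data-aging point, here is a minimal sketch of a table partitioned by month, where the oldest partition is switched out and discarded instead of being deleted row by row. All names and boundary dates are made up for the example, and it assumes the staging table can live on the same filegroup as the partition being removed.

```sql
-- Hypothetical names and dates. RANGE RIGHT: partition 1 holds everything
-- before 2024-01-01, partition 2 holds January 2024, and so on.
CREATE PARTITION FUNCTION pfReadingMonth (datetime2(0))
AS RANGE RIGHT FOR VALUES ('2024-01-01', '2024-02-01', '2024-03-01');

CREATE PARTITION SCHEME psReadingMonth
AS PARTITION pfReadingMonth ALL TO ([PRIMARY]);

CREATE TABLE dbo.DeviceStatusByMonth
(
    DeviceID  int          NOT NULL,
    ReadingAt datetime2(0) NOT NULL,
    Status    int          NOT NULL,
    CONSTRAINT PK_DeviceStatusByMonth
        PRIMARY KEY CLUSTERED (DeviceID, ReadingAt)
) ON psReadingMonth (ReadingAt);

-- Empty staging table: same columns, same clustered key, same filegroup.
CREATE TABLE dbo.DeviceStatusByMonth_Old
(
    DeviceID  int          NOT NULL,
    ReadingAt datetime2(0) NOT NULL,
    Status    int          NOT NULL,
    CONSTRAINT PK_DeviceStatusByMonth_Old
        PRIMARY KEY CLUSTERED (DeviceID, ReadingAt)
) ON [PRIMARY];

-- Switching is a metadata-only operation: the rows of partition 1 now
-- belong to the staging table and can be dropped cheaply.
ALTER TABLE dbo.DeviceStatusByMonth
    SWITCH PARTITION 1 TO dbo.DeviceStatusByMonth_Old;

TRUNCATE TABLE dbo.DeviceStatusByMonth_Old;
```

With the DeviceID MOD 250 layout, old and new rows sit side by side in every partition, so no partition can be switched out this way.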

In terms of spreading the data across multiple disks, partitioning does not add anything significant beyond what the existing filegroup / file facilities already provide.

Answer 2:

Suitable indexing will probably get you better results than partitioning. As @Andrew says, partitioning is primarily for fast data loading and unloading (that is, switching partitions in and out).
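For instance, a non-partitioned table clustered on the device and the reading time already lets a per-device query seek straight to the rows it needs. This is only a sketch; the table and column names are hypothetical and the right key depends on the actual query patterns.

```sql
-- Hypothetical names. The clustered key keeps each device's rows
-- together, ordered by time.
CREATE TABLE dbo.DeviceStatusPlain
(
    DeviceID  int          NOT NULL,
    ReadingAt datetime2(0) NOT NULL,
    Status    int          NOT NULL,
    CONSTRAINT PK_DeviceStatusPlain
        PRIMARY KEY CLUSTERED (DeviceID, ReadingAt)
);

-- One day of readings for one device: a single range seek on the
-- clustered index, touching at most 1440 rows.
SELECT ReadingAt, Status
FROM dbo.DeviceStatusPlain
WHERE DeviceID = 42
  AND ReadingAt >= '2024-03-01'
  AND ReadingAt <  '2024-03-02'
ORDER BY ReadingAt;
```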