Sql server which is better “Rank” or “Left join wi

I have the following table in database with around 10 millions records (it will increase in future may be double in 1 year):

create table PropertyOwners (
    [Key] int not null primary key,
    PropertyKey int not null, 
    BoughtDate DateTime, 
    OwnerKey int not null
)
go

Above table contains all the property owned by an owner at certain time, I want to get the owners which owns more than certain amount of properties at a current time, lets say more then 1000 properties at a time. I have written two different queries one using "Rank" and other using "Left join with own table".

Using Rank (Taking around 4 sec):

select OwnerKey, COUNT(1) PropertyCount 
from (
    select PropertyKey, OwnerKey, BoughtDate,
        RANK() over (partition by PropertyKey order by BoughtDate desc) as [Rank]
    from dbo.PropertyOwners 
) result
where [Rank]=1
group by OwnerKey
having COUNT(1)>1000

Using left join with same table (Taking around 10sec):

select OwnerKey, COUNT(1) PropertyCount 
from (
    select po.PropertyKey, po.OwnerKey, po.BoughtDate
    from dbo.PropertyOwners po
    left join dbo.PropertyOwners lo on lo.PropertyKey = po.PropertyKey
    and lo.BoughtDate > po.BoughtDate
    where lo.PropertyKey is null
) result
group by OwnerKey
having COUNT(1)>1000

Both of the query times are unacceptable as taking so much time, can anyone help me with the query to rewrite. My table has following index:

CREATE NONCLUSTERED INDEX [IX_PropertyKey_BounghtDate] ON [dbo].[PropertyOwners] 
(
    [PropertyKey] ASC,
    [BoughtDate] DESC
)
INCLUDE ( [OwnerKey]) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

标签： sql sql-server sql-server-2008

3条回答

再贱就再见

2楼-- · 2019-09-03 19:48

Grouping is never fast. You may look into indices that SQL Server will suggest if you run your queries sufficient amount of times; google diagnostic queries that involve sys.dm_db_index_usage_stats, they will be of some help.

Another option, already suggested before, is building a summary table. Slightly more lightweight solution will be an indexed view, but you have to understand implications that will come into play if you will create it.

0人赞添加讨论(0) 举报

\"骚年 ilove

3楼-- · 2019-09-03 19:52

You have a fair amount of data and a lot that has to be counted. This analysis by Aaron Bertrand isn't exactly your problem, but it might help you.

With your supporting index, I would recommend trying the not exists approach:

select OwnerKey, count(*) as PropertyCount
from PropertyOwners po
where not exists (select 1
                  from PropertyOwners po2
                  where po2.PropertyKey = po.PropertyKey and
                        po2.BoughtDate > po.BoughtDate
                 )
group by OwnerKey
having count(*) > 1000;

If you can't get the query to work sufficiently fast, you may need to either upgrade your hardware or use triggers to keep a summary table up-to-date.

0人赞添加讨论(0) 举报

Animai°情兽

4楼-- · 2019-09-03 19:59

You could rewrite this as (which might improve performance)

select OwnerKey, COUNT(1) PropertyCount 
from (
    select PropertyKey, MAX( BoughtDate) BoughtDate
    from dbo.PropertyOwners 
    Group by PropertyKey
) result INNER JOIN dbo.PropertyOwners po ON po.PropertyKey=result.PropertyKey and PO.boughtDate=result.boughtdate
group by OwnerKey
having COUNT(1)>1000

0人赞添加讨论(0) 举报

Sql server which is better “Rank” or “Left join wi

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间