T-SQL - Track occurrences over time

Posted 2019-05-12 03:35

Question:

I have some data which has ValidFrom and ValidTo dates associated with it. In simple terms:

MembershipId | ValidFromDate | ValidToDate
==========================================
0001         | 1997-01-01    | 2006-05-09
0002         | 1997-01-01    | 2017-05-12
0003         | 2005-06-02    | 2009-02-07

There is a non-clustered index on this table which includes the two dates as key values.

I also have a Date dimension table which covers every date from 1900 to 2999.

I'm trying to figure out how I can select a range of dates from the Date dimension table (let's say 2016-01-01 to 2016-12-31) and then identify, for each date, how many memberships were valid on that date.

The code below does the job but the performance isn't great and I was wondering whether anyone has any recommendations for a better way to go about this?

SELECT 
   d.DateKey
  ,(SELECT COUNT(*) FROM Memberships AS m
    WHERE d.DateKey between m.ValidFromDateKey and m.ValidToDateKey
    ) AS MembershipCount

FROM       
   DIM.[Date] AS d

WHERE
   d.CalendarYear = 2016

Thanks in advance for any suggestions!

Answer 1:

The logic in your SQL is mostly correct; it is just implemented in a way that does not suit how SQL likes to do things. Starting from your Date table as you already do, replace the sub-select that runs for each row with a join and you are there:

select d.DateKey
      ,count(m.MembershipID) as MembershipCount
from DIM.[Date] as d
    left join Memberships as m
        on(d.DateKey between m.ValidFromDateKey and m.ValidToDateKey)
where d.CalendarYear = 2016
group by d.DateKey
order by d.DateKey;
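
To try the join in isolation, here is a minimal sketch that loads the three sample memberships from the question into table variables. The names @Memberships and @Dates, the date column type, and the handful of test dates are assumptions made for illustration; the real Memberships and DIM.[Date] tables may be typed differently (integer date keys, for example).

```sql
-- Hypothetical stand-ins for Memberships and DIM.[Date]; column types are assumed.
declare @Memberships table
(
    MembershipID     char(4) not null,
    ValidFromDateKey date    not null,
    ValidToDateKey   date    not null
);

insert into @Memberships (MembershipID, ValidFromDateKey, ValidToDateKey)
values ('0001', '1997-01-01', '2006-05-09')
      ,('0002', '1997-01-01', '2017-05-12')
      ,('0003', '2005-06-02', '2009-02-07');

-- A few hand-picked dates instead of a full calendar year.
declare @Dates table (DateKey date not null primary key);

insert into @Dates (DateKey)
values ('1996-12-31'), ('2005-06-02'), ('2006-05-09'), ('2006-05-10'), ('2016-01-01');

-- Same shape as the query above: one row per date, even when nothing matches.
select d.DateKey
      ,count(m.MembershipID) as MembershipCount   -- counts only matched rows
from @Dates as d
    left join @Memberships as m
        on d.DateKey between m.ValidFromDateKey and m.ValidToDateKey
group by d.DateKey
order by d.DateKey;
```

With this sample data the counts come out as 0, 3, 3, 2 and 1 respectively. The 0 for 1996-12-31 is why the query counts m.MembershipID rather than using count(*): the left join still produces one row for a date with no matching membership, and counting the nullable column ignores it, whereas count(*) would report 1.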

What you may want to be careful of is identifying which memberships are to be counted on each day. For example, if your date is 2006-05-09, should MembershipID 0001 be included, given that it ends that day?

The question is essentially: are you counting the number of memberships that were active at any point during the entire day, or just those that were active at a particular time, say the start or the end of the day?

Then repeat this thought process for your ValidFromDate values.
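
As an illustration of how that decision shows up in the query, the sketch below treats ValidToDateKey as exclusive, so a membership is no longer counted on the day it ends. This is only one possible interpretation, not necessarily the right one for your data, and the table variables and date column types are again assumptions made for the example.

```sql
-- Hypothetical stand-ins for Memberships and DIM.[Date]; column types are assumed.
declare @Memberships table
(
    MembershipID     char(4) not null,
    ValidFromDateKey date    not null,
    ValidToDateKey   date    not null
);

insert into @Memberships (MembershipID, ValidFromDateKey, ValidToDateKey)
values ('0001', '1997-01-01', '2006-05-09')
      ,('0002', '1997-01-01', '2017-05-12')
      ,('0003', '2005-06-02', '2009-02-07');

declare @Dates table (DateKey date not null primary key);

insert into @Dates (DateKey)
values ('2005-06-02'), ('2006-05-09');

select d.DateKey
      ,count(m.MembershipID) as MembershipCount
from @Dates as d
    left join @Memberships as m
        on  d.DateKey >= m.ValidFromDateKey   -- start date still included
        and d.DateKey <  m.ValidToDateKey     -- end date excluded
group by d.DateKey
order by d.DateKey;
```

Here 2005-06-02 still returns 3, because membership 0003 starts that day and the start date is inclusive, while 2006-05-09 returns 2 instead of the 3 that between would give, because membership 0001 ends that day. Changing >= to > would exclude the start date in the same way.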