I have some data which has ValidFrom
and ValidTo
dates associated with it. In simple terms:
MembershipId | ValidFromDate | ValidToDate
==========================================
0001 | 1997-01-01 | 2006-05-09
0002 | 1997-01-01 | 2017-05-12
0003 | 2005-06-02 | 2009-02-07
There is a non-clustered index on this table which includes the two dates as key values.
I also have a Date dimension table which covers every date from 1900
to 2999
.
I'm trying to figure out how I can select a range of dates from the Date dimension table (let's say 2016-01-01
to 2016-12-31
) and then identify, for each date, how many memberships were valid on that date.
The code below does the job but the performance isn't great and I was wondering whether anyone has any recommendations for a better way to go about this?
SELECT
d.DateKey
,(SELECT COUNT(*) FROM Memberships AS m
WHERE d.DateKey between m.ValidFromDateKey and m.ValidToDateKey
) AS MembershipCount
FROM
DIM.[Date] AS d
WHERE
d.CalendarYear = 2016
Thanks in advance for any suggestions!
The logic in your SQL is mostly correct, you have just implemented it poorly for how SQL likes to do things. Starting with your
Dates
table as you have done already, rather than doing a sub-select for each row of data, change your logic to ajoin
and you are there:What you may want to be careful of is identifying which memberships are to be counted on each day. For example, if your date is
2006-05-09
should MembershipID0001
be included as it ends that day?The question is essentially, are you counting the number of Memberships that were active at any point during the entire day, or just those that were active at a particular time, say the start or the end of the day?
Then repeat this thought process for your
ValidFromDate
values.