I have the following data set:
Item || Date || Client ID || Date difference
A || 12/12/2014 || 102 ||
A || 13/12/2014 || 102 || 1
B || 12/12/2014 || 141 ||
B || 17/12/2014 || 141 || 5
I would like to calculate the difference in days between the two dates when the client ID is the same. What expression can I use in a calculated column to get that value?
UPDATE
Hi
These are the values I intend to calculate. My table has approximately 300,000 records in no particular order. Would I have to sort the physical table before using this formula? I took the example above from another post I found; my actual file has no item column, only the client ID and the date of the transaction. Thanks again for the help!
ClientId Date Days
102 2014.12.12 0
102 2014.12.13 1
141 2014.12.12 0
141 2014.12.17 5
123 2014.12.01 0
123 2014.12.02 1
123 2014.12.04 2
EDIT 2015.07.15
got it, so you want the difference from the previous date for the same client. this expression will give you the table you've listed above (spacing added for readability):
DateDiff('day',
    First([Date]) OVER (Intersect([ClientId], Previous([Date]))),
    [Date]
)
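To make the intent of that expression concrete, here is a minimal Python sketch of the same idea, not Spotfire code. The function name `days_since_previous` and the sample rows are made up for illustration, and it assumes one row per client and date. It groups by client and orders the dates inside the function, so the input rows can arrive in any order. The first date for each client is given 0 here to match the table in the question.

```python
from collections import defaultdict
from datetime import date


def days_since_previous(rows):
    """rows: list of (client_id, date). Returns day gaps in the input order."""
    by_client = defaultdict(list)
    for idx, (client_id, d) in enumerate(rows):
        by_client[client_id].append((d, idx))

    result = [0] * len(rows)
    for entries in by_client.values():
        entries.sort()  # order each client's dates; input order does not matter
        prev = None
        for d, idx in entries:
            # first date per client gets 0, later ones the gap to the previous date
            result[idx] = 0 if prev is None else (d - prev).days
            prev = d
    return result


# deliberately unsorted sample rows (hypothetical, based on the question's table)
rows = [
    (123, date(2014, 12, 4)),
    (102, date(2014, 12, 12)),
    (141, date(2014, 12, 17)),
    (123, date(2014, 12, 1)),
    (102, date(2014, 12, 13)),
    (141, date(2014, 12, 12)),
    (123, date(2014, 12, 2)),
]
print(days_since_previous(rows))  # [2, 0, 5, 0, 1, 0, 1]
```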
EDIT 2015.07.13
if you want to reduce this so that you can accurately aggregate [Days], you can surround the Min/Max DateDiff() expression with an If(). I'll add some spacing to make this more readable:
If(
    [Date] = Min([Date]) OVER Intersect([ClientId], [Item]),
    DateDiff('day',
        Min([Date]) OVER Intersect([ClientId], [Item]),
        Max([Date]) OVER Intersect([ClientId], [Item])
    ),
    0
)
in English: "If the value of the [Date] column in this row matches the earliest date for this [Item] and [ClientId] combination, then put the number of days difference between the first and last [Date] for this [Item] and [ClientId] combination; otherwise, put zero."
it results in something like:
Item ClientId Date Days
A 102 2014.12.12 1
A 102 2014.12.13 0
B 141 2014.12.12 5
B 141 2014.12.17 0
C 123 2014.12.01 2
C 123 2014.12.02 0
C 123 2014.12.03 0
WARNING: filters may break this calculation. for example, with the above table, if you filter OUT the first row (A, 2014.12.12), Sum([Days]) will be 7 instead of 8, because the row carrying item A's value has been filtered out.
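The arithmetic behind that warning can be checked with a small Python sketch. This is an illustration of the logic only, not Spotfire code; `span_on_first_row` and the variable names are hypothetical. It puts each group's first-to-last span on the earliest row and zero elsewhere, then sums with and without the first row.

```python
from datetime import date


def span_on_first_row(rows):
    """rows: list of (item, client_id, date).

    Returns the first-to-last span in days on the earliest row of each
    (item, client_id) group, and 0 on every other row.
    """
    bounds = {}
    for item, client, d in rows:
        lo, hi = bounds.get((item, client), (d, d))
        bounds[(item, client)] = (min(lo, d), max(hi, d))

    out = []
    for item, client, d in rows:
        lo, hi = bounds[(item, client)]
        out.append((hi - lo).days if d == lo else 0)
    return out


table = [
    ("A", 102, date(2014, 12, 12)),
    ("A", 102, date(2014, 12, 13)),
    ("B", 141, date(2014, 12, 12)),
    ("B", 141, date(2014, 12, 17)),
    ("C", 123, date(2014, 12, 1)),
    ("C", 123, date(2014, 12, 2)),
    ("C", 123, date(2014, 12, 3)),
]
days = span_on_first_row(table)
total_all = sum(days)              # every row visible
total_filtered = sum(days[1:])     # first row (A, 2014.12.12) hidden
print(days, total_all, total_filtered)  # [1, 0, 5, 0, 2, 0, 0] 8 7
```

The values are computed over the whole table before any row is hidden, which is why hiding the row that carries a group's span silently drops that span from the total.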
you can use Spotfire's OVER functions to look at data points with common IDs across rows.
it looks like you've only got two rows per Client ID and Item ID, which helps us out! use the following formula:
DateDiff('day', Min([Date]) OVER Intersect([ClientId], [Item]), Max([Date]) OVER Intersect([ClientId], [Item]))
this will give you a column with the number of days difference between the two dates in each row:
Item ClientId Date Days
A 102 2014.12.12 1
A 102 2014.12.13 1
B 141 2014.12.12 5
B 141 2014.12.17 5
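For comparison, a rough Python sketch of what this Min/Max expression computes (an illustration only, with a made-up function name `group_span_days`): every row in an (item, client) group receives the same first-to-last span, so summing the column counts each span once per row.

```python
from datetime import date


def group_span_days(rows):
    """rows: list of (item, client_id, date). Same span on every row of a group."""
    first, last = {}, {}
    for item, client, d in rows:
        key = (item, client)
        first[key] = min(first.get(key, d), d)
        last[key] = max(last.get(key, d), d)
    return [(last[(i, c)] - first[(i, c)]).days for i, c, _ in rows]


pairs = [
    ("A", 102, date(2014, 12, 12)),
    ("A", 102, date(2014, 12, 13)),
    ("B", 141, date(2014, 12, 12)),
    ("B", 141, date(2014, 12, 17)),
]
spans = group_span_days(pairs)
print(spans, sum(spans))  # [1, 1, 5, 5] 12
```

The sum of 12 rather than 6 is the double counting that motivates keeping the value on a single row per group when the column needs to be aggregated.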