Good morning,
I am trying to calculate a 12 month moving average cost (MAUC) for each item in a particular warehouse. I am using the 2012_B – paging trick to calculate the median price (http://sqlperformance.com/2012/08/t-sql-queries/median) instead of using AVG in order to remove the potential for outliers to skew the result.
The following code works, however it only calculates the MAUC for one item or all items - depending on whether I remove or retain "AND t_item = 'xxxxx'
WITH Emily AS
(SELECT
t_item AS [Item Code]
,t_mauc_1 AS [MAUC]
FROM twhina113100
WHERE t_cwar = '11'
AND t_item = ' TNC-C2050NP-G'
AND t_trdt > GETDATE()-365)
(SELECT
AVG(1.0 * [Valuation Table].[MAUC])
FROM (
SELECT [MAUC] FROM Emily
ORDER BY [Emily].[MAUC]
OFFSET ((SELECT COUNT(*) FROM Emily) - 1) / 2 ROWS
FETCH NEXT 1 + (1 - (SELECT COUNT(*) FROM Emily) % 2) ROWS ONLY
) AS [Valuation Table] )
I believe that using Over(Partition By) may help me to partition by t_item however I am at a loss as to where to insert it into the code. I am quite new to SQL and my lack of formal training is starting to show.
If you have any other suggestions please share.
Any help would be much appreciated!
This one caught my attention, so I'm posting two options:
The first is a straight cte approach, and the second uses temp tables. The cte approach is fine for smaller data sets, but performance suffers as the series expands.
Both options will calculate the RUNNING Min, Max, Mean, Median, and Mode for a data series
Just a couple of items before we get into it. The normalized structure is ID and Measure.
- The ID could be a date or identity.
- The Measure is any numeric value
- Median is the mid-value of the sorted series. If an even number of observations we return the average of the two middle records
- Mode is represented as ModeR1 and ModeR2. If no repeated values, we show the min/max range
OK, let's take a look at the cte Approach
Declare @Table table (ID Int,Measure decimal(9,2))
Insert into @Table (ID,Measure) values
(1,25),
(2,75),
(3,50),
(4,25),
(5,12),
(6,66),
(7,45)
;with cteBase as (Select *,RowNr = Row_Number() over (Order By ID) From @Table),
cteExpd as (Select A.*,Measure2 = B.Measure,ExtRowNr = Row_Number() over (Partition By A.ID Order By B.Measure) From cteBase A Join cteBase B on (B.RowNr<=A.RowNr)),
cteMean as (Select ID,Mean=Avg(Measure2),Rows=Count(*) From cteExpd Group By ID),
cteMedn as (Select ID,MedRow1=ceiling(Rows/2.0),MedRow2=ceiling((Rows+1)/2.0) From cteMean),
cteMode as (Select ID,Mode=Measure2,ModeHits=count(*),ModeRowNr=Row_Number() over (Partition By ID Order By Count(*) Desc) From cteExpd Group By ID,Measure2)
Select A.ID
,A.Measure
,MinVal = min(Measure2)
,MaxVal = max(Measure2)
,Mean = max(B.Mean)
,Median = isnull(Avg(IIF(ExtRowNr between MedRow1 and MedRow2,Measure2,null)),A.Measure)
,ModeR1 = isnull(max(IIf(ModeHits>1,D.Mode,null)),min(Measure2))
,ModeR2 = isnull(max(IIf(ModeHits>1,D.Mode,null)),max(Measure2))
From cteExpd A
Join cteMean B on (A.ID=B.ID)
Join cteMedn C on (A.ID=C.ID)
Join cteMode D on (A.ID=D.ID and ModeRowNr=1)
Group By A.ID
,A.Measure
Order By A.ID
Returns
ID Measure MinVal MaxVal Mean Median ModeR1 ModeR2
1 25.00 25.00 25.00 25.000000 25.000000 25.00 25.00
2 75.00 25.00 75.00 50.000000 50.000000 25.00 75.00
3 50.00 25.00 75.00 50.000000 50.000000 25.00 75.00
4 25.00 25.00 75.00 43.750000 37.500000 25.00 25.00
5 12.00 12.00 75.00 37.400000 25.000000 25.00 25.00
6 66.00 12.00 75.00 42.166666 37.500000 25.00 25.00
7 45.00 12.00 75.00 42.571428 45.000000 25.00 25.00
This cte approach is very light and fast for smaller data series
Now the Temp Table Approach
-- Generate Base Data -- Key ID and Key Measure
Select ID =TR_Date
,Measure=TR_Y10,RowNr = Row_Number() over (Order By TR_Date)
Into #Base
From [Chinrus-Series].[dbo].[DS_Treasury_Rates]
Where Year(TR_Date)>=2013
-- Extend Base Data one-to-many
Select A.*,Measure2 = B.Measure,ExtRowNr = Row_Number() over (Partition By A.ID Order By B.Measure) into #Expd From #Base A Join #Base B on (B.RowNr<=A.RowNr)
Create Index idx on #Expd (ID)
-- Generate Mean for Series
Select ID,Mean=Avg(Measure2),Rows=Count(*) into #Mean From #Expd Group By ID
Create Index idx on #Mean (ID)
-- Calculate Median Row Number(s) -- If even(avg of middle two rows)
Select ID,MednRow1=ceiling(Rows/2.0),MednRow2=ceiling((Rows+1)/2.0) into #Medn From #Mean
Create Index idx on #Medn (ID)
-- Calculate Mode
Select * into #Mode from (Select ID,Mode=Measure2,ModeHits=count(*),ModeRowNr=Row_Number() over (Partition By ID Order By Count(*) Desc,Measure2 Desc) From #Expd Group By ID,Measure2) A where ModeRowNr=1
Create Index idx on #Mode (ID)
-- Generate Final Results
Select A.ID
,A.Measure
,MinVal = min(Measure2)
,MaxVal = max(Measure2)
,Mean = max(B.Mean)
,Median = isnull(Avg(IIF(ExtRowNr between MednRow1 and MednRow2,Measure2,null)),A.Measure)
,ModeR1 = isnull(max(IIf(ModeHits>1,D.Mode,null)),min(Measure2))
,ModeR2 = isnull(max(IIf(ModeHits>1,D.Mode,null)),max(Measure2))
From #Expd A
Join #Mean B on (A.ID=B.ID)
Join #Medn C on (A.ID=C.ID)
Join #Mode D on (A.ID=D.ID and ModeRowNr=1)
Group By A.ID
,A.Measure
Order By A.ID
Returns
ID Measure MinVal MaxVal Mean Median ModeR1 ModeR2
2013-01-02 1.86 1.86 1.86 1.86 1.86 1.86 1.86
2013-01-03 1.92 1.86 1.92 1.89 1.89 1.86 1.92
2013-01-04 1.93 1.86 1.93 1.9033 1.92 1.86 1.93
2013-01-07 1.92 1.86 1.93 1.9075 1.92 1.92 1.92
2013-01-08 1.89 1.86 1.93 1.904 1.92 1.92 1.92
...
2016-07-20 1.59 1.37 3.04 2.2578 2.24 2.20 2.20
2016-07-21 1.57 1.37 3.04 2.257 2.235 2.61 2.61
2016-07-22 1.57 1.37 3.04 2.2562 2.23 2.20 2.20
Both approaches where validated in Excel
I should add that in the final query, you could certainly add/remove items like STD, Total