SQL query to compare product sales by month

Posted 2020-07-18 04:47

Question:

I have a Monthly Status database view I need to build a report based on. The data in the view looks something like this:

Category | Revenue  |  Year  |  Month
Bikes      10 000      2008        1
Bikes      12 000      2008        2
Bikes      12 000      2008        3
Bikes      15 000      2008        1
Bikes      11 000      2007        2
Bikes      11 500      2007        3
Bikes      15 400      2007        4


... And so forth

The view has a product category, a revenue, a year and a month. I want to create a report comparing 2007 and 2008, showing 0 for the months with no sales. So the report should look something like this:

Category  |  Month  |  Rev. This Year  |  Rev. Last Year
Bikes          1          10 000               0
Bikes          2          12 000               11 000
Bikes          3          12 000               11 500
Bikes          4          0                    15 400


The key thing to notice is that month 1 only has sales in 2008, and is therefore 0 for 2007. Likewise, month 4 has no sales in 2008, hence the 0, but it has sales in 2007 and so still shows up.

Also, the report is actually for a financial year, so I would love to have rows with 0 in both columns if there were no sales in, say, month 5 for either 2007 or 2008.

The query I got looks something like this:

SELECT 
    SP1.Program,
    SP1.Year,
    SP1.Month,
    SP1.TotalRevenue,
    IsNull(SP2.TotalRevenue, 0) AS LastYearTotalRevenue

FROM PVMonthlyStatusReport AS SP1 
     LEFT OUTER JOIN PVMonthlyStatusReport AS SP2 ON 
                SP1.Program = SP2.Program AND 
                SP2.Year = SP1.Year - 1 AND 
                SP1.Month = SP2.Month
WHERE 
    SP1.Program = 'Bikes' AND
    SP1.Category = @Category AND 
    (SP1.Year >= @FinancialYear AND SP1.Year <= @FinancialYear + 1) AND
    ((SP1.Year = @FinancialYear AND SP1.Month > 6) OR 
     (SP1.Year = @FinancialYear + 1 AND SP1.Month <= 6))

ORDER BY SP1.Year, SP1.Month

The problem with this query is that it would not return the fourth row in my example data above, since we didn't have any sales in month 4 of 2008, even though we did in 2007.

This is probably a common query/problem, but my SQL is rusty after doing front-end development for so long. Any help is greatly appreciated!

Oh, btw, I'm using SQL Server 2005 for this query, so if it has any new features that might help me, let me know.

Answer 1:

The CASE expression is my best SQL friend. You also need a time table to generate the rows with 0 revenue for months that have no sales in either year.

Assumptions are based on the availability of following tables:

sales: Category | Revenue | Year | Month

and

tm: Year | Month (populated with all dates required for reporting)

Example 1 without empty rows:

select
    Category
    ,month
    ,SUM(CASE WHEN YEAR = 2008 THEN Revenue ELSE 0 END) this_year
    ,SUM(CASE WHEN YEAR = 2007 THEN Revenue ELSE 0 END) last_year

from
    sales

where
    year in (2008,2007)

group by
    Category
    ,month

RETURNS:

Category  |  Month  |  Rev. This Year  |  Rev. Last Year
Bikes          1          10 000               0
Bikes          2          12 000               11 000
Bikes          3          12 000               11 500
Bikes          4          0                    15 400
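A minimal, runnable sketch of Example 1, assuming SQLite (through Python's sqlite3 module) as a stand-in for SQL Server. It loads the question's sample rows, leaving out the second month-1 row for 2008 so the totals line up with the report above. The table and column names simply follow the assumptions listed earlier in this answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (Category TEXT, Revenue INTEGER, Year INTEGER, Month INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Bikes", 10000, 2008, 1),
        ("Bikes", 12000, 2008, 2),
        ("Bikes", 12000, 2008, 3),
        ("Bikes", 11000, 2007, 2),
        ("Bikes", 11500, 2007, 3),
        ("Bikes", 15400, 2007, 4),
    ],
)

# One CASE per year turns rows into columns; ELSE 0 supplies the zero
# when a month has sales in only one of the two years.
query = """
SELECT Category,
       Month,
       SUM(CASE WHEN Year = 2008 THEN Revenue ELSE 0 END) AS this_year,
       SUM(CASE WHEN Year = 2007 THEN Revenue ELSE 0 END) AS last_year
FROM sales
WHERE Year IN (2007, 2008)
GROUP BY Category, Month
ORDER BY Category, Month
"""
report = conn.execute(query).fetchall()
for row in report:
    print(row)
```

Months 1 and 4 each appear once, with a 0 in the year that had no sales, because the grouping is by month only and the year is handled inside the CASE.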

Example 2 with empty rows: I am going to use a subquery (others may prefer a different approach) that returns a row for every category, year and month combination, so months without sales still appear.

select
    fill.Category
    ,fill.month
    ,SUM(CASE WHEN fill.year = 2008 THEN ISNULL(sales.Revenue, 0) ELSE 0 END) this_year
    ,SUM(CASE WHEN fill.year = 2007 THEN ISNULL(sales.Revenue, 0) ELSE 0 END) last_year

from
    sales
    right join (select distinct  --one row per category, year and month
                   c.Category
                   ,tm.year
                   ,tm.month
               from
                  (select distinct Category from sales) c --this ideally would be from a products table
                  cross join tm
               where
                    tm.year in (2008,2007)) fill
        --match each sales row to its slot; slots without sales keep NULL revenue
        on  sales.Category = fill.Category
        and sales.year = fill.year
        and sales.month = fill.month

group by
    fill.Category
    ,fill.month

RETURNS:

Category  |  Month  |  Rev. This Year  |  Rev. Last Year
Bikes          1          10 000               0
Bikes          2          12 000               11 000
Bikes          3          12 000               11 500
Bikes          4          0                    15 400
Bikes          5          0                    0
Bikes          6          0                    0
Bikes          7          0                    0
Bikes          8          0                    0
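A runnable sketch of the fill-table idea, again assuming SQLite via Python's sqlite3 as a stand-in for SQL Server. The tm table here is a hypothetical calendar table filled with all twelve months of 2007 and 2008, and the query is written as a LEFT JOIN from the fill set, which is the mirror image of the right join above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (Category TEXT, Revenue INTEGER, Year INTEGER, Month INTEGER)"
)
conn.execute("CREATE TABLE tm (Year INTEGER, Month INTEGER)")  # hypothetical calendar table
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Bikes", 10000, 2008, 1),
        ("Bikes", 12000, 2008, 2),
        ("Bikes", 12000, 2008, 3),
        ("Bikes", 11000, 2007, 2),
        ("Bikes", 11500, 2007, 3),
        ("Bikes", 15400, 2007, 4),
    ],
)
conn.executemany(
    "INSERT INTO tm VALUES (?, ?)",
    [(year, month) for year in (2007, 2008) for month in range(1, 13)],
)

# fill = every category crossed with every (year, month) slot;
# the LEFT JOIN keeps slots that have no matching sales row.
query = """
SELECT fill.Category,
       fill.Month,
       SUM(CASE WHEN fill.Year = 2008 THEN COALESCE(s.Revenue, 0) ELSE 0 END) AS this_year,
       SUM(CASE WHEN fill.Year = 2007 THEN COALESCE(s.Revenue, 0) ELSE 0 END) AS last_year
FROM (SELECT c.Category AS Category, tm.Year AS Year, tm.Month AS Month
      FROM (SELECT DISTINCT Category FROM sales) AS c
      CROSS JOIN tm) AS fill
LEFT JOIN sales AS s
       ON s.Category = fill.Category
      AND s.Year = fill.Year
      AND s.Month = fill.Month
GROUP BY fill.Category, fill.Month
ORDER BY fill.Category, fill.Month
"""
report = conn.execute(query).fetchall()
for row in report:
    print(row)
```

With a full calendar table the result has one row per month, twelve in total, and the months without sales in either year come back as 0 and 0.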

Note that most reporting tools will do this crosstab or matrix functionality, and now that I think of it SQL Server 2005 has pivot syntax that will do this as well.

Here are some additional resources:

CASE: http://www.4guysfromrolla.com/webtech/102704-1.shtml
SQL Server 2005 PIVOT: http://msdn.microsoft.com/en-us/library/ms177410.aspx



Answer 2:

@Christian -- markdown editor -- UGH; especially when the preview and the final version of your post disagree...

@Christian -- full outer join -- the full outer join is overruled by the fact that there are references to SP1 in the WHERE clause, and the WHERE clause is applied after the JOIN. To do a full outer join with filtering on one of the tables, you need to put your WHERE clause into a subquery, so the filtering happens before the join, or try to build all of your WHERE criteria onto the JOIN ON clause, which is insanely ugly. Well, there's actually no pretty way to do this one.
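The WHERE-after-JOIN effect can be seen in a small sketch. It assumes SQLite via Python's sqlite3 as a stand-in for SQL Server and uses a LEFT JOIN rather than a full outer join, since the filtering behaviour is the same: a WHERE test on the outer-joined side discards the NULL-extended rows. The sales table and its rows are illustrative, taken from the question's sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (Category TEXT, Revenue INTEGER, Year INTEGER, Month INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Bikes", 10000, 2008, 1),
        ("Bikes", 12000, 2008, 2),
        ("Bikes", 12000, 2008, 3),
        ("Bikes", 11000, 2007, 2),
        ("Bikes", 11500, 2007, 3),
        ("Bikes", 15400, 2007, 4),
    ],
)

# Filter on the outer-joined table in WHERE: the NULL-extended row for
# month 4 fails "t.Year = 2008" and disappears.
filter_in_where = conn.execute("""
    SELECT p.Month, p.Revenue, t.Revenue
    FROM sales AS p
    LEFT JOIN sales AS t
           ON t.Category = p.Category AND t.Month = p.Month
    WHERE p.Year = 2007 AND t.Year = 2008
    ORDER BY p.Month
""").fetchall()

# Same filter moved into the ON clause: it is applied while joining,
# so month 4 survives with NULL for the 2008 revenue.
filter_in_on = conn.execute("""
    SELECT p.Month, p.Revenue, t.Revenue
    FROM sales AS p
    LEFT JOIN sales AS t
           ON t.Category = p.Category AND t.Month = p.Month AND t.Year = 2008
    WHERE p.Year = 2007
    ORDER BY p.Month
""").fetchall()

print(filter_in_where)
print(filter_in_on)
```

The first query returns only months 2 and 3; the second also returns month 4 with a NULL (None in Python) for 2008.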

@Jonas: Considering this:

Also, the report is actually for financial year - so I would love to have empty columns with 0 in both if there was no sales in say month 5 for either 2007 or 2008.

and the fact that this job can't be done with a pretty query, I would definitely try to get the results you actually want. No point in having an ugly query and not even getting the exact data you actually want. ;)

So, I'd suggest doing this in 5 steps:
1. create a temp table in the format you want your results to match
2. populate it with twelve rows, with 1-12 in the month column
3. update the "This Year" column using your SP1 logic
4. update the "Last Year" column using your SP2 logic
5. select from the temp table

Of course, I guess I'm working from the assumption that you can create a stored procedure to accomplish this. You might technically be able to run this whole batch inline, but that kind of ugliness is very rarely seen. If you can't create a stored procedure, I suggest you fall back on the full outer join via subquery, but it won't get you a row when a month had no sales either year.
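The five steps above can be sketched roughly as follows, assuming SQLite via Python's sqlite3 in place of a SQL Server stored procedure. The report table, its column names and the hard-coded 'Bikes' / 2008 / 2007 values are illustrative assumptions, not part of the original view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (Category TEXT, Revenue INTEGER, Year INTEGER, Month INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Bikes", 10000, 2008, 1),
        ("Bikes", 12000, 2008, 2),
        ("Bikes", 12000, 2008, 3),
        ("Bikes", 11000, 2007, 2),
        ("Bikes", 11500, 2007, 3),
        ("Bikes", 15400, 2007, 4),
    ],
)

# Step 1: temp table shaped like the final report.
conn.execute(
    "CREATE TEMP TABLE report ("
    "Month INTEGER PRIMARY KEY, ThisYear INTEGER DEFAULT 0, LastYear INTEGER DEFAULT 0)"
)
# Step 2: twelve rows, months 1-12, revenue defaulting to 0.
conn.executemany("INSERT INTO report (Month) VALUES (?)", [(m,) for m in range(1, 13)])

# Steps 3 and 4: fill each column from the sales of one year.
conn.execute(
    "UPDATE report SET ThisYear = COALESCE("
    "(SELECT SUM(s.Revenue) FROM sales AS s "
    " WHERE s.Category = ? AND s.Year = ? AND s.Month = report.Month), 0)",
    ("Bikes", 2008),
)
conn.execute(
    "UPDATE report SET LastYear = COALESCE("
    "(SELECT SUM(s.Revenue) FROM sales AS s "
    " WHERE s.Category = ? AND s.Year = ? AND s.Month = report.Month), 0)",
    ("Bikes", 2007),
)

# Step 5: read the finished report.
result = conn.execute(
    "SELECT Month, ThisYear, LastYear FROM report ORDER BY Month"
).fetchall()
for row in result:
    print(row)
```

Because the twelve rows exist before any revenue is written, months with no sales in either year stay in the output as 0 and 0.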



Answer 3:

About the markdown - Yeah that is frustrating. The editor did preview my HTML table, but after posting it was gone - So had to remove all HTML formatting from the post...

@kcrumley I think we've reached similar conclusions. This query easily gets really ugly. I actually solved this before reading your answer, using a similar (yet different) approach. I have access to create stored procedures and functions on the reporting database. I created a table-valued function accepting a product category and a financial year as parameters. Based on those, the function populates a table containing 12 rows. The rows are populated with data from the view if any sales are available; if not, the row has 0 values.

I then join the two tables returned by the function calls. Since I know both tables will have twelve rows it's a lot easier, and I can join on Product Category and Month:

SELECT 
    SP1.Program,
    SP1.Year,
    SP1.Month,
    SP1.TotalRevenue AS ThisYearRevenue,
    SP2.TotalRevenue AS LastYearRevenue
FROM GetFinancialYear(@Category, 'First Look',  2008) AS SP1 
     RIGHT JOIN GetFinancialYear(@Category, 'First Look',  2007) AS SP2 ON 
         SP1.Program = SP2.Program AND 
         SP1.Month = SP2.Month

I think your approach is probably a little cleaner as the GetFinancialYear function is quite messy! But at least it works - which makes me happy for now ;)
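The body of GetFinancialYear isn't shown, so the following is only a guess at its shape, written in plain Python rather than T-SQL. The helper get_financial_year is hypothetical, takes two parameters instead of the three in the call above, and assumes the July-to-June financial year implied by the WHERE clause in the question.

```python
# Sample rows as (category, revenue, year, month), taken from the question.
sales = [
    ("Bikes", 10000, 2008, 1),
    ("Bikes", 12000, 2008, 2),
    ("Bikes", 12000, 2008, 3),
    ("Bikes", 11000, 2007, 2),
    ("Bikes", 11500, 2007, 3),
    ("Bikes", 15400, 2007, 4),
]


def get_financial_year(rows, category, financial_year):
    """Hypothetical stand-in for the table-valued function.

    Always returns 12 (year, month, revenue) rows, July of
    financial_year through June of financial_year + 1, with 0
    revenue for months that have no sales.
    """
    periods = [(financial_year, m) for m in range(7, 13)]
    periods += [(financial_year + 1, m) for m in range(1, 7)]
    totals = {}
    for cat, revenue, year, month in rows:
        if cat == category:
            totals[(year, month)] = totals.get((year, month), 0) + revenue
    return [(year, month, totals.get((year, month), 0)) for year, month in periods]


this_fy = get_financial_year(sales, "Bikes", 2007)  # Jul 2007 - Jun 2008
last_fy = get_financial_year(sales, "Bikes", 2006)  # Jul 2006 - Jun 2007

# Both lists have the same twelve months in the same order, so they
# can be paired up directly instead of outer joined.
comparison = [
    (month, current, previous)
    for (_, month, current), (_, _, previous) in zip(this_fy, last_fy)
]
for row in comparison:
    print(row)
```

Since each call is guaranteed to return the same twelve months, the comparison no longer needs an outer join; any join on month finds a partner for every row.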



Answer 4:

I could be wrong but shouldn't you be using a full outer join instead of just a left join? That way you will be getting 'empty' columns from both tables.

http://en.wikipedia.org/wiki/Join_(SQL)#Full_outer_join



Answer 5:

The trick is to do a FULL JOIN, with ISNULL's to get the joined columns from either table. I usually wrap this into a view or derived table, otherwise you need to use ISNULL in the WHERE clause as well.

SELECT 
    Program,
    Month,
    ThisYearTotalRevenue,
    PriorYearTotalRevenue
FROM (
    SELECT 
        ISNULL(ThisYear.Program, PriorYear.Program) as Program,
        ISNULL(ThisYear.Month, PriorYear.Month) as Month, -- derived-table columns need a name
        ISNULL(ThisYear.TotalRevenue, 0) as ThisYearTotalRevenue,
        ISNULL(PriorYear.TotalRevenue, 0) as PriorYearTotalRevenue
    FROM (
        SELECT Program, Month, SUM(TotalRevenue) as TotalRevenue 
        FROM PVMonthlyStatusReport 
        WHERE Year = @FinancialYear 
        GROUP BY Program, Month
    ) as ThisYear 
    FULL OUTER JOIN (
        SELECT Program, Month, SUM(TotalRevenue) as TotalRevenue 
        FROM PVMonthlyStatusReport 
        WHERE Year = (@FinancialYear - 1) 
        GROUP BY Program, Month
    ) as PriorYear ON
        ThisYear.Program = PriorYear.Program
        AND ThisYear.Month = PriorYear.Month
) as Revenue
WHERE 
    Program = 'Bikes'
ORDER BY 
    Month

That should get you your minimum requirements: rows with sales in either 2007 or 2008, or both. To get rows with no sales in either year, you just need to drive the query from a 1-12 numbers table (you do have one of those, don't you?) and LEFT JOIN this result to it.
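A sketch of the numbers-table variant, assuming SQLite via Python's sqlite3 as a stand-in for SQL Server. The numbers table is hypothetical, and because it already lists every month, two LEFT JOINs onto the per-year totals are used here in place of the FULL OUTER JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (Category TEXT, Revenue INTEGER, Year INTEGER, Month INTEGER)"
)
conn.execute("CREATE TABLE numbers (Month INTEGER PRIMARY KEY)")  # hypothetical 1-12 table
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Bikes", 10000, 2008, 1),
        ("Bikes", 12000, 2008, 2),
        ("Bikes", 12000, 2008, 3),
        ("Bikes", 11000, 2007, 2),
        ("Bikes", 11500, 2007, 3),
        ("Bikes", 15400, 2007, 4),
    ],
)
conn.executemany("INSERT INTO numbers VALUES (?)", [(m,) for m in range(1, 13)])

# The numbers table supplies all twelve months; each year's totals are
# attached with a LEFT JOIN and missing months become 0 via COALESCE.
query = """
SELECT n.Month,
       COALESCE(ThisYear.TotalRevenue, 0)  AS ThisYearTotalRevenue,
       COALESCE(PriorYear.TotalRevenue, 0) AS PriorYearTotalRevenue
FROM numbers AS n
LEFT JOIN (SELECT Month, SUM(Revenue) AS TotalRevenue
           FROM sales
           WHERE Category = :category AND Year = :year
           GROUP BY Month) AS ThisYear ON ThisYear.Month = n.Month
LEFT JOIN (SELECT Month, SUM(Revenue) AS TotalRevenue
           FROM sales
           WHERE Category = :category AND Year = :year - 1
           GROUP BY Month) AS PriorYear ON PriorYear.Month = n.Month
ORDER BY n.Month
"""
report = conn.execute(query, {"category": "Bikes", "year": 2008}).fetchall()
for row in report:
    print(row)
```

Filtering each year inside its own subquery keeps the filters from interfering with the outer joins, which is the same reason the answer aggregates inside derived tables.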



Answer 6:

Using PIVOT and dynamic SQL we can achieve this result:

SET NOCOUNT ON
IF OBJECT_ID('TEMPDB..#TEMP') IS NOT NULL
DROP TABLE #TEMP

;With cte(Category , Revenue  ,  Yearh  ,  [Month])
AS
(
SELECT 'Bikes', 10000, 2008,1 UNION ALL
SELECT 'Bikes', 12000, 2008,2 UNION ALL
SELECT 'Bikes', 12000, 2008,3 UNION ALL
SELECT 'Bikes', 15000, 2008,1 UNION ALL
SELECT 'Bikes', 11000, 2007,2 UNION ALL
SELECT 'Bikes', 11500, 2007,3 UNION ALL
SELECT 'Bikes', 15400, 2007,4
)
SELECT * INTO #Temp FROM cte

Declare @Column nvarchar(max),
        @Column2 nvarchar(max),
        @Sql nvarchar(max)


SELECT @Column=STUFF((SELECT DISTINCT ','+ 'ISNULL('+QUOTENAME(CAST(Yearh AS VArchar(10)))+','+'''0'''+')'+ 'AS '+ QUOTENAME(CAST(Yearh AS VArchar(10)))
FROM #Temp order by 1 desc FOR XML PATH ('')),1,1,'')

SELECT @Column2=STUFF((SELECT DISTINCT ','+ QUOTENAME(CAST(Yearh AS VArchar(10)))
FROM #Temp FOR XML PATH ('')),1,1,'')

SET @Sql= N'SELECT Category,[Month],'+ @Column +' FROM #Temp
            PIVOT
            (MIN(Revenue) FOR yearh IN ('+@Column2+')
            ) AS Pvt

            '
EXEC(@Sql)
Print @Sql

Result

Category    Month   2008    2007
----------------------------------
Bikes       1       10000   0
Bikes       2       12000   11000
Bikes       3       12000   11500
Bikes       4       0       15400