Invalid count and sum in cross tab query using Pos

2019-02-18 15:12发布

问题:

I am using PostgreSQL 9.3 version database.

I have a situation where I want to count the number of products sales and sum the amount of product and also want to show the cities in a column where the product have sale.

Example

Setup

create table products (
 name varchar(20),
 price integer,
 city varchar(20)
);

insert into products values
   ('P1',1200,'London'),
   ('P1',100,'Melborun'),
   ('P1',1400,'Moscow'),
   ('P2',1560,'Munich'),
   ('P2',2300,'Shunghai'),
   ('P2',3000,'Dubai');

Crosstab query:

select * from crosstab (
        'select  name,count(*),sum(price),city,count(city)
         from products
         group by  name,city
         order by  name,city
         '
        ,           
        'select distinct city from products order by 1'
     ) 
     as tb (
         name varchar(20),TotalSales bigint,TotalAmount bigint,London bigint,Melborun bigint,Moscow bigint,Munich bigint,Shunghai bigint,Dubai bigint
     );

Output

name    totalsales  totalamount     london     melborun    moscow      munich    shunghai    dubai  
---------------------------------------------------------------------------------------------------------
 P1         1           1200                       1          1            1 
 P2         1           3000          1                                               1         1

Expected Output:

name    totalsales  totalamount     london     melborun    moscow      munich    shunghai    dubai  
---------------------------------------------------------------------------------------------------------
 P1         3           2700           1          1           1               
 P2         3           6860                                               1          1         1        

回答1:

Your first mistake seems to be simple. According to the 2nd parameter of the crosstab() function, 'Dubai' must come as first city (sorted by city). Details:

  • PostgreSQL Crosstab Query

The unexpected values for totalsales and totalamount represent values from the first row for each name group. "Extra" columns are treated like that. Details:

  • Pivot on Multiple Columns using Tablefunc

To get sums per name, run window functions over your aggregate functions. Details:

  • Get the distinct sum of a joined table column

select * from crosstab (
   'select name
          ,sum(count(*))   OVER (PARTITION BY name)
          ,sum(sum(price)) OVER (PARTITION BY name)
          ,city
          ,count(city)
    from   products
    group  by name,city
    order  by name,city
    '
--  ,'select distinct city from products order by 1' -- replaced
    ,$$SELECT unnest('{Dubai,London,Melborun
                      ,Moscow,Munich,Shunghai}'::varchar[])$$
) AS tb (
    name varchar(20), TotalSales bigint, TotalAmount bigint
   ,Dubai bigint
   ,London bigint
   ,Melborun bigint
   ,Moscow bigint
   ,Munich bigint
   ,Shunghai bigint
   );

Better yet, provide a static set as 2nd parameter. Output columns are hard coded, it may be unreliable to generate data columns dynamically. If you a another row with a new city, this would break.
This way you can also order your columns as you like. Just keep output columns and 2nd parameter in sync.



回答2:

Honestly I think your database needs some drastic normalization and your results in several columns (one for each city name) is not something I would do myself. Nevertheless if you want to stick to it you can do it this way.

For the first step you need get the correct amounts. This would do the trick quite fast:

select name, count(1) totalsales, sum(price) totalAmount 
from products 
group by name;

This will be your result:

NAME    TOTALSALES  TOTALAMOUNT
P2      3           6860
P1      3           2700

You would get the Products/City this way:

select name, city, count(1) totalCityName 
from products 
group by name, city 
order by name, city;

This result:

NAME    CITY        TOTALCITYNAME
P1      London      1
P1      Melborun    1
P1      Moscow      1
P2      Dubai       1
P2      Munich      1
P2      Shunghai    1

If you really would like a column per city you could do something like:

select name,
count(1) totalsales, 
sum(price) totalAmount, 
(select count(1) 
    from Products a 
    where a.City = 'London' and a.name = p.name) London,
...
from products p 
group by name;

But I would not recommend it!!! This would be the result:

NAME    TOTALSALES  TOTALAMOUNT LONDON ...
P1      3           2700        1
P2      3           6860        0

Demonstration here.