PostgreSQL crosstab doesn't work as desired

2019-08-19 07:49发布

In this example, I expect the resulting pivot table to have values for 4 columns, but instead there's only values for 2.

It should've returned something like this:

| time | trace1 | trace2 | trace3 | trace4 |
| -----------------------------------------|
|   t  |   v    |   v    |   v    |   v    |
|   t  |   v    |   v    |   v    |  null  |
|   t  |  null  |   v    |   v    |   v    |
|   t  |   v    |   v    |  null  |   v    |
|   t  |   v    |  null  |   v    |   v    |
|------------------------------------------|

but I got this instead:

| time | trace1 | trace2 | trace3 | trace4 |
| -----------------------------------------|
|   t  |   v    |   v    |  null  |  null  |
|   t  |   v    |   v    |  null  |  null  |
|   t  |   v    |   v    |  null  |  null  |
|   t  |   v    |  null  |  null  |  null  |
|   t  |   v    |  null  |  null  |  null  |
|------------------------------------------|

Even worse, if I remove

order by unixdatetime

, everything will be smashed into only 1 column as below:

| time | trace1 | trace2 | trace3 | trace4 |
| -----------------------------------------|
|   t  |   v    |  null  |  null  |  null  |
|   t  |   v    |  null  |  null  |  null  |
|   t  |   v    |  null  |  null  |  null  |
|   t  |   v    |  null  |  null  |  null  |
|   t  |   v    |  null  |  null  |  null  |
|------------------------------------------|

Here's the code:

select * 
from crosstab(
    $$
    select 
        unixdatetime, 
        gaugesummaryid, 
        value::double precision 
    from 
    (values
        (1546300800,187923,1.5),
        (1546387200,187923,1.5),
        (1546473600,187923,1.5),
        (1546560000,187923,1.75),
        (1546646400,187923,1.75),
        (1546732800,187923,1.75),
        (1546819200,187923,1.75),
        (1546905600,187923,1.5),
        (1546992000,187923,1.5),
        (1547078400,187923,1.5),
        (1547164800,187923,1.5),
        (1547337600,187924,200),
        (1547424000,187924,200),
        (1547510400,187924,200),
        (1547596800,187924,200),
        (1547683200,187924,200),
        (1547769600,187924,200),
        (1547856000,187924,200),
        (1547942400,187924,200),
        (1548028800,187924,200),
        (1548115200,187924,200),
        (1548201600,187924,200),
        (1548288000,187924,200),
        (1546300800,187926,120),
        (1546387200,187926,120),
        (1546473600,187926,120),
        (1546560000,187926,110),
        (1546646400,187926,110),
        (1546732800,187926,110),
        (1546819200,187926,110),
        (1546905600,187926,115),
        (1546992000,187926,115),
        (1547078400,187926,115),
        (1547942400,187927,100),
        (1548028800,187927,100),
        (1548115200,187927,100),
        (1548201600,187927,100),
        (1548288000,187927,100)
    ) as t (unixdatetime, gaugesummaryid, value)
    order by unixdatetime
    $$
    ) as final_result (
        unixdatetime int, 
        trace1 double precision, 
        trace2 double precision, 
        trace3 double precision, 
        trace4 double precision
        );

Here's the link in case you'd like to play around:

https://dbfiddle.uk/?rdbms=postgres_11&fiddle=2c4f6098fb89b78898ba1bf6afa7f439

How to get the desired result?

3条回答
走好不送
2楼-- · 2019-08-19 08:12

I would recommend you to use filter (where ...) clause instead of a pivot table.

select
    unixdatetime,
    min(value) filter (where gaugesummaryid = 187923) as trace_1,
    min(value) filter (where gaugesummaryid = 187924) as trace_2,
    min(value) filter (where gaugesummaryid = 187926) as trace_3,
    min(value) filter (where gaugesummaryid = 187927) as trace_4
from table
group by 1;

Note, that you have to use an aggregate function to be able to use the clause. In your case, it does not matter if you use min, max, avg or sum.

查看更多
啃猪蹄的小仙女
3楼-- · 2019-08-19 08:12

Use the 2-argument form of the crosstab function:

SELECT * 
FROM crosstab(
        $$
        SELECT 
                unixdatetime, 
                gaugesummaryid, 
                value::double precision 
        FROM test
        ORDER BY unixdatetime
        $$
        , 'SELECT DISTINCT gaugesummaryid FROM test ORDER BY 1 LIMIT 4'
        ) as final_result (
                unixdatetime int, 
                trace1 double precision, 
                trace2 double precision, 
                trace3 double precision, 
                trace4 double precision
                )

yields

| unixdatetime | trace1 | trace2 | trace3 | trace4 |
|--------------+--------+--------+--------+--------|
|   1546300800 |    1.5 |        |    120 |        |
|   1546387200 |    1.5 |        |    120 |        |
|   1546473600 |    1.5 |        |    120 |        |
|   1546560000 |   1.75 |        |    110 |        |
|   1546646400 |   1.75 |        |    110 |        |
|   1546732800 |   1.75 |        |    110 |        |
|   1546819200 |   1.75 |        |    110 |        |
|   1546905600 |    1.5 |        |    115 |        |
|   1546992000 |    1.5 |        |    115 |        |
|   1547078400 |    1.5 |        |    115 |        |
|   1547164800 |    1.5 |        |        |        |
|   1547337600 |        |    200 |        |        |
|   1547424000 |        |    200 |        |        |
|   1547510400 |        |    200 |        |        |
|   1547596800 |        |    200 |        |        |
|   1547683200 |        |    200 |        |        |
|   1547769600 |        |    200 |        |        |
|   1547856000 |        |    200 |        |        |
|   1547942400 |        |    200 |        |    100 |
|   1548028800 |        |    200 |        |    100 |
|   1548115200 |        |    200 |        |    100 |
|   1548201600 |        |    200 |        |    100 |
|   1548288000 |        |    200 |        |    100 |

Using this setup:

DROP TABLE IF EXISTS test;
CREATE TABLE test (
        unixdatetime bigint, 
        gaugesummaryid int, 
        value double precision 
);
INSERT INTO test VALUES
        (1546300800,187923,1.5),
        (1546387200,187923,1.5),
        (1546473600,187923,1.5),
        (1546560000,187923,1.75),
        (1546646400,187923,1.75),
        (1546732800,187923,1.75),
        (1546819200,187923,1.75),
        (1546905600,187923,1.5),
        (1546992000,187923,1.5),
        (1547078400,187923,1.5),
        (1547164800,187923,1.5),
        (1547337600,187924,200),
        (1547424000,187924,200),
        (1547510400,187924,200),
        (1547596800,187924,200),
        (1547683200,187924,200),
        (1547769600,187924,200),
        (1547856000,187924,200),
        (1547942400,187924,200),
        (1548028800,187924,200),
        (1548115200,187924,200),
        (1548201600,187924,200),
        (1548288000,187924,200),
        (1546300800,187926,120),
        (1546387200,187926,120),
        (1546473600,187926,120),
        (1546560000,187926,110),
        (1546646400,187926,110),
        (1546732800,187926,110),
        (1546819200,187926,110),
        (1546905600,187926,115),
        (1546992000,187926,115),
        (1547078400,187926,115),
        (1547942400,187927,100),
        (1548028800,187927,100),
        (1548115200,187927,100),
        (1548201600,187927,100),
        (1548288000,187927,100);
查看更多
我欲成王,谁敢阻挡
4楼-- · 2019-08-19 08:32

While some of the target values may be missing , you need the 2-argument form of crosstab() (like unutbu provided).
But it makes no sense to use a query producing unstable results as 2nd parameter. Use a VALUES expression (or similar) to provide a stable set of target columns in sync with the resulting column definition list. Like:

SELECT * 
FROM   crosstab(
   $$
   SELECT *
   FROM  (
      VALUES
      (bigint '1546300800', 187923, float8 '1.5')
    , (1546387200,187923,1.5)
    , (1546473600,187923,1.5)
 -- , ...
    , (1548288000,187927,100)
   ) t (unixdatetime, gaugesummaryid, value)
   ORDER BY 1,2
   $$
 , 'VALUES (187923), (187924), (187926), (187927)'    -- !!
   ) final_result (unixdatetime int
                 , trace1 float8
                 , trace2 float8
                 , trace3 float8
                 , trace4 float8);

db<>fiddle here

Detailed explanation:

It would be nice to get results for a dynamic number of target columns from a single query. Alas, SQL does not work like that. There are various workarounds. See:

查看更多
登录 后发表回答