Select the last value of each column, per user, th

Posted 2019-07-30 01:40

This is an extended version of a related previous question. I am posting it as a new question because Erwin Brandstetter suggested I do so. (After people replied to my first question, I realized that this is what I actually wanted.)

Given the following data (blank means NULL):

ID    User  ColA    ColB    ColC
1     1     15              20
2     1     11      4       
3     1             3
4     2     5       5       10
5     2     6 
6     2             8
7     1             1

How can I get the last non-NULL value of each column for every user, in the simplest way? For the given data, the result would be:

User  ColA    ColB    ColC
1     11      1       20
2     6       8       10

I have not found much. The function that seemed closest to what I describe is COALESCE, but it does not work as expected in my case.

Note: Standard SQL if possible, PostgreSQL otherwise. The count of the involved columns might change, so a solution that is not tied to these three specific columns would be best.

Tags: sql null
2 answers
爱情/是我丢掉的垃圾
#2 · 2019-07-30 02:18

This query is easy to convert to MS SQL. If you need something more specific, add a comment. MySQL query:

SQL Fiddle example

SELECT
t1.User,
(SELECT ColA 
           FROM Table1
           WHERE ColA is not null
           AND Table1.User = t1.User
           ORDER BY ID DESC
           LIMIT 1 ) as ColA,
(SELECT ColB 
           FROM Table1
           WHERE ColB is not null
           AND Table1.User = t1.User
           ORDER BY ID DESC
           LIMIT 1 ) as ColB,
(SELECT ColC 
           FROM Table1
           WHERE ColC is not null
           AND Table1.User = t1.User
           ORDER BY ID DESC
           LIMIT 1 ) as ColC
FROM Table1 t1
GROUP BY t1.User

Result:

| USER | COLA | COLB | COLC |
-----------------------------
|    1 |   11 |    1 |   20 |
|    2 |    6 |    8 |   10 |
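
Since the question asks for something not tied to three specific columns, here is a minimal sketch that generates the same correlated-subquery pattern from a list of column names. It uses Python with an in-memory SQLite database purely as a stand-in so it can be run anywhere; the table name tbl, the column names usr/cola/colb/colc, and the helper build_last_value_query are assumptions for illustration, not part of the answer above.

```python
import sqlite3

# Sample data from the question; None stands for NULL.
# Table and column names (tbl, usr, cola, ...) are illustrative assumptions.
SAMPLE_ROWS = [
    (1, 1, 15, None, 20),
    (2, 1, 11, 4, None),
    (3, 1, None, 3, None),
    (4, 2, 5, 5, 10),
    (5, 2, 6, None, None),
    (6, 2, None, 8, None),
    (7, 1, None, 1, None),
]


def build_last_value_query(columns):
    """Build one correlated subquery per column name (hypothetical helper)."""
    if not columns:
        raise ValueError("need at least one column")
    for col in columns:
        # Identifiers cannot be bound as parameters, so accept plain names only.
        if not col.isidentifier():
            raise ValueError(f"unexpected column name: {col!r}")
    subqueries = ",\n".join(
        f"  (SELECT t2.{col} FROM tbl t2"
        f" WHERE t2.{col} IS NOT NULL AND t2.usr = t1.usr"
        f" ORDER BY t2.id DESC LIMIT 1) AS {col}"
        for col in columns
    )
    return (
        f"SELECT t1.usr,\n{subqueries}\n"
        "FROM tbl t1\nGROUP BY t1.usr\nORDER BY t1.usr"
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (id INTEGER PRIMARY KEY, usr INTEGER,"
    " cola INTEGER, colb INTEGER, colc INTEGER)"
)
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?, ?)", SAMPLE_ROWS)

last_values = conn.execute(
    build_last_value_query(["cola", "colb", "colc"])
).fetchall()
print(last_values)
```

Each extra column costs one more subquery, so this stays simple but does one lookup per column and user; an index on (usr, id) would likely help on larger tables.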
萌系小妹纸
#3 · 2019-07-30 02:31

"Standard" SQL

Similar to what I posted on the previous question, a recursive CTE is elegant and probably the fastest way to do it in standard SQL - especially for many rows per user.

WITH RECURSIVE t AS (
   SELECT row_number() OVER (PARTITION BY usr ORDER  BY id DESC) AS rn
         ,usr, cola, colb, colc
   FROM   tbl
   )

   , x AS (
   SELECT rn, usr, cola, colb, colc
   FROM   t
   WHERE  rn = 1

   UNION ALL
   SELECT t.rn, t.usr
        , COALESCE(x.cola, t.cola)
        , COALESCE(x.colb, t.colb)
        , COALESCE(x.colc, t.colc)
   FROM   x
   JOIN   t USING (usr)
   WHERE  t.rn = x.rn + 1
   AND    (x.cola IS NULL OR x.colb IS NULL OR x.colc IS NULL)
   )
SELECT DISTINCT ON (usr)
       usr, cola, colb, colc
FROM   x
ORDER  BY usr, rn DESC;

-> sqlfiddle for requested PostgreSQL.

The only non-standard element is DISTINCT ON, a PostgreSQL extension of the standard DISTINCT. To get standard SQL, replace the final SELECT with this:

SELECT usr
      ,max(cola) As cola
      ,max(colb) As colb
      ,max(colc) As colc
FROM   x
GROUP  BY usr
ORDER  BY usr;
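
To try the standard variant without a PostgreSQL server, the following is a rough sketch that runs the same recursive CTE with the max() final SELECT against an in-memory SQLite database from Python. SQLite is only a stand-in here (it needs version 3.25 or later for row_number()), the table definition is my assumption based on the sample data, and I wrote the join with ON instead of USING to avoid relying on dialect details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed table layout matching the question's sample data (None = NULL).
conn.execute(
    "CREATE TABLE tbl (id INTEGER PRIMARY KEY, usr INTEGER,"
    " cola INTEGER, colb INTEGER, colc INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 15, None, 20),
        (2, 1, 11, 4, None),
        (3, 1, None, 3, None),
        (4, 2, 5, 5, 10),
        (5, 2, 6, None, None),
        (6, 2, None, 8, None),
        (7, 1, None, 1, None),
    ],
)

QUERY = """
WITH RECURSIVE t AS (
   -- number each user's rows, newest id first
   SELECT row_number() OVER (PARTITION BY usr ORDER BY id DESC) AS rn,
          usr, cola, colb, colc
   FROM   tbl
), x AS (
   -- start from the newest row per user
   SELECT rn, usr, cola, colb, colc
   FROM   t
   WHERE  rn = 1

   UNION ALL
   -- step to the next older row, keeping values already found
   SELECT t.rn, t.usr,
          COALESCE(x.cola, t.cola),
          COALESCE(x.colb, t.colb),
          COALESCE(x.colc, t.colc)
   FROM   x
   JOIN   t ON t.usr = x.usr AND t.rn = x.rn + 1
   -- stop as soon as every column has a value
   WHERE  x.cola IS NULL OR x.colb IS NULL OR x.colc IS NULL
)
SELECT usr, max(cola) AS cola, max(colb) AS colb, max(colc) AS colc
FROM   x
GROUP  BY usr
ORDER  BY usr
"""

cte_result = conn.execute(QUERY).fetchall()
print(cte_result)
```

max() works as the final step because, within x, each column is either NULL or the one value that was filled in first and then carried along, so there is only a single distinct non-NULL value per user and column to pick.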

The request for "standard SQL" is of limited use. The standard only exists on paper. No RDBMS implements 100 % standard SQL - it would be rather pointless, too, since the standard includes nonsensical parts here and there. Arguably, PostgreSQL's implementation is among the closest to the standard.

PL/pgSQL function

This solution is specific to PostgreSQL, but should perform very well.

I am building on the same table as demonstrated in the fiddle above.

CREATE OR REPLACE FUNCTION f_last_nonull_per_user()
RETURNS SETOF tbl AS
$func$
DECLARE
   _row tbl;  -- table name can be used as row type
   _new tbl;
BEGIN

FOR _new IN
   SELECT * FROM tbl ORDER BY usr, id DESC
LOOP
   IF _new.usr = _row.usr THEN 
      _row.id := _new.id;   -- copy only id
      IF _row.cola IS NULL AND _new.cola IS NOT NULL THEN
         _row.cola := _new.cola; END IF;   -- only if no value found yet
      IF _row.colb IS NULL AND _new.colb IS NOT NULL THEN
         _row.colb := _new.colb; END IF;
      IF _row.colc IS NULL AND _new.colc IS NOT NULL THEN
         _row.colc := _new.colc; END IF;
   ELSE
      IF _new.usr <> _row.usr THEN  -- doesn't fire on first row
         RETURN NEXT _row;
      END IF;   
      _row := _new;  -- remember row for next iteration
   END IF;
END LOOP;

RETURN NEXT _row;  -- return row for last usr

END
$func$ LANGUAGE plpgsql;

Call:

SELECT * FROM f_last_nonull_per_user();

Returns the whole row, including id. Since id is copied from every row of the same user, it ends up as the smallest id of that user.
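
For readers who want to trace the loop outside the database, here is a small Python sketch of the same single-pass idea: walk the rows ordered by usr and id descending, and fill a column only while it is still empty. The function name last_non_null_per_user and the dict-based row format are my own assumptions for illustration; it is not a translation of PL/pgSQL semantics in every detail.

```python
def last_non_null_per_user(rows, value_columns=("cola", "colb", "colc")):
    """Single pass over rows sorted by (usr, id DESC); hypothetical helper."""
    ordered = sorted(rows, key=lambda r: (r["usr"], -r["id"]))
    results = []
    current = None  # accumulated row for the user being processed
    for new in ordered:
        if current is not None and new["usr"] == current["usr"]:
            current["id"] = new["id"]  # copy only id
            for col in value_columns:
                # fill a column only if no value has been found yet
                if current[col] is None and new[col] is not None:
                    current[col] = new[col]
        else:
            if current is not None:
                results.append(current)  # previous user is complete
            current = dict(new)  # start accumulating the next user
    if current is not None:  # guard so empty input yields no rows
        results.append(current)
    return results


def make_row(id_, usr, cola, colb, colc):
    return {"id": id_, "usr": usr, "cola": cola, "colb": colb, "colc": colc}


# Sample data from the question; None stands for NULL.
sample = [
    make_row(1, 1, 15, None, 20),
    make_row(2, 1, 11, 4, None),
    make_row(3, 1, None, 3, None),
    make_row(4, 2, 5, 5, 10),
    make_row(5, 2, 6, None, None),
    make_row(6, 2, None, 8, None),
    make_row(7, 1, None, 1, None),
]

folded = last_non_null_per_user(sample)
for row in folded:
    print(row)
```

Unlike the recursive CTE, this pass does not stop early once all columns are filled; it reads every row exactly once.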
