Having the following data (blank means NULL):
ID ColA ColB ColC
1 15 20
2 11 4
3 3
How can I get the last not-NULL values of each column in a single query? So the resulting for the given data would be:
ColA ColB ColC
11 3 20
I have not found much, the function that seemed to do something similar to what I describe was COALESCE
, but it does not work as expected in my case.
Looks like you would have to run a separate query per column with plain SQL. For a small table and only 3 columns, @Guffa's query should be fine.
3 window functions
You can do the same in one query with three window functions: Not sure if this is faster than three individual subqueries:
SELECT first_value(cola) OVER (ORDER BY cola IS NULL, id DESC) AS cola
,first_value(colb) OVER (ORDER BY colb IS NULL, id DESC) AS colb
,first_value(colc) OVER (ORDER BY colc IS NULL, id DESC) AS colc
FROM tbl
LIMIT 1;
count()
as window function
You can also exploit the fact that count()
does not count NULL
values.
WITH x AS (
SELECT CASE WHEN count(cola) OVER w = 1 THEN cola ELSE NULL END AS cola
,CASE WHEN count(colb) OVER w = 1 THEN colb ELSE NULL END AS colb
,CASE WHEN count(colc) OVER w = 1 THEN colc ELSE NULL END AS colc
FROM tbl
-- WHERE id > x -- safe to ignore a certain portion from a large table?
WINDOW w AS (ORDER BY id DESC)
)
SELECT max(cola) AS cola, max(colb) AS colb, max(colc) AS colc
FROM x
For bigger tables and more columns, a recursive CTE or a procedural function will be considerably faster:
Recursive CTE
WITH RECURSIVE x AS (
SELECT cola, colb, colc
,row_number() OVER (ORDER BY id DESC) AS rn
FROM tbl
)
, y AS (
SELECT rn, cola, colb, colc
FROM x
WHERE rn = 1
UNION ALL
SELECT x.rn
, COALESCE(y.cola,x.cola)
, COALESCE(y.colb,x.colb)
, COALESCE(y.colc,x.colc)
FROM y
JOIN x ON x.rn = y.rn + 1
WHERE y.cola IS NULL OR y.colb IS NULL OR y.colc IS NULL
)
SELECT cola, colb, colc
FROM y
ORDER BY rn DESC
LIMIT 1;
PL/pgSQL function
My money is on this one for best performance:
CREATE OR REPLACE FUNCTION f_last_nonull(OUT cola int
, OUT colb int
, OUT colc int) AS
$func$
DECLARE
r record;
BEGIN
FOR r IN
SELECT t.cola, t.colb, t.colc
FROM tbl t
ORDER BY t.id DESC
LOOP
IF cola IS NULL AND r.cola IS NOT NULL THEN cola := r.cola; END IF;
IF colb IS NULL AND r.colb IS NOT NULL THEN colb := r.colb; END IF;
IF colc IS NULL AND r.colc IS NOT NULL THEN colc := r.colc; END IF;
EXIT WHEN NOT (cola IS NULL OR colb IS NULL OR colc IS NULL);
END LOOP;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM f_last_nonull();
cola | colb | colc
-----+------+------
11 | 3 | 20
Test with EXPLAIN ANALYZE
. Would be nice if you could come back with a comparison of the solutions.
You can use subqueries:
select
(select ColA from TheTable where ColA is not null order by ID desc limit 1) as ColA,
(select ColB from TheTable where ColB is not null order by ID desc limit 1) as ColB,
(select ColC from TheTable where ColC is not null order by ID desc limit 1) as Col