SQL fill null values for a field in multiple rows

2019-05-26 17:35发布

问题:

I have a table in netezza (based on postgresql) like below. I need to create a new table with the NULL values for name to be replaced with value of name for the previous non-null row.

table1

id name time    value
---------------------
1 john  11:00   324
2 NULL  12:00   645
3 NULL  13:00   324
4 bane  11:00   132
5 NULL  12:00   30
6 NULL  13:00   NULL
7 NULL  14:00   -1
8 zane  11:00   152
9 NULL  12:00   60
10 NULL 13:00   NULL

output table

name    time    value
---------------------
john    11:00   324
john    12:00   645
john    13:00   324
bane    11:00   132
bane    12:00   30
bane    13:00   NULL
bane    14:00   -1
zane    11:00   152
zane    12:00   60
zane    13:00   NULL
notes:
  1. Cannot alter table1 due to permission restrictions, so a new table is the way.

  2. Need to run this in Netezza (preferably) or MS Access.

Code used to create the test data in Netezza is as below.

create temp table test (
    id int
    ,name varchar(10)
    ,time time
    ,value int
)distribute on random;

insert into test (id, name, time, value) values(1, 'joe', '10:00', 324);
insert into test (id, name, time, value) values(2, null, '11:00', 645);
insert into test (id, name, time, value) values(3, null, '12:00', 324);

insert into test (id, name, time, value) values(4, 'bane', '10:00', 132);
insert into test (id, name, time, value) values(5, null, '11:00', 30);
insert into test (id, name, time, value) values(6, null, '12:00', null);
insert into test (id, name, time, value) values(7, null, '13:00', -1);

insert into test (id, name, time, value) values(8, 'zane', '10:00', 152);
insert into test (id, name, time, value) values(9, null, '11:00', 60);
insert into test (id, name, time, value) values(10, null, '12:00', null);

回答1:

Try this recursive PostgreSQL query:

WITH RECURSIVE t(id, name, time, value) AS (
  SELECT id, name, time, value FROM test WHERE id = (
    SELECT MIN(id) FROM test
  )
  UNION
  SELECT test.id, COALESCE(test.name, t.name), test.time, test.value
  FROM test, t WHERE test.id = (
    SELECT id FROM test WHERE id > t.id ORDER BY id LIMIT 1
  )
) SELECT * FROM t ORDER BY id;

Notice though that this will probably issue a SELECT on each row. If you don't want that, you can use a view solution:

CREATE VIEW test_view AS
SELECT id, LAG(id) OVER (ORDER BY id) lag_id, name, time, value FROM test;

WITH RECURSIVE t(id, name, time, value) AS (
  SELECT id, name, time, value FROM test_view WHERE lag_id IS NULL
  UNION ALL
  SELECT test_view.id, COALESCE(test_view.name, t.name),
    test_view.time, test_view.value
  FROM test_view, t WHERE test_view.lag_id = t.id
) SELECT * FROM t ORDER BY id;

This should be much faster. The idea is from this article. SQLFiddle: http://sqlfiddle.com/#!15/63f7b/1/1.



回答2:

This works for me in Access 2010:

SELECT 
    t1.id,
    (
        SELECT TOP 1 t2.name
        FROM test t2
        WHERE t2.id<=t1.id AND t2.name IS NOT NULL
        ORDER BY t2.id DESC
    ) AS name,
    t1.time,
    t1.value
FROM test t1

It should work in other SQL dialects as well, although they may have a slightly different way of doing TOP 1 (e.g., LIMIT 1 is a common variant).



回答3:

You can achieve this with COALESCE function and subqueries:

SELECT t.id, 
    COALESCE(t.name, (SELECT s.name FROM table s WHERE s.name IS NOT NULL AND s.id < t.id ORDER BY s.id LIMIT 1)) AS name,
    t.time,
    COALESCE(t.value, (SELECT s.value FROM table s WHERE s.value IS NOT NULL AND s.id < t.id ORDER BY s.id LIMIT 1)) AS value
FROM table t ORDER BY t.id


回答4:

MS Access based solution

SELECT (SELECT last(name)
          FROM test AS temp
          WHERE test.id >= temp.id AND temp.name IS NOT NULL) AS new_name, *
FROM test;


回答5:

Try using the lead function. Not sure if that works with postgre, for Oracle it does. I think that can help.