PostgreSQL - dynamic INSERT on column names

2019-08-20 19:37发布

问题:

I'm looking to dynamically insert a set of columns from one table to another in PostgreSQL. What I think I'd like to do is read in a 'checklist' of column headings (those columns which exist in table 1 - the storage table), and if they exist in the export table (table 2) then insert them in all at once from table 1. Table 2 will be variable in its columns though - once imported ill drop it and import new data to be imported with potentially different column structure. So I need to import it based on the column names.

e.g.

Table 1. - The storage table

ID     NAME     YEAR     LITH_AGE    PROV_AGE    SIO2    TIO2    CAO    MGO   COMMENTS
1      John     1998     2000        3000        65      10      5      5     comment1
2      Mark     2005     2444        3444        63      8       2      3     comment2
3      Luke     2001     1000        1500        77      10      2      2     comment3

Table 2. - The export table

ID     NAME     MG#    METHOD    SIO2    TIO2    CAO    MGO
1      Amy      4      Method1   65      10      5      5    
2      Poe      3      Method2   63      8       2      3   
3      Ben      2      Method3   77      10      2      2     

As you can see the export table may include columns which do not exist in the storage table, so these would be ignored.

I want to insert all of these columns at once, as I've found if I do it individually by column it extends the number of rows each time on the insert (maybe someone can solve this issue instead? Currently I've written a function to check if a column name exists in table 2, if it does, insert it, but as said this extends the rows of the table every time and NULL the rest of the columns). The INSERT line from my function:

EXECUTE format('INSERT INTO %s (%s) (SELECT %s::%s FROM %s);',_tbl_import, _col,_col,_type,_tbl_export);

As a type of 'code example' for my question:

EXECUTE FORMAT('INSERT INTO table1 (%s) (SELECT (%s) FROM table2)',columns)

where 'columns' would be some variable denoting the columns that exist in the export table that need to go into the storage table. This will be variable as table 2 will be different every time.

This would ideally update Table 1 as:

ID     NAME     YEAR     LITH_AGE    PROV_AGE    SIO2    TIO2    CAO    MGO   COMMENTS
1      John     1998     2000        3000        65      10      5      5     comment1
2      Mark     2005     2444        3444        63      8       2      3     comment2
3      Luke     2001     1000        1500        77      10      2      2     comment3
4      Amy      NULL     NULL        NULL        65      10      5      5     NULL
5      Poe      NULL     NULL        NULL        63      8       2      3     NULL   
6      Ben      NULL     NULL        NULL        77      10      2      2     NULL  

回答1:

UPDATED answer

As my original answer did not meet requirement came out later but was asked to post an alternative example for information_schema solution so here it is.

I made two versions for solutions:

V1 - is equivalent to already given example using information_schema. But that solution relies on table1 column DEFAULTs. Meaning, if table1 column that does not exist at table2 does not have DEFAULT NULL then it will be filled with whatever the default is.

V2 - is modified to force 'NULL' in case of two table columns mismatch and does not inherit table1 own DEFAULTs

Version1:

CREATE OR REPLACE FUNCTION insert_into_table1_v1()
RETURNS void AS $main$

DECLARE
    columns text;

BEGIN

    SELECT  string_agg(c1.attname, ',')
    INTO    columns
    FROM    pg_attribute c1
    JOIN    pg_attribute c2
    ON      c1.attrelid = 'public.table1'::regclass
    AND     c2.attrelid = 'public.table2'::regclass
    AND     c1.attnum > 0
    AND     c2.attnum > 0
    AND     NOT c1.attisdropped
    AND     NOT c2.attisdropped
    AND     c1.attname = c2.attname
    AND     c1.attname <> 'id';

    --       Following is the actual result of query above, based on given data examples:
    --       -[ RECORD 1 ]----------------------
    --       string_agg | name,si02,ti02,cao,mgo

    EXECUTE format(
        '   INSERT INTO table1 ( %1$s )
            SELECT %1$s
            FROM table2
        ',
        columns
    );

END;
$main$ LANGUAGE plpgsql;

Version2:

CREATE OR REPLACE FUNCTION insert_into_table1_v2()
RETURNS void AS $main$

DECLARE
    t1_cols text;
    t2_cols text;

BEGIN

    SELECT  string_agg( c1.attname, ',' ),
            string_agg( COALESCE( c2.attname, 'NULL' ), ',' )
    INTO    t1_cols,
            t2_cols
    FROM    pg_attribute c1
    LEFT JOIN    pg_attribute c2
    ON      c2.attrelid = 'public.table2'::regclass
    AND     c2.attnum > 0
    AND     NOT c2.attisdropped
    AND     c1.attname = c2.attname
    WHERE   c1.attrelid = 'public.table1'::regclass
    AND     c1.attnum > 0
    AND     NOT c1.attisdropped
    AND     c1.attname <> 'id';

    --       Following is the actual result of query above, based on given data examples:
    --                               t1_cols                         |                  t2_cols
    --       --------------------------------------------------------+--------------------------------------------
    --        name,year,lith_age,prov_age,si02,ti02,cao,mgo,comments | name,NULL,NULL,NULL,si02,ti02,cao,mgo,NULL
    --       (1 row)

    EXECUTE format(
        '   INSERT INTO table1 ( %s )
            SELECT %s
            FROM table2
        ',
        t1_cols,
        t2_cols
    );

END;
$main$ LANGUAGE plpgsql;

Also link to documentation about pg_attribute table columns if something is unclear: https://www.postgresql.org/docs/current/static/catalog-pg-attribute.html

Hopefully this helps :)