Given a structure like this:
CREATE TABLE reference_table (
reference_table_key numeric NOT NULL,
reference_value numeric,
CONSTRAINT reference_table_pk PRIMARY KEY (reference_table_key)
);
CREATE TABLE other_table (
other_table_key numeric NOT NULL,
reference_table_key numeric,
CONSTRAINT other_table_pk PRIMARY KEY (other_table_key),
ONSTRAINT other_table_reference_fk FOREIGN KEY (reference_table_key)
REFERENCES reference_table (reference_table_key) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE SET NULL
);
CREATE TABLE another_table (
another_table_key numeric NOT NULL,
do_stuff_key numeric,
CONSTRAINT another_table_pk PRIMARY KEY (another_table_key),
ONSTRAINT another_table_reference_fk FOREIGN KEY (do_stuff_key)
REFERENCES reference_table (reference_table_key) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE SET NULL
);
--there are 50-60 tables which have similar foreign key references to reference_table
I want to write a query that tells me the primary keys in other_table and another_table and potentially more tables where reference_value is NULL.
In psuedo-code:
SELECT table_name, table_primary_key, table_fk_column_name
FROM ?????? some PG table ???????, reference_table
WHERE reference_table.reference_value IS NULL;
The result would look something like:
table_name | table_primary_key | table_fk_column_name | reference_table_pk
---------------------------------------------------------------------------
other_table | 2 | reference_table_key | 7
other_table | 4 | reference_table_key | 56
other_table | 45 | reference_table_key | 454
other_table | 65765 | reference_table_key | 987987
other_table | 11 | reference_table_key | 3213
another_table | 3 | do_stuff_key | 4645
another_table | 5 | do_stuff_key | 43546
another_table | 7 | do_stuff_key | 464356
unknown_table | 1 | unkown_column_key | 435435
unknown_table | 1 | some_other_column_key | 34543
unknown_table | 3 | unkown_column_key | 124
unknown_table | 3 | some_other_column_key | 123
This is similar to, but not a duplicate of Postgres: SQL to list table foreign keys . That question shows the table structure. I want to find specific instances.
Essentially if I were to DELETE FROM reference_table WHERE reference_value IS NULL;
, postgres has to do something internally to figure out that it needs to set reference_table_key
in row 2 in other_table
to NULL. I want to see what those rows would be.
Is there a query that can do this? Is there a modifier that I can pass to a DELETE call that would tell me what tables/rows/columns would be affected by that DELETE?
NULL values in referencing columns
This query produces the DML statement to find all rows in all tables, where a column has a foreign-key constraint referencing another table but hold a NULL
value in that column:
WITH x AS (
SELECT c.conrelid::regclass AS tbl
, c.confrelid::regclass AS ftbl
, quote_ident(k.attname) AS fk
, quote_ident(pf.attname) AS pk
FROM pg_constraint c
JOIN pg_attribute k ON (k.attrelid, k.attnum) = (c.conrelid, c.conkey[1])
JOIN pg_attribute f ON (f.attrelid, f.attnum) = (c.confrelid, c.confkey[1])
LEFT JOIN pg_constraint p ON p.conrelid = c.conrelid AND p.contype = 'p'
LEFT JOIN pg_attribute pf ON (pf.attrelid, pf.attnum)
= (p.conrelid, p.conkey[1])
WHERE c.contype = 'f'
AND c.confrelid = 'fk_tbl'::regclass -- references to this tbl
AND f.attname = 'fk_tbl_id' -- and only to this column
)
SELECT string_agg(format(
'SELECT %L AS tbl
, %L AS pk
, %s::text AS pk_val
, %L AS fk
, %L AS ftbl
FROM %1$s WHERE %4$s IS NULL'
, tbl
, COALESCE(pk 'NONE')
, COALESCE(pk 'NULL')
, fk
, ftbl), '
UNION ALL
') || ';'
FROM x;
Produces a query like this:
SELECT 'some_tbl' AS tbl
, 'some_tbl_id' AS pk
, some_tbl_id::text AS pk_val
, 'fk_tbl_id' AS fk
, 'fk_tbl' AS ftbl
FROM some_tbl WHERE fk_tbl_id IS NULL
UNION ALL
SELECT 'other_tbl' AS tbl
, 'other_tbl_id' AS pk
, other_tbl_id::text AS pk_val
, 'some_name_id' AS fk
, 'fk_tbl' AS ftbl
FROM other_tbl WHERE some_name_id IS NULL;
Produces output like this:
tbl | pk | pk_val | fk | ftbl
-----------+--------------+--------+--------------+--------
some_tbl | some_tbl_id | 49 | fk_tbl_id | fk_tbl
some_tbl | some_tbl_id | 58 | fk_tbl_id | fk_tbl
other_tbl | other_tbl_id | 66 | some_name_id | fk_tbl
other_tbl | other_tbl_id | 67 | some_name_id | fk_tbl
Does not cover multi-column foreign or primary keys reliably. You have to make the query more complex for this.
I cast all primary key values to text
to cover all types.
Adapt or remove these lines to find foreign key pointing to an other or any column / table:
AND c.confrelid = 'fk_tbl'::regclass
AND f.attname = 'fk_tbl_id' -- and only this column
Tested with PostgreSQL 9.1.4. I use the pg_catalog
tables. Realistically nothing of what I use here is going to change, but that is not guaranteed across major releases. Rewrite it with tables from information_schema
if you need it to work reliably across updates. That is slower, but sure.
I did not sanitize table names in the generated DML script, because quote_ident()
would fail with schema-qualified names. It is your responsibility to avoid harmful table names like "users; DELETE * FROM users;"
. With some more effort, you can retrieve schema-name and table name separately and use quote_ident()
.
NULL values in referenced columns
My first solution does something subtly different from what you ask, because what you describe (as I understand it) is non-existent. The value NULL
is "unknown" and cannot be referenced. If you actually want to find rows with a NULL
value in a column that has FK constraints pointing to it (not to the particular row with the NULL
value, of course), then the query can be much simplified:
WITH x AS (
SELECT c.confrelid::regclass AS ftbl
,quote_ident(f.attname) AS fk
,quote_ident(pf.attname) AS pk
,string_agg(c.conrelid::regclass::text, ', ') AS referencing_tbls
FROM pg_constraint c
JOIN pg_attribute f ON (f.attrelid, f.attnum) = (c.confrelid, c.confkey[1])
LEFT JOIN pg_constraint p ON p.conrelid = c.confrelid AND p.contype = 'p'
LEFT JOIN pg_attribute pf ON (pf.attrelid, pf.attnum)
= (p.conrelid, p.conkey[1])
WHERE c.contype = 'f'
-- AND c.confrelid = 'fk_tbl'::regclass -- only referring this tbl
GROUP BY 1, 2, 3
)
SELECT string_agg(format(
'SELECT %L AS ftbl
, %L AS pk
, %s::text AS pk_val
, %L AS fk
, %L AS referencing_tbls
FROM %1$s WHERE %4$s IS NULL'
, ftbl
, COALESCE(pk, 'NONE')
, COALESCE(pk, 'NULL')
, fk
, referencing_tbls), '
UNION ALL
') || ';'
FROM x;
Finds all such rows in the entire database (commented out the restriction to one table). Tested with Postgres 9.1.4 and works for me.
I group multiple tables referencing the same foreign column into one query and add a list of referencing tables to give an overview.
You want a union for this query:
select *
from ((select 'other_table' as table_name,
other_table_key as primary_key,
'reference_table_key' as table_fk,
ot.reference_table_key
from other_table ot left outer join
reference_table rt
on ot.reference_table_key = rt.reference_table_key
where rt.reference_value is null
) union all
(select 'another_table' as table_name,
another_table_key as primary_key,
'do_stuff_key' as table_fk,
at.do_stuff_key
from another_table at left outer join
reference_table rt
on at.do_stuff_key = rt.reference_table_key
where rt.reference_value is null
)
) t