Loop on tables with PL/pgSQL in Postgres 9.0+

2019-01-17 04:55发布

问题:

I want to loop through all my tables to count rows in each of them. The following query gets me an error:

DO $$
DECLARE
    tables CURSOR FOR
        SELECT tablename FROM pg_tables
        WHERE tablename NOT LIKE 'pg_%'
        ORDER BY tablename;
    tablename varchar(100);
    nbRow int;
BEGIN
    FOR tablename IN tables LOOP
        EXECUTE 'SELECT count(*) FROM ' || tablename INTO nbRow;
        -- Do something with nbRow
    END LOOP;
END$$;

Errors:

ERROR:  syntax error at or near ")"
LINE 1: SELECT count(*) FROM (sql_features)
                                          ^
QUERY:  SELECT count(*) FROM (sql_features)
CONTEXT:  PL/pgSQL function inline_code_block line 8 at EXECUTE statement

sql_features is a table's name in my DB. I already tried to use quote_ident() but to no avail.

回答1:

The cursor returns a record, not a scalar value, so "tablename" is not a string variable.

The concatenation turns the record into a string that looks like this (sql_features). If you had selected e.g. the schemaname with the tablename, the text representation of the record would have been (public,sql_features).

So you need to access the column inside the record to create your SQL statement:

DO $$
DECLARE
    tables CURSOR FOR
        SELECT tablename
        FROM pg_tables
        WHERE tablename NOT LIKE 'pg_%'
        ORDER BY tablename;
    nbRow int;
BEGIN
    FOR table_record IN tables LOOP
        EXECUTE 'SELECT count(*) FROM ' || table_record.tablename INTO nbRow;
        -- Do something with nbRow
    END LOOP;
END$$;

You might want to use WHERE schemaname = 'public' instead of not like 'pg_%' to exclude the Postgres system tables.



回答2:

I can't remember the last time I actually needed to use an explicit cursor for looping in plpgsql.
Use the implicit cursor of a FOR loop, that's much cleaner:

DO
$$
DECLARE
    rec   record;
    nbrow bigint;
BEGIN
   FOR rec IN
      SELECT *
      FROM   pg_tables
      WHERE  tablename NOT LIKE 'pg\_%'
      ORDER  BY tablename
   LOOP
      EXECUTE 'SELECT count(*) FROM '
        || quote_ident(rec.schemaname) || '.'
        || quote_ident(rec.tablename)
      INTO nbrow;
      -- Do something with nbrow
   END LOOP;
END
$$;

You need to include the schema name to make this work for all schemas (including those not in your search_path).

Also, you actually need to use quote_ident() or format() with %I to safeguard against SQL injection. A table name can be almost anything inside double quotes.

Minor detail: escape the underscore (_) in the LIKE pattern to make it a literal underscore: tablename NOT LIKE 'pg\_%'

How I might do it:

DO
$$
DECLARE
    tbl   regclass;
    nbrow bigint;
BEGIN
   FOR tbl IN
      SELECT c.oid
      FROM   pg_class     c
      JOIN   pg_namespace n ON n.oid = c.relnamespace
      WHERE  c.relkind = 'r'
      AND    n.nspname NOT LIKE 'pg\_%'         -- system schema(s)
      AND    n.nspname <> 'information_schema'  -- information schema
      ORDER  BY n.nspname, c.relname
   LOOP
      EXECUTE 'SELECT count(*) FROM ' || tbl INTO nbrow;
      -- raise notice '%: % rows', tbl, nbrow;
   END LOOP;
END
$$;
  • Query pg_catalog.pg_class instead of tablename, it provides the OID of the table.

  • The object identifier type regclass is handy to simplify, in particular, table names are double-quoted and schema-qualified where necessary automatically (also prevents SQL injection).

  • This query also excludes temporary tables (temp schema is named with pg_temp% internally).

  • If you want only tables from a given schema:

    AND    n.nspname = 'public' -- schema name here, case-sensitive