Self-managing PostgreSQL partition tables

2019-02-10 23:37发布

I am trying to make a self-managing partition table setup with Postgres. It all revolves around this function but I can't seem to get Postgres to accept my table names. Any ideas or examples of self-managing partition table trigger functions?

My current function:

DECLARE
day integer;
year integer;
tablename text;
startdate text;
enddate text;
BEGIN
day:=date_part('doy',to_timestamp(NEW.date));
year:=date_part('year',to_timestamp(NEW.date));
tablename:='pings_'||year||'_'||day||'_'||NEW.id;
-- RAISE EXCEPTION 'tablename=%',tablename;
PERFORM 'tablename' FROM pg_tables WHERE 'schemaname'=tablename;
-- RAISE EXCEPTION 'found=%',FOUND;
IF FOUND <> TRUE THEN
    startdate:=date_part('year',to_timestamp(NEW.date))||'-'||date_part('month',to_timestamp(NEW.date))||'-'||date_part('day',to_timestamp(NEW.date));
    enddate:=startdate::timestamp + INTERVAL '1 day';
    EXECUTE 'CREATE TABLE $1 (
        CHECK ( date >= DATE $2 AND date < DATE $3 )
    ) INHERITS (pings)' USING quote_ident(tablename),startdate,enddate;
END IF;
EXECUTE 'INSERT INTO $1 VALUES (NEW.*)' USING quote_ident(tablename);
RETURN NULL;
END;

I want it to auto-create a table called pings_YEAR_DOY_ID but it always fails with:

2011-10-24 13:39:04 CDT [15804]: [1-1] ERROR:  invalid input syntax for type double precision: "-" at character 45
2011-10-24 13:39:04 CDT [15804]: [2-1] QUERY:  SELECT date_part('year',to_timestamp( $1 ))+'-'+date_part('month',to_timestamp( $2 ))+'-'+date_part('day',to_timestamp( $3 ))
2011-10-24 13:39:04 CDT [15804]: [3-1] CONTEXT:  PL/pgSQL function "ping_partition" line 15 at assignment
2011-10-24 13:39:04 CDT [15804]: [4-1] STATEMENT:  INSERT INTO pings VALUES (0,0,5);

TRY 2

After applying the changes and modifying it some more (date is a unixtimestamp column, my thinking being that an integer column is faster than a timestamp column when selecting). I get the below error, not sure if I am using the proper syntax for USING NEW?

Updated function:

CREATE FUNCTION ping_partition() RETURNS trigger
    LANGUAGE plpgsql
    AS $_$DECLARE
day integer;
year integer;
tablename text;
startdate text;
enddate text;
BEGIN
day:=date_part('doy',to_timestamp(NEW.date));
year:=date_part('year',to_timestamp(NEW.date));
tablename:='pings_'||year||'_'||day||'_'||NEW.id;
-- RAISE EXCEPTION 'tablename=%',tablename;
PERFORM 'tablename' FROM pg_tables WHERE 'schemaname'=tablename;
-- RAISE EXCEPTION 'found=%',FOUND;
IF FOUND <> TRUE THEN
    startdate := to_char(to_timestamp(NEW.date), 'YYYY-MM-DD');
    enddate:=startdate::timestamp + INTERVAL '1 day';
    EXECUTE 'CREATE TABLE ' || quote_ident(tablename) || ' (
        CHECK ( date >= EXTRACT(EPOCH FROM DATE ' || quote_literal(startdate) || ')
            AND date < EXTRACT(EPOCH FROM DATE ' || quote_literal(enddate) || ') )
    ) INHERITS (pings)';
END IF;
EXECUTE 'INSERT INTO ' || quote_ident(tablename) || ' SELECT $1' USING NEW; 
RETURN NULL;
END;
$_$;

My statement:

INSERT INTO pings VALUES (0,0,5);

SQL error:

ERROR:  column "date" is of type integer but expression is of type pings
LINE 1: INSERT INTO pings_1969_365_0 SELECT $1
                                            ^
HINT:  You will need to rewrite or cast the expression.
QUERY:  INSERT INTO pings_1969_365_0 SELECT $1
CONTEXT:  PL/pgSQL function "ping_partition" line 22 at EXECUTE statement

4条回答
贼婆χ
2楼-- · 2019-02-10 23:50

Dynamic partitioning in PostgreSQL is just a bad idea. Your code is not safe in a multi-user environment. For it to be safe you would have to use locks, which slows down execution. The optimal number of partitions is about one hundred. You can easily create that many well in advance to dramatically simplify the logic necessary for partitioning.

查看更多
smile是对你的礼貌
3楼-- · 2019-02-10 23:56

You need to put your date literals in single quotes. Currently you are executing something like this:

 CHECK ( date >= DATE 2011-10-25 AND date < DATE 2011-11-25 )

which is invalid. In this case 2011-10-25 is interpreted as 2011 minus 10 minus 25

Your code needs to create the SQL using single quotes around the date literal:

CHECK ( date >= DATE '2011-10-25' AND date < DATE '2011-11-25' )
查看更多
啃猪蹄的小仙女
4楼-- · 2019-02-11 00:07

You are mixing double precision output of date_part() with text '-'. That doesn't make sense to PostgreSQL. You would need an explicit cast to text. But there is a much simpler way to do all of this:

startdate:=date_part('year',to_timestamp(NEW.date))
||'-'||date_part('month',to_timestamp(NEW.date))
||'-'||date_part('day',to_timestamp(NEW.date));

Use instead:

startdate := to_char(NEW.date, 'YYYY-MM-DD');

This makes no sense either:

EXECUTE 'CREATE TABLE $1 (
        CHECK (date >= DATE $2 AND date < DATE $3 )
    ) INHERITS (pings)' USING quote_ident(tablename),startdate,enddate;

You can only supply values with the USING clause. Read the manual here. Try instead:

EXECUTE 'CREATE TABLE ' || quote_ident(tablename) || ' (
            CHECK ("date" >= ''' || startdate || ''' AND
                   "date" <  ''' || enddate   || '''))
            INHERITS (ping)';

Or better yet, use format(). See below.

Also, like @a_horse answered: You need to put your text values in single quotes.

Similar here:

EXECUTE 'INSERT INTO $1 VALUES (NEW.*)' USING quote_ident(tablename);

Instead:

EXECUTE 'INSERT INTO ' || quote_ident(tablename) || ' VALUES ($1.*)'
USING NEW;

Related answer:

Aside: While "date" is allowed for a column name in PostgreSQL it is a reserved word in every SQL standard. Don't name your column "date", it leads to confusing syntax errors.

Complete working demo

CREATE TABLE ping (ping_id integer, the_date date);

CREATE OR REPLACE FUNCTION trg_ping_partition()
  RETURNS trigger AS
$func$
DECLARE
   _tbl text := to_char(NEW.the_date, '"ping_"YYYY_DDD_') || NEW.ping_id;
BEGIN
   IF NOT EXISTS (
      SELECT 1
      FROM   pg_catalog.pg_class c
      JOIN   pg_catalog.pg_namespace n ON n.oid = c.relnamespace
      WHERE  n.nspname = 'public'  -- your schema
      AND    c.relname = _tbl
      AND    c.relkind = 'r') THEN

      EXECUTE format('CREATE TABLE %I (CHECK (the_date >= %L AND
                                              the_date <  %L)) INHERITS (ping)'
              , _tbl
              , to_char(NEW.the_date,     'YYYY-MM-DD')
              , to_char(NEW.the_date + 1, 'YYYY-MM-DD')
              );
   END IF;

   EXECUTE 'INSERT INTO ' || quote_ident(_tbl) || ' VALUES ($1.*)'
   USING NEW; 

   RETURN NULL;
END
$func$ LANGUAGE plpgsql SET search_path = public;

CREATE TRIGGER insbef
BEFORE INSERT ON ping
FOR EACH ROW EXECUTE PROCEDURE trg_ping_partition();

Tests:

INSERT INTO ping VALUES (1, now()::date);
INSERT INTO ping VALUES (2, now()::date);
INSERT INTO ping VALUES (2, now()::date + 1);
INSERT INTO ping VALUES (2, now()::date + 1);

SQL Fiddle.

查看更多
ゆ 、 Hurt°
5楼-- · 2019-02-11 00:10

I figured out the entirety and it works great, even have an auto-delete after 30 days. I hope this helps out future people looking for an autopartition trigger function.

CREATE FUNCTION ping_partition() RETURNS trigger
    LANGUAGE plpgsql
    AS $_$
DECLARE
_keepdate text;
_tablename text;
_startdate text;
_enddate text;
_result record;
BEGIN
_keepdate:=to_char(to_timestamp(NEW.date) - interval '30 days', 'YYYY-MM-DD');
_startdate := to_char(to_timestamp(NEW.date), 'YYYY-MM-DD');
_tablename:='pings_'||NEW.id||'_'||_startdate;
PERFORM 1
FROM   pg_catalog.pg_class c
JOIN   pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE  c.relkind = 'r'
AND    c.relname = _tablename
AND    n.nspname = 'pinglog';
IF NOT FOUND THEN
    _enddate:=_startdate::timestamp + INTERVAL '1 day';
    EXECUTE 'CREATE TABLE pinglog.' || quote_ident(_tablename) || ' (
        CHECK ( date >= EXTRACT(EPOCH FROM DATE ' || quote_literal(_startdate) || ')
            AND date < EXTRACT(EPOCH FROM DATE ' || quote_literal(_enddate) || ')
            AND id = ' || quote_literal(NEW.id) || '
        )
    ) INHERITS (pinglog.pings)';
    EXECUTE 'CREATE INDEX ' || quote_ident(_tablename||'_indx1') || ' ON pinglog.' || quote_ident(_tablename) || ' USING btree (microseconds) WHERE microseconds IS NULL';
    EXECUTE 'CREATE INDEX ' || quote_ident(_tablename||'_indx2') || ' ON pinglog.' || quote_ident(_tablename) || ' USING btree (date, id)';
    EXECUTE 'CREATE INDEX ' || quote_ident(_tablename||'_indx3') || ' ON pinglog.' || quote_ident(_tablename) || ' USING btree (date, id, microseconds) WHERE microseconds IS NULL';
END IF;
EXECUTE 'INSERT INTO ' || quote_ident(_tablename) || ' VALUES ($1.*)' USING NEW;
FOR _result IN SELECT * FROM pg_tables WHERE schemaname='pinglog' LOOP
    IF char_length(substring(_result.tablename from '[0-9-]*$')) <> 0 AND (to_timestamp(NEW.date) - interval '30 days') > to_timestamp(substring(_result.tablename from '[0-9-]*$'),'YYYY-MM-DD') THEN
        -- RAISE EXCEPTION 'timestamp=%,table=%,found=%',to_timestamp(substring(_result.tablename from '[0-9-]*$'),'YYYY-MM-DD'),_result.tablename,char_length(substring(_result.tablename from '[0-9-]*$'));
        -- could have it check for non-existant ids as well, or for archive bit and only delete if the archive bit is not set
        EXECUTE 'DROP TABLE ' || quote_ident(_result.tablename);
    END IF;
END LOOP;
RETURN NULL;
END;
$_$;
查看更多
登录 后发表回答