Unix time in PARTITION BY for Vertica

2019-07-21 13:11发布

问题:

I have Big table in vertica which has time_stamp (int) as unix timestamp. I want to partition this table on week basis (week start day Monday).

Is there a better way to do this in one step rather than converting time_stamp from unix to TIMESTAMP (Vertica) then doing partitions ?

回答1:

Optimally, you should be using the date/time type. You won't be able to use non-deterministic functions such as TO_TIMESTAMP in the PARTITION BY expression. The alternative is to use math to logically create the partitions:

Using a Unix timestamp to partition by:

              Divide By
Minutes       60
Hours         60 * 60 (3600)
Days          60 * 60 * 24 (86400)
Weeks         60 * 60 * 24 * 7 (604800)

If we use 604800, this will give you the week number from January 1, 1970 00:00:00 UTC.

Let's set up a test table:

CREATE TABLE public.test (
  time_stamp int NOT NULL
);

INSERT INTO public.test (time_stamp) VALUES (1404305559);
INSERT INTO public.test (time_stamp) VALUES (1404305633);
INSERT INTO public.test (time_stamp) VALUES (1404305705);
INSERT INTO public.test (time_stamp) VALUES (1404305740);
INSERT INTO public.test (time_stamp) VALUES (1404305778);

COMMIT;

Let's create the partition:

ALTER TABLE public.test PARTITION BY FLOOR(time_stamp/604800) REORGANIZE;

We then get:

NOTICE 4954:  The new partitioning scheme will produce 1 partitions
WARNING 6100:  Using PARTITION expression that returns a Numeric value
HINT:  This PARTITION expression may cause too many data partitions.  Use of an expression that returns a more accurate value, such as a regular VARCHAR or INT, is encouraged
NOTICE 4785:  Started background repartition table task
ALTER TABLE

You'll also want to be mindful of how many partitions this creates. Vertica recommends keeping the number of partitions between 10-20.



标签: vertica