I have a table in Hive created with the following command:
create table t1 (x int, y int, s string) partitioned by (wk int) stored as sequencefile;
The table has the data below:
select * from t1;
+-------+-------+-------+--------+--+
| t1.x | t1.y | t1.s | t1.wk |
+-------+-------+-------+--------+--+
| 1 | 2 | abc | 10 |
| 4 | 5 | xyz | 11 |
| 7 | 8 | pqr | 12 |
+-------+-------+-------+--------+--+
Now the requirement is to drop the oldest partition when the partition count is >= 2.
Can this be handled in HQL or through a shell script, and if so, how?
Note that I will be passing the database name as a shell variable, like hive -e "use $dbname; show partitions t1" (double quotes, so that $dbname is expanded by the shell).
If your partition values sort chronologically (as the week numbers do here), you could write a shell script that runs
hive -e 'SHOW PARTITIONS t1'
to get all partitions. In your example, it will return:
wk=10
wk=11
wk=12
Then you can issue
hive -e 'ALTER TABLE t1 DROP PARTITION (wk=10)'
to remove the first partition. So something like:
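Here is a minimal sketch of how the two commands could be tied together in bash. It assumes a single integer partition key named wk, as in the question, and that the lowest wk value is the oldest partition. The function names oldest_partition and drop_oldest_partition are hypothetical helpers made up for this example, not part of Hive, and the threshold of 2 comes from the question.

```shell
#!/usr/bin/env bash
# Sketch only: assumes one integer partition key called "wk" and that the
# smallest wk value is the oldest partition.

# Reads SHOW PARTITIONS output on stdin and prints the lowest-numbered
# partition spec (e.g. "wk=10") if at least $1 partitions exist (default 2).
oldest_partition() {
  local min_count="${1:-2}"
  local parts count
  # Extract every wk=<digits> token (this also tolerates table borders
  # around the values), then sort numerically on the value after "=".
  parts=$(grep -oE 'wk=[0-9]+' | sort -t= -k2,2n)
  [ -z "$parts" ] && return 0
  count=$(printf '%s\n' "$parts" | grep -c '')
  if [ "$count" -ge "$min_count" ]; then
    printf '%s\n' "$parts" | head -n 1
  fi
}

# Hypothetical wrapper: database and table names are passed as arguments.
drop_oldest_partition() {
  local dbname="$1" table="$2"
  local oldest
  # Double quotes let the shell expand the variables before Hive sees them.
  oldest=$(hive -S -e "USE ${dbname}; SHOW PARTITIONS ${table};" | oldest_partition 2)
  if [ -n "$oldest" ]; then
    hive -e "USE ${dbname}; ALTER TABLE ${table} DROP IF EXISTS PARTITION (${oldest});"
  fi
}
```

You would then call it as drop_oldest_partition "$dbname" t1. Sorting numerically rather than relying on the order of the SHOW PARTITIONS output avoids surprises if the values have different digit counts (for example wk=9 and wk=10). For a table with several partition keys the parsing would need to be adapted.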