After reading the Amazon Redshift documentation, I ran a VACUUM on a 400GB table that had never been vacuumed before, in an attempt to improve query performance.
Unfortunately, the VACUUM caused the table to grow to 1.7TB (!!) and brought the Redshift cluster's disk usage to 100%.
I then tried to stop the VACUUM by running a CANCEL query in the superuser queue (you enter it by running "set query_group='superuser';"). The CANCEL didn't raise an error, but it had no effect on the vacuum query, which keeps running.
What can I do?
I have stopped vacuum operations several times. Maybe the feature was not available at that time.
Run the query below, which gives you the process ID of the vacuum query.
select * from stv_recents where status='Running';
Once you have the process ID, you can run the following query to terminate the process.
select pg_terminate_backend( pid );
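As a rough sketch of the two steps above, the lookup can be narrowed to vacuum statements so the right process ID is easier to spot. The filter on the query text and the pid value 12345 are assumptions for illustration only; use whatever pid the first query returns on your cluster.

```sql
-- List running statements whose text starts with "vacuum".
-- The ILIKE filter is an assumption; drop it to see everything that is running.
select pid, user_name, starttime, trim(query) as query_text
from stv_recents
where status = 'Running'
  and trim(query) ilike 'vacuum%';

-- Terminate the session that owns the vacuum.
-- 12345 is a placeholder pid, not a real value.
select pg_terminate_backend(12345);
```

Terminating another user's session generally requires superuser privileges.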
Apparently, there is currently not much you can do.
I was on the phone with Amazon support for an hour, and they didn't have the tools to stop the vacuum operation.
They opened a ticket about CANCEL query silently not working on VACUUM queries.
They suggested that I take a snapshot of the cluster (this should normally take a few minutes if you have made previous snapshots) and then restart the cluster.
It sort of worked: the vacuum stopped and some of the disk space was freed (600GB), but the table remained more than twice its original size. Because vacuuming it again would be too risky, I resorted to creating a deep copy of it, which should create a vacuumed copy of the table.
(You can read about deep copy here - http://docs.aws.amazon.com/redshift/latest/dg/performing-a-deep-copy.html).
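For reference, one of the deep copy approaches described on that page can be sketched roughly as follows. The table names your_table and your_table_copy are placeholders, and this assumes the cluster has enough free disk space to hold a second copy of the data while both tables exist.

```sql
-- Create an empty table with the same columns, distribution key,
-- sort key, and encodings as the original (placeholder names).
create table your_table_copy (like your_table);

-- Bulk insert rewrites the rows in sorted order and leaves out
-- space held by deleted rows.
insert into your_table_copy (select * from your_table);

-- Swap the copy in place of the original.
drop table your_table;
alter table your_table_copy rename to your_table;
```

Note that CREATE TABLE ... LIKE does not carry over primary key and foreign key attributes, so a table that relies on them may be better recreated from its original DDL.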
Hint: Run this query (taken from here) to see which tables you should vacuum.
Note: This helps only if you want to know which tables are big and what you can gain by vacuuming each one.
select trim(pgdb.datname) as Database,
       trim(a.name) as Table,
       ((b.mbytes/part.total::decimal)*100)::decimal(5,2) as pct_of_total,
       b.mbytes,
       b.unsorted_mbytes
from stv_tbl_perm a
join pg_database as pgdb on pgdb.oid = a.db_id
join (select tbl,
             sum(decode(unsorted, 1, 1, 0)) as unsorted_mbytes,
             count(*) as mbytes
      from stv_blocklist
      group by tbl) b on a.id = b.tbl
join (select sum(capacity) as total
      from stv_partitions
      where part_begin = 0) as part on 1 = 1
where a.slice = 0
order by 3 desc, db_id, name;
Then vacuum the table(s) with high unsorted_mbytes:
VACUUM your_table;