I have a PostgreSQL v10 database with a size of about 100GB.
What is the most efficient (fastest) way to rebuild statistics, for example after a major version upgrade?
ANALYZE with no parameters updates statistics for the entire database by default, and it's painfully slow. It also appears to run as a single process.
Is there any way to parallelize this to speed it up?
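For reference, this is the serial baseline I'm running:

ANALYZE;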
You could use vacuumdb with the same options that pg_upgrade suggests:
vacuumdb --all --analyze-in-stages
The documentation describes what it does:
Only calculate statistics for use by the optimizer (no vacuum), like --analyze-only. Run several (currently three) stages of analyze with different configuration settings, to produce usable statistics faster.
This option is useful to analyze a database that was newly populated from a restored dump or by pg_upgrade. This option will try to create some statistics as fast as possible, to make the database usable, and then produce full statistics in the subsequent stages.
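In other words, each stage is roughly an ANALYZE run with a progressively larger default_statistics_target. The exact settings are an implementation detail of vacuumdb, so the following per-database SQL is only a sketch of the idea:

SET default_statistics_target = 1;   -- stage 1: minimal statistics, as fast as possible
ANALYZE;
SET default_statistics_target = 10;  -- stage 2: rough but usable statistics
ANALYZE;
RESET default_statistics_target;     -- stage 3: full statistics at the configured target
ANALYZE;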
To calculate statistics with several parallel processes, you can use the -j option of vacuumdb.
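For example, to run the staged analyze across all databases with eight parallel jobs (the job count here is just a placeholder; match it to your CPU cores):

vacuumdb --all --analyze-in-stages --jobs=8

Note that the parallelism is per table: vacuumdb distributes tables across the connections, so it helps most when the database has many tables, while a single very large table is still analyzed by one process.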