How to get the current free disk space in Postgres

Posted 2019-04-24 11:58

Question:

I need to be sure that I have at least 1 GB of free disk space before I start doing some work in my database. I'm looking for something like this:

select pg_get_free_disk_space();

Is it possible? (I found nothing about it in docs).

PG: 9.3 & OS: Linux/Windows

Answer 1:

PostgreSQL does not currently have features to directly expose disk space.

For one thing, which disk? A production PostgreSQL instance often looks like this:

  • /pg/pg94/: a RAID6 of fast reliable storage on a BBU RAID controller in WB mode, for the catalogs and most important data
  • /pg/pg94/pg_xlog: a fast reliable RAID1, for the transaction logs
  • /pg/tablespace-lowredundancy: A RAID10 of fast cheap storage for things like indexes and UNLOGGED tables that you don't care about losing so you can use lower-redundancy storage
  • /pg/tablespace-bulkdata: A RAID6 or similar of slow near-line magnetic storage used for old audit logs, historical data, write-mostly data, and other things that can be slower to access.
  • The PostgreSQL logs are usually somewhere else again, but if that location fills up, the system may still stop. Where they go depends on a number of configuration settings, some of which you can't see from PostgreSQL at all, like syslog options.

Then there's the fact that "free" space doesn't necessarily mean PostgreSQL can use it (think: disk quotas, system-reserved disk space), and the fact that free blocks/bytes isn't the only constraint, as many file systems also have limits on number of files (inodes).

How does a SELECT pg_get_free_disk_space() report this?
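To make the "which free space?" problem concrete, here is a minimal Python sketch, assuming a POSIX host. The function name space_report and its dictionary keys are made up for illustration. It shows that a single os.statvfs() call already yields several different answers, and it does not look at quotas at all.

```python
import os

def space_report(path):
    """Return several different 'free' figures for the filesystem holding path (POSIX only)."""
    st = os.statvfs(path)
    return {
        # free blocks including those reserved for root
        "free_bytes_root": st.f_bfree * st.f_frsize,
        # free blocks available to an unprivileged process
        "free_bytes_user": st.f_bavail * st.f_frsize,
        # inodes (files) an unprivileged process can still create
        "free_inodes_user": st.f_favail,
    }

report = space_report("/")
```

Which of these numbers matters depends on who is writing and what is being written, which is part of why a single scalar result would be misleading.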

Knowing the free disk space could be a security concern. If supported, it's something that'd only be exposed to the superuser, at least.

What you can do is use an untrusted procedural language like plpythonu to make operating system calls to interrogate the host OS for disk space information, using queries against pg_catalog.pg_tablespace and using the data_directory setting from pg_settings to discover where PostgreSQL is keeping stuff on the host OS. You also have to check for mount points (unix/Mac) / junction points (Windows) to discover if pg_xlog, etc, are on separate storage. This still won't really help you with space for logs, though.
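As a rough sketch of the mount point check on a POSIX host, the following Python compares device IDs between the data directory and selected subdirectories. The function name separate_mounts and the default subdirectory list are assumptions for illustration (pg_wal is the name pg_xlog has had since PostgreSQL 10). Inside PL/Python the data directory path would come from the pg_settings query described above.

```python
import os

def separate_mounts(data_directory, subdirs=("pg_xlog", "pg_wal")):
    """List subdirectories of data_directory that live on a different device."""
    base_dev = os.stat(data_directory).st_dev
    found = []
    for name in subdirs:
        path = os.path.join(data_directory, name)
        if not os.path.exists(path):
            continue  # e.g. pg_wal does not exist on 9.x servers
        real = os.path.realpath(path)  # follow symlinks to the real location
        if os.stat(real).st_dev != base_dev:
            found.append(real)
    return found
```

Each path this returns would need its own free space check, in addition to the data directory itself and every tablespace location.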

I'd quite like to have a SELECT * FROM pg_get_free_diskspace that reported the main datadir space, and any mount points or junction points within it like for pg_xlog or pg_clog, and also reported each tablespace and any mount points within it. It'd be a set-returning function. Someone who cares enough would have to bother to implement it for all target platforms though, and right now, nobody wants it enough to do the work.


In the meantime, if you're willing to simplify your needs to:

  • One file system
  • Target OS is UNIX/POSIX-compatible like Linux
  • There's no quota system enabled
  • There's no root-reserved block percentage
  • inode exhaustion is not a concern

then you can CREATE LANGUAGE plpython3u; and CREATE FUNCTION a LANGUAGE plpython3u function that does something like:

import os
st = os.statvfs(datadir_path)
return st.f_bavail * st.f_frsize

in a function that returns bigint and either takes datadir_path as an argument, or discovers it by doing an SPI query like SELECT setting FROM pg_settings WHERE name = 'data_directory' from within PL/Python.
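Spelled out as plain Python under the same simplifying assumptions (one POSIX file system, no quotas), the logic might look like the sketch below. The names free_bytes, has_enough_space and ONE_GIB are hypothetical. In PL/Python the body of free_bytes would sit inside the CREATE FUNCTION, with datadir_path supplied as an argument or by the SPI query.

```python
import os

ONE_GIB = 1024 ** 3  # the 1 GB threshold from the question, taken as 2**30 bytes

def free_bytes(datadir_path):
    """Bytes available to an unprivileged process on the filesystem holding datadir_path."""
    st = os.statvfs(datadir_path)
    return st.f_bavail * st.f_frsize

def has_enough_space(datadir_path, required=ONE_GIB):
    """True if at least `required` bytes are available."""
    return free_bytes(datadir_path) >= required
```

The check is only a snapshot: another process can consume the space between the check and the work that depends on it.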

If you want to support Windows too, see Cross-platform space remaining on volume using python. I'd use Windows Management Instrumentation (WMI) queries rather than using ctypes to call the Windows API though.
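A different route from the WMI or ctypes approaches mentioned above: on Python 3.3 or later, the standard library's shutil.disk_usage() works on both Unix and Windows. A minimal sketch, with free_bytes_portable as a made-up name:

```python
import shutil

def free_bytes_portable(path):
    """Free bytes on the volume holding path, on Unix or Windows (Python 3.3+)."""
    usage = shutil.disk_usage(path)  # named tuple with total, used and free
    return usage.free
```

This only helps with plpython3u, and it still answers for one volume at a time, so separate tablespaces and mount or junction points need separate calls.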

Or you could use this function someone wrote in PL/Perlu to do it using df and mount command output parsing, which will probably only work on Linux, but hey, it's prewritten.



Answer 2:

Here's a plpython2u implementation we've been using for a while.

-- NOTE this function is a security definer, so it carries the superuser permissions
-- even when called by the plebs.
-- (required so we can access the data_directory setting.)
CREATE OR REPLACE FUNCTION get_tablespace_disk_usage()
    RETURNS TABLE (
        path VARCHAR,
        bytes_free BIGINT,
        total_bytes BIGINT
    )
AS $$
import os

data_directory = plpy.execute("select setting from pg_settings where name='data_directory';")[0]['setting']
records = []

for t in plpy.execute("select spcname, spcacl, pg_tablespace_location(oid) as path from pg_tablespace"):
    if t['spcacl']:
        # TODO handle ACLs. For now only show public tablespaces.
        continue

    name = t['spcname']
    if name == 'pg_default':
        # pg_default is stored in the "base" subdirectory of the data directory
        path = os.path.join(data_directory, 'base')
    elif name == 'pg_global':
        path = os.path.join(data_directory, 'global')
    else:
        path = t['path']

    # skip tablespaces whose directory is missing on this host
    if os.path.exists(path):
        s = os.statvfs(path)
        total_bytes = s.f_blocks * s.f_frsize
        bytes_free = s.f_bavail * s.f_frsize

        records.append((path, bytes_free, total_bytes))

return records

$$ LANGUAGE plpython2u STABLE SECURITY DEFINER;

Usage is something like:

SELECT path, bytes_free, total_bytes FROM get_tablespace_disk_usage();


Answer 3:

A C version, for those who still want a tool to check free space on a PostgreSQL server. It currently supports only Linux and FreeBSD; other OSes need the proper headers and defines added.

#if defined __FreeBSD__
# include <sys/param.h>
# include <sys/mount.h>
#elif defined __linux__
# define _XOPEN_SOURCE
# define _BSD_SOURCE
# include <sys/vfs.h>
#else
# error Unsupported OS
#endif
#include <postgres.h>
#include <catalog/pg_type.h>
#include <funcapi.h>
#include <utils/builtins.h>

/* Registration:
CREATE FUNCTION disk_free(path TEXT) RETURNS TABLE (
  size BIGINT, free BIGINT, available BIGINT, inodes INTEGER, ifree INTEGER, blksize INTEGER
) AS '$pglib/pg_df.so', 'df' LANGUAGE c STRICT;
*/

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

PG_FUNCTION_INFO_V1(df);

Datum df(PG_FUNCTION_ARGS)
{
  TupleDesc tupdesc;
  AttInMetadata *attinmeta;
  HeapTuple tuple;
  Datum result;
  char **values;
  struct statfs sfs;
  const char* path = text_to_cstring(PG_GETARG_TEXT_P(0));

  if(get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
    ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("function returning record called in context that cannot accept type record")));
  attinmeta = TupleDescGetAttInMetadata(tupdesc);

  if(0 != statfs(path, &sfs))
    ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR), errmsg("statfs() system call failed: %m")));

  /* 32 bytes per field holds any 64-bit value plus sign and terminator */
  values = (char **) palloc(6 * sizeof(char *));
  values[0] = (char *) palloc(32 * sizeof(char));
  values[1] = (char *) palloc(32 * sizeof(char));
  values[2] = (char *) palloc(32 * sizeof(char));
  values[3] = (char *) palloc(32 * sizeof(char));
  values[4] = (char *) palloc(32 * sizeof(char));
  values[5] = (char *) palloc(32 * sizeof(char));

  /* The statfs field types differ between Linux and FreeBSD, so cast
     everything to long long before multiplying or printing with %lld */
  long long df_total_bytes = (long long) sfs.f_blocks * (long long) sfs.f_bsize;
  long long df_free_bytes  = (long long) sfs.f_bfree  * (long long) sfs.f_bsize;
  long long df_avail_bytes = (long long) sfs.f_bavail * (long long) sfs.f_bsize;
  snprintf(values[0], 32, "%lld", df_total_bytes);
  snprintf(values[1], 32, "%lld", df_free_bytes);
  snprintf(values[2], 32, "%lld", df_avail_bytes);
  snprintf(values[3], 32, "%lld", (long long) sfs.f_files);
  snprintf(values[4], 32, "%lld", (long long) sfs.f_ffree);
  snprintf(values[5], 32, "%lld", (long long) sfs.f_bsize);

  tuple = BuildTupleFromCStrings(attinmeta, values);
  return HeapTupleGetDatum(tuple);
}