Get the size of a RethinkDB database with Python

Posted 2019-08-02 21:15

Question:

How do I get the size of a given RethinkDB database using Python? I want this because I'm developing a multi-user graphical frontend to RethinkDB and want to be able to enforce a quota for each user's database.

Something like below would be awesome:

r.db('thedatabase').size().run()
50gb

Answer 1:

RethinkDB doesn't have a built-in command for this operation.

The easiest solution would probably be to spin up a separate RethinkDB instance per user, each on its own size-limited partition (using Docker would probably make things easier here).



Answer 2:

I know this is a late answer and not quite as pretty as what you asked for, but just dropping this here for people looking for this functionality in the future:

One way of doing this is to query RethinkDB's built-in system database, which is named rethinkdb. It contains a stats table that holds a lot of information about database usage.

r.db('rethinkdb')
    .table('stats')
    .filter(
        r.row('id').contains('table_server')
    )('storage_engine')('disk')('space_usage')('data_bytes')
    .reduce((left, right) => left.add(right))

This query retrieves the combined size of all tables in all databases across all nodes. In essence, it reads the disk usage of each individual table from its stats document and adds the values up.

Note that in a sharded or replicated setup this gets the combined usage of all nodes (I think), so every copy of the data is counted.
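To make the summation concrete, here is a minimal Python sketch that does the same addition client-side, on rows that have already been fetched from the stats table. The row shape (an id list containing "table_server", plus the nested storage_engine fields) is assumed from the query above; the sample rows and the name total_data_bytes are made up for illustration.

```python
def total_data_bytes(stat_rows):
    """Add up data_bytes across all per-table, per-server stats rows."""
    total = 0
    for row in stat_rows:
        # Skip rows of other kinds (e.g. cluster-wide stats); only rows whose
        # id contains "table_server" are assumed to carry storage_engine data.
        if "table_server" not in row["id"]:
            continue
        total += row["storage_engine"]["disk"]["space_usage"]["data_bytes"]
    return total


# Made-up rows that mimic the assumed document shape.
sample_rows = [
    {"id": ["cluster"]},
    {
        "id": ["table_server", "table-uuid-1", "server-uuid-1"],
        "server": "alpha", "db": "app", "table": "users",
        "storage_engine": {"disk": {"space_usage": {"data_bytes": 2097152}}},
    },
    {
        "id": ["table_server", "table-uuid-2", "server-uuid-1"],
        "server": "alpha", "db": "app", "table": "posts",
        "storage_engine": {"disk": {"space_usage": {"data_bytes": 4194304}}},
    },
]

print(total_data_bytes(sample_rows))  # 6291456
```

Doing the sum inside the query, as above, is usually preferable because only a single number travels over the wire; the client-side version is mainly useful for seeing what the query touches.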

Filter by server:

r.db('rethinkdb')
    .table('stats')
    .filter(r.and(
        r.row('id').contains('table_server'),
        r.row('server').eq('servername')
    ))('storage_engine')('disk')('space_usage')('data_bytes')
    .reduce((left, right) => left.add(right))

This is essentially the same as the first query, just with an additional filter condition combined through r.and.

Filter by database:

r.db('rethinkdb')
    .table('stats')
    .filter(r.and(
        r.row('id').contains('table_server'),
        r.row('db').eq('dbname')
    ))('storage_engine')('disk')('space_usage')('data_bytes')
    .reduce((left, right) => left.add(right))

Same as above, but matching on the db field instead of server.
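Since the question was about per-user quotas, the per-database total could be turned into a quota check roughly like this. It is only a sketch on already-fetched rows: database_data_bytes, is_over_quota, make_row, and the database names are hypothetical, and the row shape is assumed from the queries above.

```python
MIB = 1024 ** 2


def make_row(server, db, table, data_bytes):
    """Build a fake stats row holding only the fields this sketch reads."""
    return {
        # Real ids hold UUIDs; these placeholders just keep rows distinct.
        "id": ["table_server", f"{db}.{table}", server],
        "server": server, "db": db, "table": table,
        "storage_engine": {"disk": {"space_usage": {"data_bytes": data_bytes}}},
    }


def database_data_bytes(stat_rows, db_name):
    """Sum data_bytes over every table_server row that belongs to db_name."""
    return sum(
        row["storage_engine"]["disk"]["space_usage"]["data_bytes"]
        for row in stat_rows
        if "table_server" in row["id"] and row["db"] == db_name
    )


def is_over_quota(stat_rows, db_name, quota_bytes):
    """True when the database currently uses more than quota_bytes."""
    return database_data_bytes(stat_rows, db_name) > quota_bytes


rows = [
    make_row("alpha", "alice_db", "notes", 3 * MIB),
    make_row("alpha", "alice_db", "tags", 1 * MIB),
    make_row("alpha", "bob_db", "photos", 10 * MIB),
]

print(is_over_quota(rows, "alice_db", 5 * MIB))  # False
print(is_over_quota(rows, "bob_db", 5 * MIB))    # True
```

A check like this only reports usage after the fact; it does not stop a write that pushes a database past its limit.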

Filter by table:

r.db('rethinkdb')
    .table('stats')
    .filter(r.and(
        r.row('id').contains('table_server'),
        r.row('db').eq('dbname'),
        r.row('table').eq('tablename')
    ))('storage_engine')('disk')('space_usage')('data_bytes')

Filtering by table name is almost the same. Just remember that when you target a specific table you should also filter by its database; otherwise two tables with the same name in different databases will both match and you will get two size values. On a single server this query yields exactly one value, so no reduce is needed. If the table is replicated or sharded across several servers, I'd expect one value per server, in which case add the reduce back.
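The same-name pitfall is easy to reproduce on made-up rows. In this sketch, table_data_bytes and the sample data are hypothetical; passing db_name=None mimics forgetting the database filter, and the second "shop" row stands for a copy of the table on another server (assuming the stats table holds one row per table per server).

```python
def table_data_bytes(stat_rows, table_name, db_name=None):
    """Return one data_bytes value per matching table_server row."""
    return [
        row["storage_engine"]["disk"]["space_usage"]["data_bytes"]
        for row in stat_rows
        if "table_server" in row["id"]
        and row["table"] == table_name
        and (db_name is None or row["db"] == db_name)
    ]


def fake_row(server, db, table, data_bytes):
    """Fake stats row with placeholder ids instead of real UUIDs."""
    return {
        "id": ["table_server", f"{db}.{table}", server],
        "server": server, "db": db, "table": table,
        "storage_engine": {"disk": {"space_usage": {"data_bytes": data_bytes}}},
    }


rows = [
    fake_row("alpha", "shop", "users", 100),
    fake_row("alpha", "blog", "users", 250),  # same table name, other database
    fake_row("beta", "shop", "users", 100),   # copy of shop.users on a second server
]

print(table_data_bytes(rows, "users"))               # [100, 250, 100]
print(table_data_bytes(rows, "users", "shop"))       # [100, 100]
print(sum(table_data_bytes(rows, "users", "shop")))  # 200
```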

In the end, just remember that these are ordinary ReQL queries.

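Also note that the queries above use a JavaScript arrow function (=>). In Python you can replace it with .reduce(lambda left, right: left + right). The rest translates as well: r.row('id') becomes r.row['id'], field accessors such as ('storage_engine') become ['storage_engine'], and r.and becomes r.and_.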
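Put together, a Python version of the per-database query might look like the sketch below. It assumes the 2.4+ Python driver (from rethinkdb import RethinkDB) and an already open connection; database_size_bytes and format_size are hypothetical helper names, and the query has not been checked against a live cluster here, so treat it as a starting point. It uses sum() instead of reduce because, as far as I know, reduce raises an error on an empty sequence while sum() returns 0.

```python
def format_size(num_bytes):
    """Render a byte count with binary units, e.g. 1536 -> '1.5 KiB'."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024


def database_size_bytes(conn, db_name):
    """Return the summed data_bytes of all tables in db_name, over all servers."""
    # Imported lazily so format_size stays usable without the driver installed.
    from rethinkdb import RethinkDB  # assumption: driver version 2.4 or newer

    r = RethinkDB()
    return (
        r.db("rethinkdb")
        .table("stats")
        .filter(
            lambda stat: stat["id"].contains("table_server")
            & (stat["db"] == db_name)
        )["storage_engine"]["disk"]["space_usage"]["data_bytes"]
        .sum()  # assumed to give 0 for a database with no tables
        .run(conn)
    )


print(format_size(50 * 1024 ** 3))  # 50.0 GiB
```

With a connection opened through the driver, for example conn = RethinkDB().connect("localhost", 28015), calling format_size(database_size_bytes(conn, "thedatabase")) would give roughly the output asked for in the question.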

If you have any improvements, please do comment :)