Use integer keys in Berkeley DB with python (using

Posted 2019-08-29 02:35

Question:

I want to use BDB as a time-series data store, and I am planning to use microseconds since the epoch as the keys. I am using BTREE as the data store type.

However, when I try to store integer keys, bsddb3 gives an error saying TypeError: Integer keys only allowed for Recno and Queue DB's.

What is the best workaround? I can store them as strings, but that will probably make it unnecessarily slow.

Given that BDB itself can handle any kind of data, why is there a restriction? Can I sorta hack the bsddb3 implementation? Has anyone used any other methods?

Answer 1:

You can't store integers directly, because bsddb doesn't know how to represent an integer as bytes or which kind of integer it is.

If you convert your integers to plain decimal strings, the keys will no longer sort in numeric order, because bsddb orders keys lexicographically: 10 > 2, but as strings "10" < "2".

You have to use Python's struct module to convert your integers into strings (or, in Python 3, into bytes) and then store them in bsddb. You have to use big-endian packing, or the ordering will not be correct.
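To make the ordering argument concrete, here is a minimal sketch that compares three encodings of the same timestamps. The helper names encode_key and decode_key are made up for illustration, and '>Q' (unsigned 64-bit, big-endian) is an assumption that fits non-negative microsecond timestamps. Sorting a list of bytes in Python uses the same byte-by-byte comparison as a BTREE with the default comparison function, so no database is needed to see the effect.

```python
import struct

def encode_key(micros):
    """Pack a non-negative integer as 8 big-endian bytes (hypothetical helper)."""
    return struct.pack('>Q', micros)

def decode_key(key):
    """Inverse of encode_key: turn 8 big-endian bytes back into an int."""
    return struct.unpack('>Q', key)[0]

timestamps = [2, 10, 256, 1567046100000000]

# Plain decimal strings: byte-wise order is not numeric order.
as_text = sorted(str(t).encode('ascii') for t in timestamps)

# Big-endian packed keys: byte-wise order matches numeric order.
as_packed = sorted(encode_key(t) for t in timestamps)

# Little-endian packed keys: the low byte comes first, so order breaks again.
as_little = sorted(struct.pack('<Q', t) for t in timestamps)

print(as_text)
print([decode_key(k) for k in as_packed])
print([struct.unpack('<Q', k)[0] for k in as_little])
```

Only the big-endian list comes back in numeric order. Every key is also exactly 8 bytes, which is shorter than the 16-digit decimal string for a current microsecond timestamp.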

Then you can use bsddb's Cursor.set_range(key) to query for information in a given slice of time.

For instance, Cursor.set_range(struct.pack('>Q', 123456789)) will set the cursor at the key of the event happening at 123456789, or at the first event that happens after it.
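A time-slice query built on set_range could look roughly like the sketch below. The function scan_from and the class ListCursor are hypothetical: ListCursor is only an in-memory stand-in so the example runs without Berkeley DB installed, and it mimics a cursor whose set_range(key) and next() return a (key, value) pair, or None when nothing is left. Depending on how the database is configured, a real bsddb3 cursor may raise an error instead of returning None, so check that for your setup.

```python
import bisect
import struct

def pack_ts(micros):
    """Pack a microsecond timestamp as an 8-byte big-endian key."""
    return struct.pack('>Q', micros)

def scan_from(cursor, start_us, end_us):
    """Yield (timestamp, value) pairs with start_us <= timestamp < end_us.

    Assumes cursor.set_range(key) and cursor.next() return a (key, value)
    tuple, or None when there is no matching or further record.
    """
    record = cursor.set_range(pack_ts(start_us))
    while record is not None:
        key, value = record
        (ts,) = struct.unpack('>Q', key)
        if ts >= end_us:
            break
        yield ts, value
        record = cursor.next()

class ListCursor:
    """Hypothetical in-memory stand-in for a BTREE cursor (demo only)."""

    def __init__(self, items):
        self._items = sorted(items)  # (key, value) pairs with unique keys
        self._pos = 0

    def set_range(self, key):
        # Position at the first record whose key is >= the requested key.
        self._pos = bisect.bisect_left(self._items, (key,))
        return self._current()

    def next(self):
        self._pos += 1
        return self._current()

    def _current(self):
        if self._pos < len(self._items):
            return self._items[self._pos]
        return None

events = [(pack_ts(t), v) for t, v in
          [(100, b'a'), (200, b'b'), (300, b'c'), (400, b'd')]]
print(list(scan_from(ListCursor(events), 150, 400)))
```

With a real database you would pass the object returned by db.cursor() instead of ListCursor. The loop stops as soon as it sees a key at or past end_us, which only works because the big-endian keys are stored in timestamp order.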



Answer 2:

Well, there's no workaround, but you can use one of two approaches:

  1. Store the integers as strings using str or repr. If the ints are big, you can even use string formatting.

  2. Use the cPickle/pickle module to store and retrieve data. This is a good approach if you have data types other than basic types. For basic ints and floats, this is actually slower and takes more space than just storing strings.
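If you go with the string approach in option 1 and also need range queries, one way to read "string formatting" is to zero-pad every key to a fixed width, so that the byte-wise order of the strings matches numeric order. This is a sketch under that assumption, not necessarily what the answer had in mind. The helper name text_key and the width of 20 digits (enough for any unsigned 64-bit value) are illustrative choices.

```python
def text_key(micros, width=20):
    """Format a non-negative int as a zero-padded ASCII key (hypothetical helper)."""
    return '{:0{w}d}'.format(micros, w=width).encode('ascii')

timestamps = [2, 10, 256, 1567046100000000]

# str() alone: "10" sorts before "2".
plain = sorted(str(t).encode('ascii') for t in timestamps)

# Zero-padded to a fixed width: byte-wise order equals numeric order.
padded = sorted(text_key(t) for t in timestamps)

print(plain)
print(padded)
print([int(k) for k in padded])
```

The padded keys stay human-readable when you dump the database, at the cost of 20 bytes per key.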