I want to use BDB as a time-series data store, and planning to use the microseconds since epoch as the key values. I am using BTREE as the data store type.
However, when I try to store integer keys, bsddb3 gives an error saying TypeError: Integer keys only allowed for Recno and Queue DB's
.
What is the best workaround? I can store them as strings, but that probably will make it unnecessarily slower.
Given BDB itself can handle any kind of data, why is there a restriction? can I sorta hack the bsddb3 implementation? has anyone used anyother methods?
Well, there's no workaround. But you can use two approaches
Store the integers as string using
str
orrepr
. If the ints are big, you can even use string formattinguse cPickle/pickle module to store and retrieve data. This is a good way if you have data types other than basic types. For basics
int
s andfloat
s this actually is slower and takes more space than just storing stringsYou can't store integers since bsddb doesn't know how to represent integers and which kind of integer it is.
If you convert your integer to a string you will break the lexicographic ordering of keys of bsddb:
10 > 2
but as strings"10" < "2"
.You have to use python struct to convert your integers into a string (or in python 3 into bytes) to store then store them in bsddb. You have to use bigendian packing or ordering will not be correct.
Then you can use bsddb's
Cursor.set_range(key)
to query for information in a given slice of time.For instance,
Cursor.set_range(struct.unpack('>Q', 123456789))
will set the cursor at the key of the even happening at 123456789 or the first that happens after.