I'm trying to store a numpy array of about 1000 floats in a sqlite3 database but I keep getting the error "InterfaceError: Error binding parameter 1 - probably unsupported type".
I was under the impression a BLOB data type could be anything but it definitely doesn't work with a numpy array. Here's what I tried:
import sqlite3 as sql
import numpy as np
con = sql.connect('test.bd',isolation_level=None)
cur = con.cursor()
cur.execute("CREATE TABLE foobar (id INTEGER PRIMARY KEY, array BLOB)")
cur.execute("INSERT INTO foobar VALUES (?,?)", (None,np.arange(0,500,0.5)))
con.commit()
Is there another module I can use to get the numpy array into the table? Or can I convert the numpy array into another form in Python (like a list or string I can split) that sqlite will accept? Performance isn't a priority. I just want it to work!
Thanks!
You could register a new array data type with sqlite3:
With this setup, you can simply insert the NumPy array with no change in syntax:
And retrieve the array directly from sqlite as a NumPy array:
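A minimal sketch of the three steps above (registering the type, inserting, and retrieving). It assumes the array is serialized with np.save into an in-memory buffer; the helper names adapt_array and convert_array, the table name test, and the column name arr are illustrative choices, not necessarily the ones used in the answer.

```python
import io
import sqlite3

import numpy as np


def adapt_array(arr):
    """Serialize a NumPy array to .npy-formatted bytes for a BLOB column."""
    out = io.BytesIO()
    np.save(out, arr)
    return out.getvalue()


def convert_array(blob):
    """Rebuild the NumPy array from the bytes stored in the database."""
    return np.load(io.BytesIO(blob))


# Called whenever an ndarray is passed as a query parameter.
sqlite3.register_adapter(np.ndarray, adapt_array)
# Called for columns whose declared type is "array".
sqlite3.register_converter("array", convert_array)

# PARSE_DECLTYPES makes sqlite3 look at the declared column type on SELECT.
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
cur = con.cursor()
cur.execute("CREATE TABLE test (arr array)")

x = np.arange(12).reshape(2, 6)
cur.execute("INSERT INTO test (arr) VALUES (?)", (x,))

cur.execute("SELECT arr FROM test")
data = cur.fetchone()[0]  # already an ndarray thanks to the converter
con.close()
```

Because np.save stores the dtype and shape alongside the data, the array comes back with its original dimensions rather than as a flat buffer.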
Happy Leap Second has it close, but I kept getting an automatic cast to string. Also, if you check out this other post (a fun debate on using buffer or Binary to push non-text data into sqlite), you will see that the documented approach is to avoid buffer altogether and use this chunk of code.
I haven't heavily tested this in Python 3, but it seems to work in Python 2.7.
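One way to read the Binary approach, sketched under assumptions: the array is written with np.save and the resulting bytes are wrapped in sqlite3.Binary before being bound to a plain BLOB column. The helper names array_to_blob and blob_to_array are hypothetical, and this version targets Python 3.

```python
import io
import sqlite3

import numpy as np


def array_to_blob(arr):
    """Serialize the array and wrap the bytes so sqlite3 binds them as a BLOB."""
    buf = io.BytesIO()
    np.save(buf, arr)
    return sqlite3.Binary(buf.getvalue())


def blob_to_array(blob):
    """Inverse of array_to_blob: read the .npy bytes back into an ndarray."""
    return np.load(io.BytesIO(bytes(blob)))


con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE foobar (id INTEGER PRIMARY KEY, array BLOB)")

original = np.arange(0, 500, 0.5)  # the 1000 floats from the question
cur.execute("INSERT INTO foobar VALUES (?, ?)", (None, array_to_blob(original)))

cur.execute("SELECT array FROM foobar WHERE id = 1")
restored = blob_to_array(cur.fetchone()[0])
con.close()
```

Doing the conversion explicitly means no adapter or converter has to be registered, at the cost of calling the two helpers on every insert and select.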
This works for me:
I think that the matlab format is a really convenient way to store and retrieve numpy arrays. It is really fast, and the disk and memory footprints are about the same (image from mverleg benchmarks).
But if for any reason you need to store the numpy arrays in SQLite, I suggest adding some compression capabilities.
The extra lines added to unutbu's code are pretty simple:
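A sketch of what adding compression to the adapter and converter might look like, assuming zlib from the standard library. The declared type name compressed_array and the table name images are hypothetical, and the all-zero test array only stands in for real image data.

```python
import io
import sqlite3
import zlib

import numpy as np


def adapt_array(arr):
    """Serialize with np.save, then compress the bytes before storing."""
    out = io.BytesIO()
    np.save(out, arr)
    return zlib.compress(out.getvalue())


def convert_array(blob):
    """Decompress the stored bytes, then load them back into an ndarray."""
    return np.load(io.BytesIO(zlib.decompress(blob)))


sqlite3.register_adapter(np.ndarray, adapt_array)
sqlite3.register_converter("compressed_array", convert_array)

con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
cur = con.cursor()
cur.execute("CREATE TABLE images (arr compressed_array)")

# Placeholder data: 100 blank 28x28 images (highly compressible).
x = np.zeros((100, 28, 28), dtype=np.uint8)
cur.execute("INSERT INTO images (arr) VALUES (?)", (x,))

cur.execute("SELECT arr FROM images")
restored = cur.fetchone()[0]

# length() on an expression has no declared type, so it returns a plain int.
cur.execute("SELECT length(arr) FROM images")
stored_bytes = cur.fetchone()[0]
con.close()
```

Swapping zlib.compress and zlib.decompress for bz2.compress and bz2.decompress gives the bz2 variant; how much space either saves depends heavily on the data.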
The results from testing with the MNIST database were:
using zlib, and using bz2
Comparing the Matlab V5 format with bz2 on SQLite, the bz2 compression ratio is around 2.8, but the access time is quite long compared to the Matlab format (almost instantaneous vs. more than 30 secs). Maybe it is worthwhile only for really huge databases where the learning process is much more time-consuming than the access time, or where the database footprint needs to be as small as possible.
Finally, note that the bz2/zlib ratio is around 3.7 and zlib/matlab requires 30% more space.
The full code, if you want to play with it yourself, is: