Using SQLite3 in Python, I am trying to store a compressed version of a snippet of UTF-8 HTML code.
Code looks like this:
...
c = connection.cursor()
c.execute('create table blah (cid integer primary key,html blob)')
...
c.execute('insert or ignore into blah values (?, ?)',(cid, zlib.compress(html)))
At which point at get the error:
sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.
If I use 'text' rather than 'blob' and don't compress the HTML snippet, it works all fine (db is to large though). When I use 'blob' and compress via Python zlib library, I get the above error message. I looked around but couldn't find a simple answer for this one.
If you want to use 8-bit strings instead of unicode string in sqlite3, set approptiate text_factory for sqlite connection:
connection = sqlite3.connect(...)
connection.text_factory = str
Found the solution, I should have spent just a little more time searching.
Solution is to 'cast' the value as a Python 'buffer', like so:
c.execute('insert or ignore into blah values (?, ?)',(cid, buffer(zlib.compress(html))))
Hopefully this will help somebody else.
In order to work with the BLOB type, you must first convert your zlib compressed string into binary data - otherwise sqlite will try to process it as a text string. This is done with sqlite3.Binary(). For example:
c.execute('insert or ignore into blah values (?, ?)',(cid,
sqlite3.Binary(zlib.compress(html))))
You could store the value using repr(html) instead of the raw output and then use eval(html) when retrieving the value for use.
c.execute('insert or ignore into blah values (?, ?)',(1, repr(zlib.compress(html))))
Syntax:
5 types of possible storage: NULL, INTEGER, TEXT, REAL and BLOB
BLOB is generally used to store pickled models or dill pickled models
> cur.execute('''INSERT INTO Tablename(Col1, Col2, Col3, Col4) VALUES(?,?,?,?)''',
[TextValue, Real_Value, Buffer(model), sqlite3.Binary(model2)])
> conn.commit()
> # Read Data:
> df = pd.read_sql('SELECT * FROM Model, con=conn)
> model1 = str(df['Col3'].values[0]))
> model2 = str(df['Col'].values[0]))