How to avoid encoding warning when inserting binar

Posted 2020-05-09 01:19

Question:

I'm inserting binary data into a longblob column in MySQL using MySQLdb from Python 2.7, and I get an encoding warning that I don't know how to get around:

./test.py:11: Warning: Invalid utf8 character string: '8B0800'
  curs.execute(sql, (blob,))

Here is the table definition:

CREATE TABLE test_table (
  id int(11) NOT NULL AUTO_INCREMENT,
  gzipped longblob,
  PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

And the test code:

#!/usr/bin/env python

import sys
import MySQLdb

blob = open("/tmp/some-file.gz", "rb").read()
sql = "INSERT INTO test_table (gzipped) VALUES (%s)"

conn = MySQLdb.connect(db="unprocessed", user="some_user", passwd="some_pass", charset="utf8", use_unicode=True)
curs = conn.cursor()
curs.execute(sql, (blob,))
conn.commit()  # InnoDB is transactional; without a commit the insert is rolled back on disconnect

I've searched here and elsewhere for the answer. Many questions look like what I'm after, but the posters don't appear to be having encoding issues.

Questions:

  1. What is causing this warning?
  2. How do I get rid of it?

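Answer 1:

After some more searching I found the answers.

  1. The warning is generated by the MySQL server itself, not by MySQLdb. The connection character set is utf8, so the server checks the inserted string literal as utf8, and gzip data is not valid utf8.
  2. It can be avoided by putting the _binary introducer before the binary parameter, which tells the server to treat the value as a binary string.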

https://bugs.mysql.com/bug.php?id=79317
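To see why gzip data trips a utf8 check, here is a small sketch that is not part of the original answer. It assumes the blob starts with the standard gzip header bytes (1f 8b 08 00) and uses a hypothetical helper, first_invalid_utf8_offset, to find where decoding fails. The failing byte is 0x8b, which seems to line up with the '8B0800' shown in the warning, although how the server formats that message is an assumption on my part.

```python
# First four bytes of a typical gzip file: magic (1f 8b), method (08), flags (00).
GZIP_HEADER = b"\x1f\x8b\x08\x00"


def first_invalid_utf8_offset(data):
    """Return the offset of the first byte that is not valid UTF-8, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first undecodable byte.
        return exc.start
    return None


if __name__ == "__main__":
    # 0x1f is plain ASCII; 0x8b cannot start a UTF-8 sequence.
    print(first_invalid_utf8_offset(GZIP_HEADER))
```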

So the Python code needs to be updated as follows:

sql = "INSERT INTO test_table (gzipped) VALUES (_binary %s)"
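Putting it together, a minimal sketch of the insert could look like the following. The helper name insert_gzipped is my own and not from the original post; it takes any connection object that behaves like the one returned by MySQLdb.connect in the question, and it assumes the cursor exposes lastrowid, as MySQLdb cursors do.

```python
# The _binary introducer marks the parameter as a binary string,
# so the server does not validate it against the connection charset.
INSERT_SQL = "INSERT INTO test_table (gzipped) VALUES (_binary %s)"


def insert_gzipped(conn, blob):
    """Insert raw bytes into test_table.gzipped and return the new row id.

    `conn` is assumed to be a DB-API connection such as the one
    created with MySQLdb.connect(...) in the question.
    """
    curs = conn.cursor()
    try:
        curs.execute(INSERT_SQL, (blob,))
        row_id = curs.lastrowid
    finally:
        curs.close()
    # Commit explicitly; the table uses the transactional InnoDB engine.
    conn.commit()
    return row_id
```

With the connection from the question, this would be called as insert_gzipped(conn, blob), where blob is the content read from /tmp/some-file.gz.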