I am trying to insert some Arabic word into the arabic_word
column of my hanswehr2
database Maria DB using the MySQLdb driver.
I was getting a latin-1 encode error
. But after reading around, I found out that the MySQLdb driver was defaulted to latin-1
and I had to explicitly set utf-8
as my charset of choice at the mariadb.connect()
function. Sauce.
The entire database is set to utf-8.
Code:
def insert_into_db(arabic_word, definition):
try:
conn = mariadb.connect('localhost', 'root', 'xyz1234passwd', 'hans_wehr', charset='utf-8', use_unicode=True)
conn.autocommit(True)
cur = conn.cursor()
cur.execute("INSERT INTO hanswehr2 (arabic_word , definition) VALUES (%s,%s)", (arabic_word, definition,))
except mariadb.Error, e:
print e
sys.exit(1)
However now I get the following error:
/usr/bin/python2.7 /home/heisenberg/hans_wehr/main.py
Total lines 87672
(2019, "Can't initialize character set utf-8 (path: /usr/share/mysql/charsets/)")
Process finished with exit code 1
I have specified the Python MySQL driver to use the utf-8 character however it seems to ignore this.
Any inputs would be highly appreciated.
The charset alias for UTF-8 in MySQL is
utf8
(no hyphen).See https://dev.mysql.com/doc/refman/5.5/en/charset-charsets.html for available charsets.
Note, if you need to use non-BMP Unicode points, such as emojis, use
utf8mb4
for the connection charset and the varchar type.There is a thing called collations that helps encode/decode characters for specific languages. https://softwareengineering.stackexchange.com/questions/95048/what-is-the-difference-between-collation-and-character-set
I think u need to specify it when creating your database table or in the connection string. refer this: store arabic in SQL database
More on python mysql connection : https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlconnection-set-charset-collation.html