I'm using import.sql to write my development data to DB. I'm using MySQL Server 5.5 and my persistence.xml is here:
<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.0"
xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_2_0.xsd">
<persistence-unit name="MobilHM" transaction-type="RESOURCE_LOCAL">
<provider>org.hibernate.ejb.HibernatePersistence</provider>
<class>tr.com.stigma.db.entity.Doctor</class>
<class>tr.com.stigma.db.entity.Patient</class>
<class>tr.com.stigma.db.entity.Record</class>
<class>tr.com.stigma.db.entity.User</class>
<properties>
<property name="hibernate.hbm2ddl.auto" value="create" />
<property name="hibernate.show_sql" value="true" />
<property name="hibernate.format_sql" value="true" />
<!-- Auto detect annotation model classes -->
<property name="hibernate.archive.autodetection" value="class" />
<!-- Datasource -->
<property name="hibernate.connection.driver_class" value="com.mysql.jdbc.Driver" />
<property name="hibernate.connection.username" value="mobilhm" />
<property name="hibernate.connection.password" value="mobilhm" />
<property name="hibernate.connection.url" value="jdbc:mysql://localhost/mobilhm" />
<property name="hibernate.dialect" value="org.hibernate.dialect.MySQLDialect" />
</properties>
</persistence-unit>
Some characters in my import.sql is not shown correctly in DB. For example character ü becomes ü in db. Default charset in mysql is utf-8 and I'm creating tables like
CREATE TABLE doctor (doctorId int unsigned NOT NULL AUTO_INCREMENT, name varchar(45) NOT NULL, surname varchar(45) NOT NULL, PRIMARY KEY (doctorId)) ENGINE=InnoDB DEFAULT CHARSET=utf8;
It is weird that if I import using Mysql import/export manager data is correct but using hibernate.hbm2ddl.auto=create makes characters corrupted.
How can I solve this?
Edit: Also I've tried adding
<property name="hibernate.connection.useUnicode" value="true" />
<property name="hibernate.connection.characterEncoding"
value="UTF-8" />
<property name="hibernate.connection.charSet" value="UTF-8" />
to persistence.xml. But it didn't help.
Fix: I've solved it eventually. I'm using Tomcat and that is the point of corruption not hibernate or mysql. I've started it with set JAVA_OPTS=-Dfile.encoding=UTF-8 command and my problem goes away.
The title of question became misleading now. Sorry for that.
Here's a reliable solution without setting any system property.
We assume that import files are encoded with
UTF-8
but Java default charset is different, let's saylatin1
.1) Define a custom class for import_files_sql_extractor hibernate.hbm2ddl.import_files_sql_extractor=com.pragmasphere.hibernate.CustomSqlExtractor
2) fix the invalid strings read by hibernate in the implementation.
You can change value of
SOURCE_CHARSET
with another encoding used by import files.When creating the reader for that file, Hibernate uses
new InputStreamReader(stream);
directly, without explicit encoding (the default execution platform charset encoding is assumed/used).So, in other words, your
import.sql
file must be in the default execution platform charset encoding.There is an old (2006!) open issue for this, in case one wishes to send a patch: https://hibernate.atlassian.net/browse/HBX-711
Options to fix:
Add
-Dfile.encoding=UTF-8
to theJAVA_OPTS
environment variable, such as:Set a property in your Maven plugin (could be
surefire
,failsafe
or other, depending on how do you run the code that imports the hibernate file). Example forsurefire
:If gradle: To add this property in gradle add
systemProperty 'file.encoding', 'UTF-8'
to task configuration block. (Thanks @meztihn)I'm using import.sql to populate database on test phase and this link has helped me to solve encoding problem: http://javacimrman.blogspot.ru/2011/07/hibernate-importsql-encoding-when.html.