I'm developing a DataSnap server that make some query to db MySQL 5.1, tables are UTF8 and I saw correct latin, chinese, japanese, russian, greek, hindi text in tables.
When query the table I recieve "?????..." as response.
I have set names UTF8:
SQLConnection1.Connected:= True;
SQLConnection1.ExecuteDirect('set names utf8;');
then I do this:
with DMMySQL.SQLDataSet1 do
begin
if Locate('COD_GOOGLE', LanguageValue, []) then
Result:= UTF8Decode(FieldByName('DESCRIPTION').AsString); //
end;
and Result always is "?????...." with or without UTF8Decode
.
Where I'm wrong ?
TIA
When you set the character set for a field/table, usually this is meant to ensure data are stored without lossy conversions, a proper collation is used, and some encondings may save space also. For example UTF-8 will store efficiently English strings (and most Western languages), and will be increasingly less efficient with other languages, compared to UTF-16.
Usually the driver - if set correctly - will translate the string from the database character set to the client application character set. You don't say what you're using to access MySQL (dbExpress? ODBC? Other?). Because Delphi Unicode strings are UTF-16, I guess the driver by default will tell MySQL to convert data to and from the UTF-16 enconding used by Delphi, regardless whatever encoding is used to store them inside MYSQL table.
If you tell MySQL to return data in UTF-8 and the driver or Delphi database code still store those data anyway into a UTF-16 string without converting it, you won't get the expected result.
What happens if you do not set a character set, or use a UTF-16 encoding?
Update: RAD Studio help says (ms-help://embarcadero.rs2010/rad/DbExpress_Data_Type_Mapping_for_Supported_Databases.html and http://docwiki.embarcadero.com/RADStudio/en/DbExpress_Data_Type_Mapping_for_Supported_Databases#MYSQL):
dbExpress has the following MYSQL data type mappings:
char TDBXDataTypes.WideStringType if the server uses UTF-8; otherwise
TDBXDataTypes.AnsiStringType
varchar TDBXDataTypes.WideStringType if the server uses UTF-8; otherwise TDBXDataTypes.AnsiStringType
Thereby if the server uses UTF-8 dbExpress should handle data in Delphi UTF-16, otherwise in an ANSI encoding. What kind of TField your application returns?
Did you try FieldByName('unicodefield').AsWideString
? IIRC dbExpress returns AnsiString with AsString.