Delphi 2010: how do I convert a UTF8-encoded PAnsi

Posted 2019-07-19 06:02

Question:

The situation: I have an external DLL that uses UTF-8 as its internal string format. Its interface functions all use PAnsiChar to pass strings.

The rest of my application uses Delphi’s native string type; since I’m working with Delphi 2010, string maps to UnicodeString.

How can I reliably convert those PAnsiChar arguments (which point to UTF-8 encoded strings) to a UnicodeString?

I had this function, which I thought worked fine:

function PUTF8CharToString(Text: PAnsiChar): string;
var
  UText: UTF8String;
begin
  UText := UTF8String(Text);
  Result := string(UText);
end;

...but now I’ve run into a case where the result string is corrupted. When I save the PAnsiChar to a file, it’s fine, but when I save the string produced by the above function, it’s corrupted.

Or should this work correctly, in which case the corruption points to some other memory (de)allocation problem?


Edit: I finally managed to get rid of the memory corruption by assigning the converted string to a local string variable instead of passing it directly to another function.

Answer 1:

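The System unit declares an overload that takes a PAnsiChar:

function UTF8ToUnicodeString(const S: PAnsiChar): UnicodeString; overload;

Call it directly on the pointer you get from the DLL:

UnicodeStr := System.Utf8ToUnicodeString(Text);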
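
As a minimal sketch, the helper from the question could simply delegate to that overload. The name PUTF8CharToString is taken from the question. Whether this also removes the corruption described there is an assumption, since the cause was never pinned down in the thread.

```pascal
// Sketch only: the question's helper rewritten to delegate to the System overload.
function PUTF8CharToString(Text: PAnsiChar): string;
begin
  // Decode the null-terminated UTF-8 data that Text points to into a UnicodeString.
  Result := System.UTF8ToUnicodeString(Text);
end;
```

This keeps the existing call sites unchanged while dropping the intermediate UTF8String variable and both casts.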



Answer 2:

Try using SetString() to fill the UTF8String from the pointer instead of casting:

function PUTF8CharToString(Text: PAnsiChar): string;
var
  UText: UTF8String;
begin
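  // Copy StrLen(Text) bytes from Text into UText
  SetString(UText, Text, StrLen(Text));
  // Assigning a UTF8String to string converts UTF-8 to UTF-16
  Result := UText;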
end;