How to decode base64 unicode string using T-SQL

2019-08-11 20:37发布

问题:

Can't decode turkish characters in base64 string.

Base64 string = "xJ/DvGnFn8Onw7bDlsOHxLDEnsOcw5w="

When I decode it must be like this : 'ğüişçöÖÇİĞÜÜ' I try to decode like this :

SELECT CAST(
          CAST(N'' AS XML).value('xs:base64Binary("xJ/DvGnFn8Onw7bDlsOHxLDEnsOcw5w=")' , 'VARBINARY(MAX)') 
          AS NVARCHAR(MAX)
           ) UnicodeEncoding ;

Based on this answer : Base64 encoding in SQL Server 2005 T-SQL

But have response like this : '鿄볃앩쎟쎧쎶쎖쒇쒰쎞쎜'

Base64 string is correct because when I try decode in Base64decode.org it works.

Is there any way to decode turkish characters?

回答1:

Your base-64 encoded data contains an UTF-8 string. MS SQL doesn't support UTF-8, only UTF-16, so it fails for any characters outside of ASCII.

The solution is to either send the data as nvarchar right away, or to encode the string as UTF-16 (and send it as varbinary or base-64, as needed).

Based on Erlang documentation, this might require an external library, unicode: http://www.erlang.org/doc/apps/stdlib/unicode_usage.html

Basically, the default seems to be UTF-8, you need to specify UTF-16 manually. UTF-16 support seems a bit clunky, but it should be quite doable.