I am trying to convert a database column DATA
from varbinary()
to varchar(max)
in SQL Server 2012.
I am using this code to handle the conversion:
SELECT CONVERT(VARCHAR(MAX), DATA) FROM [dbo].[TABLE_NAME]
and the resulting row is as follows :
VW 6501 Çamaşır
I am having trouble with language specific characters (language is Turkish in my case for now)
How do I get over this encoding problem in SQL Server 2012?
Is there a generic way to do this conversion for any language, considering loss of data/encoding problems for any given language?
This may sound like a rookie question but I really would appreciate any suggestions or answer.
Thank you,
In general, SQL Server does not hold UTF-8 in high regard.
However, .NET has methods to do this and you can get at them via CLR integration.
Compile this using C#:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
namespace UtfLib
{
public static class UtfMethods
{
[SqlFunction(IsDeterministic = true, IsPrecise = true)]
public static SqlBinary NVarCharToUtf8(SqlString inputText)
{
if (inputText.IsNull)
return new SqlBinary(); // (null)
return new SqlBinary(Encoding.UTF8.GetBytes(inputText.Value));
}
[SqlFunction(IsDeterministic = true, IsPrecise = true)]
public static SqlString Utf8ToNVarChar(SqlBinary inputBytes)
{
if (inputBytes.IsNull)
return new SqlString(); // (null)
return new SqlString(Encoding.UTF8.GetString(inputBytes.Value));
}
}
}
Import the assembly into your database and create the external functions:
CREATE ASSEMBLY UtfLib
FROM 'C:\UtfLib.dll'
GO
CREATE FUNCTION NVarCharToUtf8 (@InputText NVARCHAR(MAX))
RETURNS VARBINARY(MAX)
AS EXTERNAL NAME UtfLib.[UtfLib.UtfMethods].NVarCharToUtf8
GO
CREATE FUNCTION Utf8ToNVarChar (@InputBytes VARBINARY(MAX))
RETURNS NVARCHAR(MAX)
AS EXTERNAL NAME UtfLib.[UtfLib.UtfMethods].Utf8ToNVarChar
Last step, you have to enable clr
sp_configure 'clr enabled',1
GO
RECONFIGURE
GO
sp_configure 'clr enabled' -- make sure it took
GO
and voilà!
SELECT dbo.Utf8ToNVarChar(DATA) FROM [dbo].[TABLE_NAME]
try NVARCHAR
Select convert (NVARCHAR(max),DATA) FROM [dbo].[TABLE_NAME]
EDIT:
SQL Server does not support handling of UTF8. To convert your binary value you have to write you own function. The easiest way to do that is to use CLR.
You can use this as a model:
How to return an nvarchar(max) in a CLR UDF?
Just replace the regex code with the Encoding.UTF8.GetString