I have to generate xml file for the format of encoding="UTF-8. I used bcp queryout. xml file was generating. But my issue was the xml file validation fail. could any one help me to resolve this? This is my code.
DECLARE @xmlBody AS VARCHAR(MAX)= '<?xml version="1.0" encoding="UTF-8"?>'+ CAST(@xmlStr AS VARCHAR(MAX))
INSERT INTO [dbo].[MasterXml]
([PurchaseOrderID] ,[Code])
values (@PurchaseOrderID, @xmlBody)
Declare @command varchar(8000)
SET @command= 'bcp "SELECT TOP 1 [Code] from [tec_Dev].[dbo].[MasterXml] where PurchaseOrderID='+
CAST( @PurchaseOrderID As varchar(20))+'" queryout '
+@uploadFolder + CAST(@PurchaseOrderID AS varchar(20))+'.xml' +' -T -N -w -c -C65001'
print @command
EXEC xp_cmdshell @command
I found solution
+@uploadFolder + CAST(@PurchaseOrderID AS varchar(20))+'.xml' +' -T -N -w -c -C65001'
I remove the -w from the code. then it works. But it works only for
Microsoft SQL Server 2014 (SP2-CU6) (KB4019094) - 12.0.5553.0 (X64).
https://support.microsoft.com/en-us/help/3136780/utf-8-encoding-support-for-the-bcp-utility-and-bulk-insert-transact-sq
Looking at your code I get this is [sql-server]
...
SQL-Server's VARCHAR
-type is 1-byte-encoded extended ASCII (depending on the associated COLLATION
) and not UTF-8
...
Whenever SQL-Server is talking about unicode it is the native NVARCHAR
or NCHAR
type, which is a 2-byte-encoded string (UCS-2
actually).
Furthermore you are stating some switches
-N
: Use the native types in any case, use unicode for characters.
-w
: Explicit switch to use unicode
-C
: with page 65001 (utf-8
), while the documentation explicitly tells you, that this codepage is not supported!.
If the rest is okay, your code should place a file on disk which is 2-byte-unicode, but is yelling: "I'm UTF-8!". Try the same with UTF-16
in the XML's declaration. Alternatively you can use any other tool to convert SQL-Server's output to the encoding of your choice.
I'd tend to C#: Read the XML directly into a .NET-string (always unicode!) from SQL-Server (no BCP at all) and write the .Net-string to a file using the proper encoding.
UPDATE
At the given link (BCP documentation for SQL Server 2014) there is still
** Important ** SQL Server does not support code page 65001 (UTF-8 encoding).
But for version 2016+ you can find this:
Versions prior to version 13 ( SQL Server 2016 (13.x)) do not support
code page 65001 (UTF-8 encoding). Versions beginning with 13 can
import UTF-8 encoding to earlier versions of SQL Server.
But 2014-SP2 adds this to v2014 actually (KB-Article 3136780). So the documentation is not covering this backward enhancement.
It should work to export UTF-8 using this combination without the -N -w
:
-c -C65001