I have a query that generates XML files and loads them to FTP with <?xml version="1.0"?>
.
I need to switch encoding to UTF-8 as follows:
<?xml version="1.0" encoding="utf-8"?>
I can do it manually in the text editor. But cannot do it in SQL Server.
I also read this article but that did not contribute in finding solution for the issue.
https://docs.microsoft.com/en-us/sql/relational-databases/xml/create-instances-of-xml-data
My code:
USE [Audit_DBA]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- to call the procedure use the code below but assign your own path:
-- EXEC [Audit_DBA].[dbo].[StateAndLocalAward_ToXML] 'C:\NLP\Test\NewFolder'
CREATE PROCEDURE [dbo].[StateAndLocalAward_ToXML_VC]
@FileDestinationDir VARCHAR(2000)
AS
DECLARE @FilePath VARCHAR(4000)
DECLARE @SQLStr VARCHAR(8000),
@Cmd VARCHAR(1000),
@Ret INT
IF EXISTS (SELECT * FROM InputTemp.SYS.TABLES WHERE NAME = 'StateAndLocalAward_output')
DROP TABLE InputTemp.dbo.StateAndLocalAward_output;
;WITH CTE_StateAndLocalAward_output AS
(
SELECT TOP 200
sc.stateContractId,
ca.OnviaId AS [Reference], --AS OnviaID,
ca.contractTitle AS [Title],
nigp.NIGPCodeTx AS [NIPGCode],
nigp.NIGPDescriptionLongTx AS [NIPGDescription],
ca.Description,
vm.parentName AS [VendorName],
offering.offeringTx AS [PrimaryOffering],
ca.StartDate,
ca.AwardDt
FROM
ISCore.dbo.StateContract sc WITH (NOLOCK)
INNER JOIN
ISCore.dbo.ContractAward ca WITH (NOLOCK) ON sc.contractAwardId = ca.contractAwardID
INNER JOIN
ISLibrary.dbo.VendorMaster vm WITH (NOLOCK) ON ca.vendorId = vm.vendorID
LEFT OUTER JOIN
ISCore.dbo.CompanyProfile cp WITH (NOLOCK) ON ca.vendorId = cp.vendorId
LEFT OUTER JOIN
ISCore.dbo.Offering offering WITH (NOLOCK) ON ca.offeringID = offering.offeringID
LEFT OUTER JOIN
ISCore.dbo.NIGPSrvc nigp WITH (NOLOCK) ON ca.NIGPCode = nigp.NIGPCodeTx
WHERE
vm.showUnverified = 1 AND sc.stateContractId = -- 464482 stateContractId
AND StartDate >= DATEADD(month, -12, GETDATE())
)
SELECT
*
INTO
InputTemp.dbo.StateAndLocalAward_output
FROM
CTE_StateAndLocalAward_output;
--select * from InputTemp.dbo.StateAndLocalAward_output
DECLARE @StateContractId INT;
--DECLARE @xml XML;
DECLARE Cur_StateAndLocalAward_StateContractId CURSOR FOR
SELECT StateContractId
FROM inputtemp.dbo.StateAndLocalAward_output t1 WITH (NOLOCK)
OPEN Cur_StateAndLocalAward_StateContractId;
FETCH NEXT FROM Cur_StateAndLocalAward_StateContractId INTO @StateContractId -- @xml;
WHILE @@FETCH_STATUS = 0
BEGIN
SELECT
@SQLStr = 'SELECT Body.[Reference], Body.[Title], Body.[NIPGCode], Body.[NIPGDescription], Body.[Description], Body.[VendorName], Body.[PrimaryOffering] FROM InputTemp.dbo.StateAndLocalAward_output AS Body where StateContractId = ''' + str(@StateContractId) + ''' FOR XML AUTO, ELEMENTS;'
--SELECT N'<?xml version="1.0" encoding="UTF-8"?>'
SELECT
CAST((SELECT N'<?xml version="1.0" encoding="UTF-8"?>' + (@SQLSTr)) as varchar(8000) /*as XML*/) as SQLStr
-- select CAST((SELECT N'<?xml version="1.0" encoding="UTF-8"?>' +
-- (@SQLSTr)) as XML) as SQLStr
If I run this code instead of the same above but cast as XML type
, I get an error:
Msg 9402, Level 16, State 1, Procedure StateAndLocalAward_ToXML, Line 86 [Batch Start Line 10]
XML parsing: line 1, character 38, unable to switch the encoding
select @FilePath = @FileDestinationDir+'\NewFolder'+ltrim(rtrim(str(@StateContractId)))+'.xml'
select @Cmd = ' bcp " ' + @SQLStr + '" queryout '+@FilePath+' -w -r "" -T -S ' +@@ServerName
exec @Ret = master.dbo.xp_cmdshell @Cmd
FETCH NEXT FROM Cur_StateAndLocalAward_StateContractId INTO @StateContractId -- @xml;
END
CLOSE Cur_StateAndLocalAward_StateContractId;
DEALLOCATE Cur_StateAndLocalAward_StateContractId;
GO
There are some things to know:
<?xml blah ?>
) to a native SQL-Server XML typed variable or column. This will either fail ("...switch the encoding") or the xml-declaration will disappear.NVARCHAR(MAX)
. But you cannot re-cast (re-convert) this to an XML without failing or losing the declaration.UCS-2
(very close toUTF-16
) in any case.VARCHAR
(CHAR
) type is notutf-8
but extended ASCII (depending on aCOLLATION
)encoding="x"
and store the file with an encoding of y - but you shouldn't.utf-8
declaration to XML when it isVARCHAR
and it will cast a string withutf-16
when it isNVARCHAR
, but you cannot cross this (Read this related answer). Other encodings very likely lead to cannot switch the encoding error.About your code
@SQLStr
and@cmd
toNVARCHAR(MAX)
, othewise you might get in troubles with non-plain-latin characters.CURSOR
, you should fill an XML-typed variable with the result of your statement, cast this toNVARCHAR(MAX)
and add the declaration to this string. Do not cast the result back to XML.BCP
docs. Stating-w
will writeunicode
(wide), which is notutf-8
(what you write into the declaration has no effect here).Hint:
Read this related answer, showing utf-8 export with BCP on SQL-Server 2016