A piece of tsql code doesnt behave the same from production to Test environment. When the code below is executed on prod it brings back data
SELECT [col1xml]
FROM [DBName].[dbo].[Table1] (NOLOCK)
WHERE (cast([col1xml] as xml).value('(/Payment/****/trn1)[1]','nvarchar(20)') ='123456'))
However that same code brings back the below error when ran in Test.
Msg 9402, Level 16, State 1, Line 9 XML parsing: line 1, character 38, unable to switch the encoding
I have seen the fix provided by this site of conversion of UTF and this works in both prod and test. See below. However i need to provide an answer to the developers of why this behavior is occurring and a rationale why they should change their code(if that is the case)
WHERE CAST(
REPLACE(CAST(col1xml AS VARCHAR(MAX)), 'encoding="utf-16"', 'encoding="utf-8"')
AS XML).value('(/Payment/****/trn1)[1]','NVARCHAR(max)') ='123456')
I have compared both DB's and looked for anything obvious such as ansi nulls and ansi padding. Everything is the same and the version of SQL Server. This is SQL SERVER 2012 11.0.5388 version. Data between environments is different but the table schema is identical and the data type for col1xml is ntext.
In SQL Server you should store XML in a column typed
XML
. This native type has a got a lot of advantages. It is much faster and has implicit validity checks.From your question I take, that you store your XML in
NTEXT
. This type is deprecated for centuries and will not be supported in future versions! You ought to change this soon!SQL-Server knows two kinds of strings:
CHAR
orVARCHAR
), which is extended ASCIIImportant: This is not UTF-8! Native UTF-8 support will be part of a coming version.
NCHAR
orNVARCHAR
), which is UTF-16 (UCS-2)If the XML has a leading declaration with an encoding (in most cases this is
utf-8
orutf-16
) you can get into troubles.If the XML is stored as 2-byte-string (at least the
NTEXT
tells me this), the declaration has to beutf-16
. With a 1-byte-string it should beutf-8
.The best (and easiest) was to ommit the declaration completely. You do not need it. Storing the XML in the appropriate type will kill this declaration automatically.
What you should do: Create a new column of type
XML
and shuffle all your XMLs to this column. Get rid of anyTEXT
,NTEXT
andIMAGE
columns you might have!The next step is: Be happy and enjoy the fast and easy going with the native XML type :-D
UPDATE Differences in environment
You write: Data between environments is different
The error happens here:
If your column would store the XML in the native type, you would not need a cast (which is very expensive!!) at all. But in your case this cast depends on the actual XML. As this is stored in
NTEXT
it is 2-byte-string. If your XML starts with a declaration stating a non-supported encoding (in most casesutf-8
), this will fail.Try this:
This works
This fails
Play around with
VARCHAR
andNVARCHAR
andutf-8
andutf-16
...