In SQL Server, what is the best way to determine i

2019-01-26 04:44发布

问题:

A third party component is filling up an nvarchar column in a table with some values. Most of the time it is a human-readable string, but occassionally it is XML (in case of some inner exceptions in the 3rd party comp).

As a temporary solution (until they fix it and use string always), I would like to parse the XML data and extract the actual message.

Environment: SQL Server 2005; strings are always less than 1K in size; there could be a few thousand rows in this table.


I came across a couple of solutions, but I'm not sure if they are good enough:

  1. Invoke sp_xml_preparedocument stored proc and wrap it around TRY/CATCH block. Check for the return value/handle.
  2. Write managed code (in C#), again exception handling and see if it is a valid string.

None of these methods seem efficient. I was looking for somethig similar to ISNUMERIC(): an ISXML() function. Is there any other better way of checking the string?

回答1:

I would like to parse the XML data and extract the actual message.

Perhaps it is not necessary to check for valid XML. You could check for the presence of the appropriate xml tag with charindex in a case statement and extract the error message using substring.

Here is a sample with a simplified XML string but I think you get the idea.

declare @T table(ID int, Col1 nvarchar(1000))

insert into @T values
(1, 'No xml value 1'),
(2, 'No xml value 2'),
(3, '<root><item>Text value in xml</item></root>')

select
  case when charindex('<item>', Col1) = 0
  then Col1
  else
    substring(Col1, charindex('<item>', Col1)+6, charindex('</item>', Col1)-charindex('<item>', Col1)-6)
  end  
from @T

Result:

No xml value 1
No xml value 2
Text value in xml


回答2:

You could create an XML schema and use it to validate against the XML strings.

See here for additional info : http://msdn.microsoft.com/en-us/library/ms176009.aspx

Here's an example:

CREATE XML SCHEMA COLLECTION UserSchemaCollection AS 
N'<?xml version="1.0" encoding="UTF-16"?>
  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name = "User" >
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name = "UserID" />
                <xsd:element name = "UserName" />
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
  </xsd:schema>';


DECLARE @x XML(UserSchemaCollection)
SELECT @x = '<User><UserID>1234</UserID><UserName>Sebastian</UserName></User>'

Examples:

DECLARE @y XML(UserSchemaCollection)
SELECT @y = '<User><UserName>Sebastian</UserName></User>'

Msg 6965, Level 16, State 1, Line 2 XML Validation: Invalid content. Expected element(s):UserID where element 'UserName' was specified. Location: /:User[1]/:UserName[1]

DECLARE @z XML(UserSchemaCollection)
SELECT @z = 'Some text'

Msg 6909, Level 16, State 1, Line 2 XML Validation: Text node is not allowed at this location, the type was defined with element only content or with simple content. Location: /



回答3:

Based on the accepted answer, I created this to check for valid XML and optionally convert the input string to XML (or extract the required elements / attributes from the XML), because I figured out that TRY_CONVERT works succesfully in case you just pass in plain text, what I did not expect, so another check was required to prevent the final cast to XML I need to work in case the source column just holds some text (sample row 1):

declare @T table(ID int, Col1 nvarchar(1000))
insert into @T values
(1, 'random text value 1'),
(2, '<broken> or invalid xml value 2'),
(3, '<root><item>valid xml</item></root>')
select id, Col1,
 Converted_XML = CASE 
        when [Col1] IS NULL THEN NULL                   /* NULL stays NULL */ 
        when TRY_CONVERT(xml, [Col1]) is null THEN NULL /* Xml Document Error */
        when CHARINDEX('<', [Col1]) < 1 AND CHARINDEX('>', [Col1]) < 1 THEN NULL        /* no xml */
        else CONVERT(xml, [Col1])                       /* Parsing succesful. => in this case you can convert string to XML and/or extract the values */
    END,
    Result_Comment = CASE 
        when [Col1] IS NULL THEN 'NULL always stays NULL'
        when TRY_CONVERT(xml, [Col1]) is null THEN 'Xml Document Error'
        when CHARINDEX('<', [Col1]) < 1 AND CHARINDEX('>', [Col1]) < 1 THEN 'no xml'
        else [Col1]
    END
FROM @T ;


回答4:

I don't know about a best way, but here's a way:

DECLARE @table TABLE (myXML XML)

INSERT INTO @table
SELECT  
'
    <Employee>
        <FirstName>Henry</FirstName>
        <LastName>Ford</LastName>
    </Employee>
'

SELECT myXML 
FROM @table 
FOR XML RAW

If the XML is invalid it will throw an error:

DECLARE @table TABLE (myXML XML)

INSERT INTO @table
SELECT  
'
    <Employee
        <FirstName>Henry</FirstName>
        <LastName>Ford</LastName>
    </Employee>
'

SELECT myXML 
FROM @table 
FOR XML RAW

Just for clarification, all you have to do is cast it:

BEGIN TRY
    DECLARE @myXML XML
    SET @myXML = CAST
    ('
        <Employee>
            <FirstName>Henry</FirstName>
            <LastName>Ford</LastName>
        </Employee>
    ' AS XML)
    SELECT 'VALID XML'
END TRY
BEGIN CATCH
    SELECT 'INVALID XML'
END CATCH;

vs

BEGIN TRY
    DECLARE @myXML XML
    SET @myXML = CAST
    ('
        <Employee
            <FirstName>Henry</FirstName>
            <LastName>Ford</LastName>
        </Employee>
    ' AS XML)
    SELECT 'VALID XML'
END TRY
BEGIN CATCH
    SELECT 'INVALID XML'
END CATCH;