Read value from a complex XML structure using SQL

Posted 2019-06-25 15:56

Question:

I am trying to read a value in a SQL Server query out of an XML structure stored in a column of datatype ntext.

This is the XML structure from which I want to extract !!!VALUE TO READ!!!:

<PrinterProcessDef xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://dev.docuware.com/settings/workflow/processdef" Id="3e62848d-040e-4f4c-a893-ed85a7b2878a" Type="PrinterProcess" ConfigId="c43792ed-1934-454b-a40f-5f4dfec933b0" Enabled="true" PCId="2837f136-028d-47ed-abdc-4103bedce1d2" Timestamp="2016-08-08T09:44:38.532415">
  <Configs>
    <Config xmlns:q1="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q1:PrinterProcessConfig" Id="c43792ed-1934-454b-a40f-5f4dfec933b0" />
    <Config xmlns:q2="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q2:RecognizeActConfig" Id="b89a6fc2-5573-4034-978a-752c6c0de4cf">
      <q2:Header DefaultRecognitionTechnology="OCR" DefaultOCRSettingsGuid="00000000-0000-0000-0000-000000000000">
      </q2:Header>
      <q2:Body>
        <q2:AnchorDefs />
        <q2:ZoneDefs />
        <q2:TableDefs />
        <q2:FaceLayouts>
        </q2:FaceLayouts>
        <q2:FaceSamples>
        </q2:FaceSamples>
        <q2:SampleDocument>
          <MetaData xmlns="http://dev.docuware.com/settings/common" FileName="Test - Editor" MimeType="application/pdf" PageCount="1" SourceAppName="C:\Windows\system32\NOTEPAD.EXE" DocumentTitle="Test - Editor" PdfCreator="DocuWare Printer" />
          <Data xmlns="http://dev.docuware.com/settings/common">!!!VALUE TO READ!!!</Data>
        </q2:SampleDocument>
      </q2:Body>
      <q2:AllPagesRequired>false</q2:AllPagesRequired>
    </Config>
    <Config xmlns:q3="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q3:RecognizeActConfig" Id="db5b195d-79e4-4804-bd38-f4fc7e8d5a8d">
    </Config>
    <Config xmlns:q4="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q4:AddOverlayActConfig" Id="023aab08-c6e3-4f08-9d26-0175d1564ef2">
      <q4:Overlays />
    </Config>
    <Config xmlns:q5="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q5:PrintActConfig" Id="4a4ec06a-8652-4777-84d2-53cb862b3328">
    </Config>
    <Config xmlns:q6="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q6:SignActConfig" Id="8c030961-e68e-4c2f-83f1-cac20f51d4d6">
    </Config>
    <Config xmlns:q7="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q7:EmailActConfig" Id="5dbd144b-5c33-407a-b638-e062f9045fb4">
    </Config>
    <Config xmlns:q8="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q8:IndexActConfig" Id="f2a70e07-d76e-4e82-9313-7c665df4c311">
    </Config>
    <Config xmlns:q10="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q10:StoreActConfig" Id="ff8aec66-608e-4dde-a4b6-de65ada39bb0">
    </Config>
    <Config xmlns:q11="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q11:NotifyUserActConfig" Id="7ffb0437-6b8c-4f5f-8f40-434f4a6d609a" />
  </Configs>
  <Activities>
  </Activities>
</PrinterProcessDef>

And this is the SQL query I used:

SELECT 
    CAST([Table].[settings] as xml)
        .value('declare namespace q2="http://dev.docuware.com/settings/workflow/processconfig";
        (/PrinterProcessDef/Configs/Config[@xsi:type="q2:RecognizeActConfig"]/q2:Body/q2:SampleDocument/Data/text())[1]',
        'varchar(max)')
FROM 
    [DB].[dbo].[Table]

All I get back is NULL, not the hoped-for !!!VALUE TO READ!!!.

What should I do to get the query working?

I also tried other variants, including ones without the namespace declaration, but I always get NULL.

Answer 1:

All your elements are in namespaces. You need to declare those namespaces and qualify each step of the path according to their definitions:

SELECT CAST([Table].[settings] as xml).value(
   'declare namespace top="http://dev.docuware.com/settings/workflow/processdef";
    declare namespace q2="http://dev.docuware.com/settings/workflow/processconfig";
    declare namespace nd="http://dev.docuware.com/settings/common";
    (/top:PrinterProcessDef/top:Configs/top:Config[@xsi:type="q2:RecognizeActConfig"]/q2:Body/q2:SampleDocument/nd:Data)[1]',  
        'varchar(max)')
FROM [DB].[dbo].[Table]
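
To see why the original query returns NULL, here is a minimal sketch on a cut-down document. The element names and the urn:example:... namespace URIs are made up for illustration and are not part of the DocuWare structure. It assumes an untyped xml variable: an unprefixed name in the path then refers to an element in no namespace, so it never matches elements that sit under an xmlns="..." default namespace.

```sql
-- Hypothetical, cut-down document: one default namespace on the root,
-- a different default namespace on the innermost element.
DECLARE @xml xml = N'
<Root xmlns="urn:example:outer">
  <Item>
    <Data xmlns="urn:example:inner">hello</Data>
  </Item>
</Root>';

SELECT
    -- Unprefixed steps look for elements in no namespace, so nothing matches: NULL.
    @xml.value('(/Root/Item/Data/text())[1]', 'varchar(max)') AS WithoutNamespaces,
    -- Each step carries the prefix bound to the namespace in scope for that element.
    @xml.value('declare namespace o="urn:example:outer";
                declare namespace i="urn:example:inner";
                (/o:Root/o:Item/i:Data/text())[1]', 'varchar(max)') AS WithNamespaces;
```

The prefixes o and i exist only inside the query; only the namespace URIs have to match the document.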


Answer 2:

You forgot the namespaces declared with the xmlns attribute. Take a look at the following example:

DECLARE @xml xml = 'yourXml'

SELECT @xml.value('
declare namespace q2="http://dev.docuware.com/settings/workflow/processconfig";
declare namespace g="http://dev.docuware.com/settings/workflow/processdef";
declare namespace qd="http://dev.docuware.com/settings/common";
(//g:PrinterProcessDef/g:Configs/g:Config[@xsi:type="q2:RecognizeActConfig"]/q2:Body/q2:SampleDocument/qd:Data/text())[1]',
    'varchar(max)')


Answer 3:

However this XML was generated, its namespaces are rather strange: the same namespace is declared over and over under different prefixes. Unless I am misreading it, the namespaces are not used the way they should be, so I would simply ignore them:

SELECT 
    CAST([Table].[settings] as xml)
        .value('(/*:PrinterProcessDef/*:Configs/*:Config[@*:type="q2:RecognizeActConfig"]/*:Body/*:SampleDocument/*:Data/text())[1]',
        'varchar(max)')
FROM 
    [DB].[dbo].[Table]

In any case, I'd advise you to declare the namespaces in a WITH XMLNAMESPACES clause rather than inside the .value() call. If you ever need more than one value from this XML, the queries become much easier to read:

WITH XMLNAMESPACES(DEFAULT 'http://dev.docuware.com/settings/workflow/processdef'
                  ,'http://dev.docuware.com/settings/workflow/processconfig' AS q2
                  ,'http://dev.docuware.com/settings/common' AS nd)
SELECT 
    CAST([Table].[settings] as xml)
        .value('(/PrinterProcessDef/Configs/Config[@xsi:type="q2:RecognizeActConfig"]/q2:Body/q2:SampleDocument/nd:Data)[1]',
        'varchar(max)')
FROM 
    [DB].[dbo].[Table]

Btw: Using DEFAULT avoids a dummy prefix such as top: used in the other answers.
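
As a sketch of the "more than one value" case, the query below declares the namespaces once and reads several values from the matching Config element via .nodes(). The @xml variable holds a shortened copy of the document from the question so the example is self-contained. The column aliases, the c(cfg) alias and the chosen target types are assumptions for illustration.

```sql
-- Shortened copy of the document from the question (only the relevant branch).
DECLARE @xml xml = N'
<PrinterProcessDef xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xmlns="http://dev.docuware.com/settings/workflow/processdef">
  <Configs>
    <Config xmlns:q2="http://dev.docuware.com/settings/workflow/processconfig"
            xsi:type="q2:RecognizeActConfig" Id="b89a6fc2-5573-4034-978a-752c6c0de4cf">
      <q2:Body>
        <q2:SampleDocument>
          <MetaData xmlns="http://dev.docuware.com/settings/common" FileName="Test - Editor" PageCount="1" />
          <Data xmlns="http://dev.docuware.com/settings/common">!!!VALUE TO READ!!!</Data>
        </q2:SampleDocument>
      </q2:Body>
    </Config>
  </Configs>
</PrinterProcessDef>';

-- The namespaces are declared once and apply to every XQuery expression in the statement.
WITH XMLNAMESPACES(DEFAULT 'http://dev.docuware.com/settings/workflow/processdef'
                  ,'http://dev.docuware.com/settings/workflow/processconfig' AS q2
                  ,'http://dev.docuware.com/settings/common' AS nd)
SELECT
    -- Unprefixed attributes are in no namespace; DEFAULT only affects element names.
    cfg.value('@Id', 'uniqueidentifier') AS ConfigId,
    cfg.value('(q2:Body/q2:SampleDocument/nd:MetaData/@FileName)[1]', 'nvarchar(260)') AS FileName,
    cfg.value('(q2:Body/q2:SampleDocument/nd:MetaData/@PageCount)[1]', 'int') AS PageCount,
    cfg.value('(q2:Body/q2:SampleDocument/nd:Data/text())[1]', 'nvarchar(max)') AS SampleData
FROM @xml.nodes('/PrinterProcessDef/Configs/Config[@xsi:type="q2:RecognizeActConfig"]') AS c(cfg);
```

Against the table, the same pattern would need the cast result exposed as a column first, for example through CROSS APPLY (SELECT CAST([settings] AS xml)), before calling .nodes() on it. Note that the predicate compares the attribute text "q2:RecognizeActConfig" literally, so it depends on the prefix used inside the stored document, not on the q2 prefix declared in the query.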