Search for multiple values in an xml column

2019-08-17 04:23发布

问题:

Environment: SQL Server 2012. Primary and secondary (value) index is built on xml column.

Say I have a table Message with xml column WordIndex. I also have a table Word which has WordId and WordText. Xml for Message.WordIndex has the following schema:

<xs:schema attributeFormDefault="unqualified"
           elementFormDefault="qualified"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.example.com">
    <xs:element name="wi">
        <xs:complexType>
            <xs:sequence>
                <xs:element maxOccurs="unbounded" name="w">
                    <xs:complexType>
                        <xs:sequence>
                            <xs:element maxOccurs="unbounded" name="p" type="xs:unsignedByte" />
                        </xs:sequence>
                        <xs:attribute name="wid" type="xs:unsignedByte" use="required" />
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

and some data to go with it:

<wi xmlns="http://www.example.com">
  <w wid="1">
    <p>28</p>
    <p>72</p>
    <p>125</p>
  </w>
  <w wid="4">
    <p>89</p>
  </w>
  <w wid="5">
    <p>11</p>
  </w>
</wi>

I need to search for multiple values in my xml column WordIndex either using OR or AND. What I'm doing is fairly rudimentary, since I'm a n00b in XQuery (taken from debug output, hence real values):

with xmlnamespaces(default 'http://www.example.com')
select
    m.Subject,
    m.MessageId,
    m.WordIndex.query('
        let $dummy := 0
        return
            <word_list>
            {
                for $w in /wi/w
                    where $w/@wid=64
                    return <word wid="64" pos="{data($w/p)}"/>
            }
            {
                for $w in /wi/w
                    where $w/@wid=70
                    return <word wid="70" pos="{data($w/p)}"/>
            }
            {
                for $w in /wi/w
                    where $w/@wid=63
                    return <word wid="63" pos="{data($w/p)}"/>
            }
            </word_list>
        ') as WordPosition
from
    Message as m
-- more joins go here ...
where
    -- more conditions go here ...
    and m.WordIndex.exist('/wi/w[@wid=64]') = 1
    and m.WordIndex.exist('/wi/w[@wid=70]') = 1
    and m.WordIndex.exist('/wi/w[@wid=63]') = 1

How can this be optimized?

回答1:

Not sure I understand what your expected results are, but you could make this a bit more generic and data-driven (eg using sql:column or sql:variable). Try something like this:

declare @wids table ( wid INT PRIMARY KEY )

insert into @wids ( wid )
values ( 64 ), ( 70 ) 

;with xmlnamespaces(default 'http://www.example.com') 
select 
    m.Subject, 
    m.MessageId, 
    m.WordIndex.query(' 
        return 
            <word_list> 
            { 
                for $w in /wi/w
                where $w/@wid = sql:column("w.wid")
                return <word wid="{$w/@wid}" pos="{data($w/p)}"/> 
            } 
            </word_list> 
        ') as WordPosition 
from 
    Message as m
    cross apply @wids w 
where m.WordIndex.exist('wi/w[@wid=sql:column("w.wid")]') = 1