Environment: SQL Server 2012. Primary and secondary (value) index is built on xml column.
Say I have a table Message with xml column WordIndex. I also have a table Word which has WordId and WordText. Xml for Message.WordIndex has the following schema:
<xs:schema attributeFormDefault="unqualified"
elementFormDefault="qualified"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.example.com">
<xs:element name="wi">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="w">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="p" type="xs:unsignedByte" />
</xs:sequence>
<xs:attribute name="wid" type="xs:unsignedByte" use="required" />
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
and some data to go with it:
<wi xmlns="http://www.example.com">
<w wid="1">
<p>28</p>
<p>72</p>
<p>125</p>
</w>
<w wid="4">
<p>89</p>
</w>
<w wid="5">
<p>11</p>
</w>
</wi>
I need to search for multiple values in my xml column WordIndex either using OR or AND. What I'm doing is fairly rudimentary, since I'm a n00b in XQuery (taken from debug output, hence real values):
with xmlnamespaces(default 'http://www.example.com')
select
m.Subject,
m.MessageId,
m.WordIndex.query('
let $dummy := 0
return
<word_list>
{
for $w in /wi/w
where $w/@wid=64
return <word wid="64" pos="{data($w/p)}"/>
}
{
for $w in /wi/w
where $w/@wid=70
return <word wid="70" pos="{data($w/p)}"/>
}
{
for $w in /wi/w
where $w/@wid=63
return <word wid="63" pos="{data($w/p)}"/>
}
</word_list>
') as WordPosition
from
Message as m
-- more joins go here ...
where
-- more conditions go here ...
and m.WordIndex.exist('/wi/w[@wid=64]') = 1
and m.WordIndex.exist('/wi/w[@wid=70]') = 1
and m.WordIndex.exist('/wi/w[@wid=63]') = 1
How can this be optimized?