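Is there something I'm missing?
I want to create a table that is basically indexed on space-delimited (or any delimiter you like) integers. I understand that full-text search can't work on an INT column alone, because it relies on the space as the delimiter to split the data when it builds the index for the catalog.
I know, of course, that it does let me index varbinary data, but why not plain int data separated by spaces, rather than requiring a mix of integer and text data to search on? For example, a query such as
SELECT * FROM MyTable
WHERE CONTAINS(indexedColumn, '1189')
against a table with a full-text index/catalog defined on it that looks like: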
indexedColumn secondDelimitedIntColumn
1189 34 34209 1989 3 5
is not possible, but
SELECT * FROM MyTable
WHERE CONTAINS(uniqueColumn, 'a1189')
will work against a full-text index on a table with the following columns:
uniqueColumn secondDelimitedIntColumn
a1189 b34 b34209 b1989 b3 b5
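So basically, CONTAINS() against a full-text index will only work, for a search on either column, if some text is attached to the integer strings.
My question is: why can't I use strings of space-delimited integers directly, which would save me from having to add dummy text just to trick SQL Server into letting me run full-text searches against indexed integer strings?
Thanks in advance!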
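This is not really a question. There are no details about the query you're trying to run or the schema you're running it against. I don't quite know what to tell you here. I might be able to help if some details were available. It reads more like a complaint than a question.
I'm fully aware that this belongs in the comments and not in an answer, but I don't have the points on Stack Overflow. I live on .dba.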
Updated with XML example, below
Your current design violates 1st normal form.
That, in itself, is okay. Over some years, I've inherited and had to maintain several systems that did so. I don't know why they were built that way. It doesn't really matter. They had to be maintained and the schedule wasn't always such that there was time for refactoring, testing and validation, not to mention doing so for the stack of apps that were built upon them.
Looking back now, though, I can easily spot the one attribute that they all shared. It was the absolute biggest barrier to optimizing and extending these systems: the underlying "relational" database violated 1st normal form. Virtually every technical "gotcha" encountered, virtually every performance problem, it was the root cause. Splitting strings. Creating a faux datatype system to validate them. Creating further delimited attributes to describe them. Creating special rules for each delimited "location" and having to implement an EVAL function in many systems to enforce them. Using dynamic SQL or worse to search it all. It took more "clever" programming to implement what seemed like conceptually simple features than I care to recollect.
Maybe your system is different. Maybe 40+ years of relational database research does not apply to your situation. For your sake, I truly hope so. The only problem is that you're using a relational database in a non-relational way. Just like you can pound screws with a hammer, and you can pull a boat with a motorcycle (don't hit the brakes if you actually get it going), you can create an index (full-text or b-tree) on text that represents integers.
But why would you do any of these things? Why wouldn't you actually store the integers as integers and enjoy type-safety? Why wouldn't you normalize this into two related tables to take advantage of smaller transactions and more indexing options? If you've inherited a system that you can't change, then please say so and people might be able to help with alternatives (TVPs and XML have been rightfully mentioned). But I can't see coming into the situation saying that your hammer and motorcycle are broken because they don't drive screws and pull boats very well.
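For what it's worth, here is a minimal sketch of what "two related tables" could look like. The names (@Item, @ItemInt, ItemID, IntValue) are hypothetical and not from the question, and table variables are used only so the snippet runs on its own:

```sql
-- Hypothetical normalized layout; every name here is illustrative.
-- Table variables keep the sketch self-contained. Real tables would also
-- declare a foreign key from the child table to the parent.
declare @Item table (
    ItemID int not null primary key
)
declare @ItemInt table (
    ItemID int not null
  , IntValue int not null
  , primary key (IntValue, ItemID) -- keyed on the value, so equality searches can seek
)

insert into @Item select 1
insert into @Item select 2

-- One row per integer instead of one delimited string per item
insert into @ItemInt select 1, 1189
insert into @ItemInt select 1, 34
insert into @ItemInt select 1, 34209
insert into @ItemInt select 2, 1989
insert into @ItemInt select 2, 3
insert into @ItemInt select 2, 5

-- Find the items that contain 1189: a typed comparison, no string parsing
select i.ItemID
from @Item i
join @ItemInt v on v.ItemID = i.ItemID
where v.IntValue = 1189
```

With this shape the search is an ordinary equality predicate on an int column, so neither full-text indexing nor dummy text prefixes come into it.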
All that said (maybe somebody, somewhere is rethinking an ill-advised design), I've put LIKE to good use when searching delimited strings:
-- Setup demo data
declare @delimitedInts table (
data varchar(max) not null
)
insert into @delimitedInts select '0,1,2'
insert into @delimitedInts select '1,2,3,4'
insert into @delimitedInts select '5,10'
-- Create a search term
declare @searchTerm int = 2
-- Get all rows that contain the searchTerm
select data
from @delimitedInts
where ',' + data + ',' like '%,' + cast(@searchTerm as varchar(11)) + ',%'
-- Create many search terms
declare @searchTerms table (
searchTerm int not null primary key
)
insert into @searchTerms select 2
insert into @searchTerms select 3
insert into @searchTerms select 4
-- Get all rows that contain ANY of the searchTerms
select distinct a.data
from @delimitedInts a
join @searchTerms b on ',' + a.data + ',' like '%,' + cast(b.searchTerm as varchar(11)) + ',%'
-- Get all rows that contain ALL of the searchTerms
select a.data
from @delimitedInts a
join @searchTerms b on ',' + a.data + ',' like '%,' + cast(b.searchTerm as varchar(11)) + ',%'
group by a.data
having count(*) = (select count(*) from @searchTerms)
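The demo above uses commas, while the data in the question is space-delimited. A small adaptation, assuming single spaces between values and no leading or trailing spaces (the @spaceInts table and its rows are made up for illustration), also shows why both sides of the comparison are wrapped in the delimiter:

```sql
-- Made-up space-delimited demo data
declare @spaceInts table (
    data varchar(max) not null
)
insert into @spaceInts select '1 2'
insert into @spaceInts select '5 6 7 10'
insert into @spaceInts select '100 10 1000'

declare @term int = 1

-- Naive substring match: returns all three rows,
-- because '10', '100' and '1000' all contain the character '1'
select data
from @spaceInts
where data like '%' + cast(@term as varchar(11)) + '%'

-- Delimiter-wrapped match: returns only the row '1 2'
select data
from @spaceInts
where ' ' + data + ' ' like '% ' + cast(@term as varchar(11)) + ' %'
```

Padding the column with a delimiter on each end means the first and last values are matched by the same pattern as the ones in the middle.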
Is this too slow for you? Maybe. Have you actually measured it? At least you could get an implementation in place and prove that it works before you optimize it.
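One way to take that measurement, sketched here with the same tiny demo data, is to wrap the query in SET STATISTICS TIME and SET STATISTICS IO and read the figures from the Messages output. Three rows prove nothing about speed, so treat this as a template to point at a realistic copy of your own table:

```sql
-- Same demo data as above, repeated so the snippet runs on its own
declare @delimitedInts table (
    data varchar(max) not null
)
insert into @delimitedInts select '0,1,2'
insert into @delimitedInts select '1,2,3,4'
insert into @delimitedInts select '5,10'

declare @searchTerm int = 2

-- Report CPU/elapsed time and logical reads for the statements that follow
set statistics time on
set statistics io on

select data
from @delimitedInts
where ',' + data + ',' like '%,' + cast(@searchTerm as varchar(11)) + ',%'

set statistics io off
set statistics time off
```

Keep in mind that the pattern starts with a wildcard and is applied to a computed expression, so the predicate is evaluated against every row. The numbers you see will grow with the size of the table.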
Update: XML
I've done a little testing on converting your space-delimited column to an XML column and querying it, including doing so with XML indexes. Unfortunately, you can't put an XML index on a computed column, so I'm using a trigger to keep an XML column automatically updated. Here are some interesting results (note the SQL comments):
-- Create a demo table
create table MyTable (
ID int not null primary key identity
, SpaceSeparatedInts varchar(max) not null
--, ComputedIntsXml as cast('<ints><i>' + replace(SpaceSeparatedInts, ' ', '</i><i>') + '</i></ints>' as xml) persisted -- Can't use XML index
, IntsXml xml null
)
go
-- Create trigger to update IntsXml
create trigger MyTable_Trigger on MyTable after insert, update as begin
update m
set m.IntsXml = cast('<ints><i>' + replace(m.SpaceSeparatedInts, ' ', '</i><i>') + '</i></ints>' as xml)
from MyTable m
join inserted i on m.ID = i.ID
end
go
-- Add some demo data
insert into MyTable (SpaceSeparatedInts) select '1'
insert into MyTable (SpaceSeparatedInts) select '1 2'
insert into MyTable (SpaceSeparatedInts) select '2 3 4'
insert into MyTable (SpaceSeparatedInts) select '5 6 7 10'
insert into MyTable (SpaceSeparatedInts) select '100 10 1000'
go
-- Search for the number 10 (and use this same query in subsequent testing, below)
select *
from MyTable
where IntsXml.exist('/ints/i[. = "10"]') = 1
-- This query spends virtually all of its time running an XML Reader and an XPath filter
-- Add a primary xml index
create primary xml index IX_MyTable_IntsXml on MyTable (IntsXml)
-- The query now uses a clustered index scan and clustered index seek on PrimaryXML
-- Add secondary xml index for value
create xml index IX_MyTable_IntsXml_Value on MyTable (IntsXml) using xml index IX_MyTable_IntsXml for value
-- No change
-- Add secondary xml index for path
create xml index IX_MyTable_IntsXml_Path on MyTable (IntsXml) using xml index IX_MyTable_IntsXml for path
-- No change
-- Add secondary xml index for property
create xml index IX_MyTable_IntsXml_Property on MyTable (IntsXml) using xml index IX_MyTable_IntsXml for property
-- The query now replaces the clustered index scan on PrimaryXML with an index seek on SecondaryXML
While it is clearly a different method, is this faster than LIKE? You have to test in your environment. Hopefully this will give you some ideas of how to do so. Please let me know how this works out for you, if it's doable in your shop.
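If the XML column does work out for you, the nodes() and value() methods can also shred it back into typed int rows, which lets you compare values as integers rather than as text. This is only a sketch: @demo is a made-up table variable standing in for MyTable, and the XML is built with the same replace() expression as the trigger above.

```sql
-- Made-up stand-in for MyTable so the snippet runs on its own
declare @demo table (
    ID int not null primary key
  , IntsXml xml not null
)
insert into @demo select 1, cast('<ints><i>' + replace('2 3 4', ' ', '</i><i>') + '</i></ints>' as xml)
insert into @demo select 2, cast('<ints><i>' + replace('5 6 7 10', ' ', '</i><i>') + '</i></ints>' as xml)
insert into @demo select 3, cast('<ints><i>' + replace('100 10 1000', ' ', '</i><i>') + '</i></ints>' as xml)

-- One output row per <i> element, with the value converted to int
select d.ID, x.i.value('.', 'int') as IntValue
from @demo d
cross apply d.IntsXml.nodes('/ints/i') as x(i)

-- Rows that contain the integer 10 (IDs 2 and 3 in this demo data)
select distinct d.ID
from @demo d
cross apply d.IntsXml.nodes('/ints/i') as x(i)
where x.i.value('.', 'int') = 10
```

The shredded result has the same shape as a normalized child table, so it can also serve as a stepping stone if you later get the chance to migrate the data.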
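I'm not sure I understand what you're looking for either, but if you want to store multiple values in one column, your best bet would be to use XML.
See this post for more information on the concept:
Querying XML columns in SQL Server 2005
Source: Why is there no delimited, integer-only cataloging option in SQL Server? [closed]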