Hi, I'm stuck with an issue. I have a field splited_data with the field type text_split, defined in my schema.xml as follows:
<field name="splited_data" type="text_split" indexed="true" stored="false" />

<fieldType name="text_split" class="solr.TextField" autoGeneratePhraseQueries="true" omitNorms="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1" preserveOriginal="1" splitOnNumerics="1" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.KStemFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.KStemFilterFactory" />
  </analyzer>
</fieldType>
I index the field splited_data with the value "Layer Hybrid Case Black iPhone 5C".
After indexing, I try different queries (with the simple Lucene parser), and these are the results:
q=splited_data:"iphone 5c"
=> 1 result is found (desired).
q=splited_data:"black iphone 5c"
=> No result is found (not desired).
This has something to do with the capitalization in iPhone, but I am not sure what. Please help. I'm using Lucene 4.3. Let me know if I need to provide any other information.
Update: I found the problem, but I'm not sure how to handle it. The problem is the positions of the tokens generated by WordDelimiterFilterFactory:
black - position: 4
iphone - position: 5
i - position: 5
phone - position: 6
5c - position: 7
So when I search for "black iphone 5c", it finds black at position 4 and iphone at position 5, but nothing matches at position 6. Ideally, instead of looking at position 6, it should match 5c directly at position 7. Is there any way to specify this in a phrase query?
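To make the mismatch concrete, here is a rough Python sketch of exact (zero-slop) phrase matching over the positions listed in the update. It is only an illustration, not Lucene's actual implementation: the function name `exact_phrase_match` and the `index_positions` dictionary are hypothetical, and the positions are copied from the listing above, which may not show every token the index analyzer emits. For that reason it only models the three-term query.

```python
# Token positions as listed in the update (possibly an incomplete view
# of what the index analyzer really produces).
index_positions = {
    "black": [4],
    "iphone": [5],
    "i": [5],
    "phone": [6],
    "5c": [7],
}


def exact_phrase_match(terms, positions):
    """Return True if the terms occur at strictly consecutive positions."""
    first, *rest = terms
    for start in positions.get(first, []):
        # Each following term must sit exactly one position after the previous one.
        if all(start + offset in positions.get(term, [])
               for offset, term in enumerate(rest, start=1)):
            return True
    return False


# "black" is at 4 and "iphone" at 5, but "5c" is at 7 rather than 6,
# so the three-term phrase does not line up.
print(exact_phrase_match(["black", "iphone", "5c"], index_positions))

# The two-term prefix does line up (positions 4 and 5).
print(exact_phrase_match(["black", "iphone"], index_positions))
```

Under these assumed positions the first call reports no match and the second reports a match, which mirrors the behaviour described for the "black iphone 5c" query.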