I have an issue with my Solr settings. A search for "canaDa" through the select handler does NOT return the results that a search for "canada" does.
Here is the schema for the fieldType text_en_splitting (all of these settings are important):
<fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" enablePositionIncrements="true"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" />
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" enablePositionIncrements="true" />
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1" preserveOriginal="1" />
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
</fieldType>
Here are the solrconfig settings for the select handler:
<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">20</int>
<str name="df">text</str>
<str name="defType">edismax</str>
<str name="qf">court_id^0.1 jurisdiction^1.0 jur_code^0.5 court_name^1.5 court_code^0.5 court_type^1.0</str>
<str name="mm">80%</str>
<str name="q.alt">*:*</str>
<str name="fl">*</str>
</lst>
Here is the Query Analysis tool of the Solr admin:
As you can see, Query Analysis does split "canaDa", but the search can't find it...
The behavior you are seeing here is correct given the way the text_en_splitting fieldType is configured. With this configuration, the only way "canaDa" is going to match is if the indexed term is also "canaDa", because then both will be split into "cana" and "da". If you want "canaDa" to match "canada", I would suggest disabling the splitOnCaseChange option (setting it to 0 instead of 1) in the WordDelimiterFilterFactory, as this is what is causing the issue here.
If disabling the splitOnCaseChange setting is not an option, can you explain your requirements and expected behavior in more detail in the question so we can help you find a workable solution?
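As a rough sketch of that change, assuming every other part of the fieldType stays exactly as posted in the question, the two WordDelimiterFilterFactory lines might look like the following. Since splitOnCaseChange defaults to 1, the attribute is set to 0 explicitly here rather than dropped.

```xml
<!-- Inside <analyzer type="index">: same attributes as in the question, case-change splitting turned off -->
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0" />

<!-- Inside <analyzer type="query">: same attributes as in the question, case-change splitting turned off -->
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0" preserveOriginal="1" />
```

Because the index analyzer changes as well, documents that are already indexed would need to be reindexed before the new behavior shows up consistently in search results.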