Solr suggest - How to define solr suggest as case insensitive

Posted 2019-05-07 14:05

Question:

My suggester (built on the spellcheck component) is returning case-sensitive results. I use it for autocomplete, and "dog" and "Dog" return different phrases.

My suggester is defined as follows in solrconfig.xml:

<searchComponent class="solr.SpellCheckComponent" name="suggest">
    <lst name="spellchecker">
        <str name="name">suggest</str>
        <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
        <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
        <str name="field">suggest</str>  <!-- the indexed field to derive suggestions from -->
        <float name="threshold">0.005</float>
        <str name="buildOnCommit">true</str>
        <!--<str name="sourceLocation">american-english</str>-->
    </lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
    <lst name="defaults">
        <str name="spellcheck">true</str>
        <str name="spellcheck.dictionary">suggest</str>
        <str name="spellcheck.onlyMorePopular">true</str>
        <str name="spellcheck.count">5</str>
        <str name="spellcheck.collate">true</str>
    </lst>
    <arr name="components">
        <str>suggest</str>
    </arr>
</requestHandler>

In schema.xml:

<field name="suggest" type="phrase_suggest" indexed="true" stored="true" required="false" multiValued="true"/>  

and

<copyField source="Name" dest="suggest"/>

and

<fieldtype name="phrase_suggest" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="([^\p{L}\p{M}\p{N}\p{Cs}]*[\p{L}\p{M}\p{N}\p{Cs}\_]+:)|([^\p{L}\p{M}\p{N}\p{Cs}])+"
            replacement=" " replace="all"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldtype>

Answer 1:

For this to work, you need to add the field type to the search component declaration in solrconfig.xml. In this case it is "phrase_suggest", but match it to whatever fieldtype you have created in schema.xml that has LowerCaseFilterFactory declared.

<searchComponent class="solr.SpellCheckComponent" name="suggest">
    <lst name="spellchecker">
        <str name="name">suggest</str>
        <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
        <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
        <str name="field">suggest</str>  <!-- the indexed field to derive suggestions from -->
        <float name="threshold">0.005</float>
        <str name="buildOnCommit">true</str>

        <!-- THIS IS THE LINE TO ADD -->
        <str name="suggestAnalyzerFieldType">phrase_suggest</str>

    </lst>
</searchComponent>
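
For context, suggestAnalyzerFieldType is the parameter name used by the newer solr.SuggestComponent with its analyzing lookups, which newer Solr releases ship alongside the spellcheck-based Suggester shown above. A minimal sketch of that style of configuration follows, assuming a Solr version that includes SuggestComponent and reusing the question's suggest field and phrase_suggest type. The suggester name and the choice of AnalyzingLookupFactory are illustrative assumptions, not something the original poster used.

```xml
<!-- Sketch only: assumes a Solr version that ships solr.SuggestComponent -->
<searchComponent name="suggest" class="solr.SuggestComponent">
    <lst name="suggester">
        <!-- "mySuggester" is a hypothetical name chosen for this example -->
        <str name="name">mySuggester</str>
        <!-- An analyzing lookup runs the input through the field type below -->
        <str name="lookupImpl">AnalyzingLookupFactory</str>
        <str name="dictionaryImpl">DocumentDictionaryFactory</str>
        <str name="field">suggest</str>
        <!-- Field type from the question's schema.xml; it lowercases its input -->
        <str name="suggestAnalyzerFieldType">phrase_suggest</str>
        <str name="buildOnCommit">true</str>
    </lst>
</searchComponent>
```

With this component the request handler would use the suggest.* parameters (for example suggest=true and suggest.dictionary=mySuggester) rather than the spellcheck.* ones.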


Answer 2:

Actually, the correct config parameter is "queryAnalyzerFieldType", and it has to go outside the list element, like so:

<searchComponent class="solr.SpellCheckComponent" name="suggest">
    <lst name="spellchecker">
        <str name="name">suggest</str>
        <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
        <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
        <str name="field">suggest</str>  <!-- the indexed field to derive suggestions from -->
        <float name="threshold">0.005</float>
        <str name="buildOnCommit">true</str>

    </lst>
    <!-- Make it case-insensitive -->
    <str name="queryAnalyzerFieldType">text_general</str>
</searchComponent>

This works for spelling correction as well as suggestions.
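
The example above points at text_general, which only helps if that type exists in your schema and lowercases its input. As a variation, and assuming you want query-time analysis to match the index-time analysis of the suggest field, the question's own phrase_suggest type could be referenced instead. This is an untested sketch, not a confirmed fix:

```xml
<searchComponent class="solr.SpellCheckComponent" name="suggest">
    <lst name="spellchecker">
        <str name="name">suggest</str>
        <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
        <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
        <str name="field">suggest</str>
        <float name="threshold">0.005</float>
        <str name="buildOnCommit">true</str>
    </lst>
    <!-- Assumption: reuse the lowercasing field type defined in the question's schema.xml -->
    <str name="queryAnalyzerFieldType">phrase_suggest</str>
</searchComponent>
```

Either way, the parameter sits as a direct child of searchComponent, after the closing </lst>, rather than inside the spellchecker list.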



Answer 3:

Try changing the order of the filter factories in the fieldType, placing LowerCaseFilterFactory at the top of the filter list.
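
Applied to the question's phrase_suggest type, the reordering might look like the sketch below. It only moves the existing LowerCaseFilterFactory line ahead of the other filters; nothing else is changed. Note that the pattern uses Unicode classes such as \p{L}, which match letters regardless of case, so for this particular pattern the reordering may not change the output.

```xml
<fieldtype name="phrase_suggest" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Lowercase first, so every later filter sees lowercased text -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="([^\p{L}\p{M}\p{N}\p{Cs}]*[\p{L}\p{M}\p{N}\p{Cs}\_]+:)|([^\p{L}\p{M}\p{N}\p{Cs}])+"
            replacement=" " replace="all"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldtype>
```

After any change to the analysis chain, the documents need to be reindexed and the suggester rebuilt before the new behavior shows up.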

Shishir