Wrong spell-check suggestions by Solr

Posted 2019-01-29 06:49

Working on Spell Suggest with Solr 4.1.

We configured it correctly and Solr offers term as well as collation suggestions. However, we noticed that in many cases the suggested word or collation returns no results when we search for it.

For example, we searched for the term "confort" and got no results, with two suggestions: "comfort" and "convert". The first term returns results. The second does not: searching for "convert" returns nothing and instead suggests two more terms, "connect" and "content". Of these, "connect" returns a few results, but "content" returns none and in turn suggests "connect" and "continent". Again, "continent" returns no results and only suggests "connect".

The same happens for many search terms and even for collations. We have no idea what is causing this. Can we turn off suggestions that don't return any results?

My Solr Config

<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <str name="df">Name</str>
      <str name="spellcheck.dictionary">default</str>
      <str name="spellcheck.dictionary">wordbreak</str>
      <str name="spellcheck">on</str>
      <str name="spellcheck.extendedResults">true</str>       
      <str name="spellcheck.count">10</str>
      <str name="spellcheck.alternativeTermCount">5</str>
      <str name="spellcheck.maxResultsForSuggest">5</str>       
      <str name="spellcheck.collate">true</str>
      <str name="spellcheck.collateExtendedResults">true</str>  
      <str name="spellcheck.maxCollationTries">10</str>
      <str name="spellcheck.maxCollations">5</str>         
    </lst>
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
</requestHandler>

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">text</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">Name</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
    <int name="maxEdits">2</int>
    <int name="minPrefix">1</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">4</int>
    <float name="maxQueryFrequency">0.01</float>
  </lst>

  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">Name</str>
    <str name="combineWords">true</str>
    <str name="breakWords">false</str>
    <int name="maxChanges">10</int>
  </lst>
</searchComponent>

My Schema:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.LowerCaseFilterFactory"/>   
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="Name" type="text" indexed="true" stored="true"  required="false" />

My Query : http://localhost:8983/solr/mycore/spell?q=confort&spellcheck=true&Collate=true&spellcheck.extendedResults=true

Result:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">16</int>
  </lst>
  <result name="response" numFound="0" start="0"/>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="confort">
        <int name="numFound">2</int>
        <int name="startOffset">0</int>
        <int name="endOffset">7</int>
        <int name="origFreq">0</int>
        <arr name="suggestion">
          <lst>
            <str name="word">comfort</str>
            <int name="freq">6</int>
          </lst>
          <lst>
            <str name="word">convert</str>
            <int name="freq">2</int>
          </lst>
        </arr>
      </lst>
      <bool name="correctlySpelled">false</bool>
    </lst>
  </lst>
</response>

2 Answers
Anthone
#2 · 2019-01-29 07:10

Is spell checking enabled on the same field that you search on, and do both go through the same analysis?
One possible reason is that the fields are different, so the suggestions drawn from the spellcheck field do not exist in the fields being searched.
Another is that the fields are analyzed differently, so the spell suggestions and the search do not match.
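
One way to rule out both causes is to point the search and the spellchecker at the same field and to analyze the spellcheck query with that field's type. The fragment below is only a sketch under assumptions: the `/select` handler is hypothetical (the question shows only `/spell`), while the `Name` field and the `text` type are taken from the question's schema.

```xml
<!-- solrconfig.xml: hypothetical /select handler; the question only shows /spell -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Search the same field that the spellchecker reads its terms from -->
    <str name="df">Name</str>
  </lst>
</requestHandler>

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <!-- Analyze the misspelled query with the field type used by the Name field -->
  <str name="queryAnalyzerFieldType">text</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <!-- Same field as df above, so suggestions are drawn from the searched field -->
    <str name="field">Name</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
  </lst>
</searchComponent>
```

If the real search handler queries other fields (for example through `qf` or a different `df`), a suggested term can exist in `Name` and still return no hits there.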

聊天终结者
#3 · 2019-01-29 07:10

You said in a comment that you are getting suggestions from the index, but your configuration is not set up that way.

<str name="classname">solr.DirectSolrSpellChecker</str>

Change the above to this:

<str name="classname">solr.IndexBasedSpellChecker</str>
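
For context, a fuller version of that spellchecker entry might look like the sketch below. `IndexBasedSpellChecker` builds a separate spelling index from the configured field, so it needs a directory and a build trigger. The `./spellchecker` path is an assumption, and the `DirectSolrSpellChecker`-specific settings from the question (`maxEdits`, `minPrefix`, `maxInspections`, `minQueryLength`, `maxQueryFrequency`) are left out here.

```xml
<lst name="spellchecker">
  <str name="name">default</str>
  <str name="field">Name</str>
  <str name="classname">solr.IndexBasedSpellChecker</str>
  <!-- Directory for the separate spelling index; this path is an assumption -->
  <str name="spellcheckIndexDir">./spellchecker</str>
  <float name="accuracy">0.5</float>
  <!-- Rebuild the spelling index on each commit so it tracks the main index -->
  <str name="buildOnCommit">true</str>
</lst>
```

Without `buildOnCommit`, the spelling index has to be built explicitly, for example by sending a request with `spellcheck.build=true`.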