Training solr to recognize nicknames or name varia

Posted 2019-04-02 08:55

I'm pretty sure that solr can be set up to recognize synonyms during searches. I'm wondering if it's possible to do the same with nicknames -- so for example searches for "Robert" would pull up records with "Bob" in them.

1 answer
我命由我不由天
#2 · 2019-04-02 09:31

Just found a page where someone named Jon Moniaci explains exactly how to do this: http://bitsandpieces.jonmoniaci.com/2010/05/searching-common-nicknames-in-solr/

Basically, create a synonyms file with lines like so:

Bob, Robert, Bobby

(Jon's file is here, derived from the listing of common male and female nicknames on http://usefulenglish.ru/)

Save to english_names.txt and add the following to your solr configuration:

<fieldType name="textEnglishName" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="english_names.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
</fieldType>

Then declare the field you want searched this way (name in this example) as a textEnglishName field:

<fields>
  <field name="name" type="textEnglishName" indexed="true" stored="false"/>
</fields>
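
For intuition about what the query-side SynonymFilterFactory does with a line like "Bob, Robert, Bobby", here is a rough Python sketch. It is not Solr code: load_synonyms and expand_query are made-up helper names, and it only assumes the simple comma-separated format with ignoreCase="true" and expand="true" as configured above. It ignores multi-word synonyms and "=>" mappings.

```python
# Rough simulation of query-time synonym expansion (NOT Solr code).
# Assumption: comma-separated equivalent terms, case-insensitive matching,
# and every term on a line expanding to all terms on that line.

def load_synonyms(lines):
    """Map each lowercased term to the full set of terms on its line."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and comments
            continue
        group = {term.strip().lower() for term in line.split(",") if term.strip()}
        for term in group:
            table.setdefault(term, set()).update(group)
    return table


def expand_query(query, table):
    """For each whitespace token, return the sorted list of terms to match."""
    expanded = []
    for token in query.split():
        key = token.lower()
        # Tokens with no synonym entry pass through unchanged (lowercased).
        expanded.append(sorted(table.get(key, {key})))
    return expanded


synonyms = load_synonyms(["Bob, Robert, Bobby"])
print(expand_query("Robert Smith", synonyms))
# prints [['bob', 'bobby', 'robert'], ['smith']]
```

So a search for "Robert" ends up matching records indexed with "bob" or "bobby" as well. Because the synonym filter sits only in the query analyzer in the config above, editing english_names.txt should not require reindexing.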