Solr wildcard searching

Posted 2020-04-05 07:38

Question:

If I have a record with the keywords Chris Muench, I want to be able to match Mue or Chr. How can I do this with a Solr query? Currently I do the following:

$results = $solr->search('"'.Apache_Solr_Service::escape($_GET['textsearch']).'"~100', 0, 100, array('fq' => 'type:datacollection'));

It doesn't match Mue or Chr, but it does match Muench.

Schema:

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="rocdocs" version="1.4">
  <types>
    <!-- The StrField type is not analyzed, but indexed/stored verbatim. -->
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
 </types>


 <fields>
    <field name="type" type="string" indexed="true" stored="true" required="true" />
    <field name="mongo_id" type="string" indexed="true" stored="true" required="true" />
    <field name="nid" type="int" indexed="true" stored="true" required="true" />
    <field name="keywords" type="text_general" indexed="true" stored="false" />
 </fields>

 <!-- Field to use to determine and enforce document uniqueness. 
      Unless this field is marked with required="false", it will be a required field
   -->
 <uniqueKey>mongo_id</uniqueKey>

 <!-- field for the QueryParser to use when an explicit fieldname is absent -->
 <defaultSearchField>keywords</defaultSearchField>
 <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
 <solrQueryParser defaultOperator="OR"/>
</schema>

Answer 1:

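You have two options. The first is to use wildcard (prefix) queries, e.g. chr* or mue*, either of which would match this record.
This requires either the client to enter the query in that format or the application to append the wildcard before sending the query. Note that the standard query parser does not expand wildcards inside a quoted phrase, so the term has to be sent unquoted.
The second is to generate prefix tokens at index time using solr.EdgeNGramFilterFactory. For example, with a minimum gram size of 2, chris would be indexed as ch, chr, chri and chris, so a query for any of these prefixes would match the record.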
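
As a rough sketch of the EdgeNGram approach, a field type along the following lines could be added inside the <types> section of the schema from the question. The name text_prefix and the gram sizes 2 and 15 are assumptions chosen for illustration, not values from the question, and the stop-word and synonym filters of text_general are left out for brevity.

```xml
<!-- Hypothetical field type; the name "text_prefix" and the gram sizes are illustrative assumptions. -->
<fieldType name="text_prefix" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercase first so the generated prefixes are case-insensitive. -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Emits leading substrings of each token: "chris" becomes ch, chr, chri, chris. -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- No n-gram filter here: the query term is matched as typed against the indexed prefixes. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The keywords field would then be declared with type="text_prefix" and the documents reindexed, since the prefixes are only produced when a document is indexed. Keeping the n-gram filter out of the query analyzer means a search for muench is not itself broken down into mu, mue and so on, which would otherwise match unrelated records sharing a short prefix.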



Tags: solr