Why is my Solr query not working on whitespace?

Posted 2019-09-01 07:20

Question:

I am a beginner in Solr. I have the following documents indexed in a collection on my Solr server:

{
        "id": "book5",
        "title": [
          "Five point someone"
        ],
        "author": "Chetan Bagat",
        "genere": "fantasy",
        "description": [
          "An iit guide"
        ],
        "comments": [
          "good",
          "excellent"
        ],
        "publications": [
          "swapnapublications",
          "pb publications"
        ]
      }

and

{
        "id": "book1",
        "title": [
          "nightatcallcenter"
        ],
        "author": "ChetanBagat",
        "genere": "fiction",
        "description": [
          "Aniitguide"
        ],
        "comments": [
          "good",
          "excellent"
        ],
        "publications": [
          "bangalorepublications",
          "aswinpublications"
        ]
      }

My query q=Five +point+someone fails, but my query q=nightatcallcenter works. Why is that, and how can I make the first query work?

My schema:

<fields>
    <field name="id" type="text_general" indexed="true" stored="true" required="true" multiValued="false" />
    <field name="title" type="text_general" indexed="true" stored="true" multiValued="true"/>
    <field name="genere" type="text_general" indexed="true" stored="true"/>
    <field name="description" type="text_general" indexed="true" stored="true" multiValued="true"/>
    <field name="comments" type="text_general" indexed="true" stored="true" multiValued="true"/>
    <field name="author" type="text_general" indexed="true" stored="true" />
    <field name="publications" type="text_general" indexed="true" stored="true" multiValued="true" />
    <copyField source='*' dest='fulltext'/>
    <field name='fulltext' type='text_general' multiValued='true'/>
</fields>

Answer 1:

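The problem is that, with your current text_general field type, each field value appears to be indexed as a single token, which is why a value without spaces such as nightatcallcenter still matches. When you search for Five +point+someone, you are looking for three separate tokens: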

  1. Five
  2. point
  3. someone
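
The mismatch described above can be illustrated with a small Python sketch. This is a simplified model and not Solr's actual implementation: the function names are hypothetical, matching is plain string equality, and the query terms are assumed to be combined with OR.

```python
def keyword_tokens(text):
    """Treat the whole field value as one token (no splitting)."""
    return [text]


def whitespace_tokens(text):
    """Split the field value on runs of whitespace."""
    return text.split()


def matches(query, indexed_tokens):
    """True if any whitespace-separated query term equals an indexed token."""
    return any(term in indexed_tokens for term in query.split())


title = "Five point someone"
query = "Five point someone"

# Indexed as one token: none of the three query terms equals the full value.
single = matches(query, keyword_tokens(title))

# Indexed as three tokens: each query term has an exact counterpart.
split = matches(query, whitespace_tokens(title))

# A value without spaces is one token either way, so it matches regardless.
no_space = matches("nightatcallcenter", keyword_tokens("nightatcallcenter"))

print(single, split, no_space)  # False True True
```

In this model the title only becomes searchable word by word once it is split at index time, which is what a whitespace tokenizer does.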

A clean solution is to define a custom text_general field type that uses a whitespace tokenizer, like this:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
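
As a rough illustration of what the index analyzer above does to a field value, the chain can be modeled in Python. This is only a sketch under stated assumptions: the stop word set is hypothetical (the real list lives in stopwords.txt), the function name is made up, and the query-time synonym filter is left out.

```python
# Hypothetical stop words; the real ones come from stopwords.txt.
STOPWORDS = {"a", "an", "the"}


def analyze(text, stopwords=STOPWORDS):
    """Model the chain: whitespace tokenizer, stop filter, lowercase filter."""
    tokens = text.split()  # WhitespaceTokenizerFactory
    # StopFilterFactory with ignoreCase="true": compare case-insensitively.
    tokens = [t for t in tokens if t.lower() not in stopwords]
    return [t.lower() for t in tokens]  # LowerCaseFilterFactory


indexed = analyze("Five point someone")
queried = analyze("five POINT")

print(indexed)  # ['five', 'point', 'someone']
print(queried)  # ['five', 'point']
```

Because both the indexed value and the query text pass through the same lowercasing step, a query such as five POINT produces tokens that line up with the indexed ones regardless of case.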


Answer 2:

Thanks @alexf, the tokenizer worked perfectly:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>



Tags: solr