Special characters (-, &, +, etc.) not working in Solr

Posted 2019-06-26 05:35

Question:

I'm using the "text_general" fieldType for searching in Solr. When I search with special characters, I get errors instead of proper results. I would like to be able to use special characters such as these:

  1. -
  2. &
  3. +

Query

  1. solr?q=Healing - Live

  2. solr?q=Healing & Live

Error message

The request sent by the client was syntactically incorrect (org.apache.lucene.queryParser.ParseException: Cannot parse '("Healing \': Lexical error at line 1, column 8. Encountered: after : "\"Healing \").

schema.xml

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>               
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.ASCIIFoldingFilterFactory" />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.ASCIIFoldingFilterFactory" />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>


<field name="title" type="text_general" indexed="true" stored="true" />

<field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>

<defaultSearchField>text</defaultSearchField>

<copyField source="title" dest="text"/>

Answer 1:

You need to escape your query, since the dash is a special character in Lucene queries. The Lucene query parser syntax documentation lists the other characters that need escaping and describes the query syntax in more detail.

Your query would then look like this: solr?q=Healing \- Live

I don't know which language you are writing your code in, but if you are using Java, SolrJ provides the ClientUtils#escapeQueryChars method.
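
If SolrJ is not on your classpath, the same idea can be sketched in plain Java. This is only a minimal stand-in and not SolrJ's actual implementation: the class name SolrQueryEscaper, its two methods, and the exact set of escaped characters are assumptions made for illustration. Unlike a stricter escaper, it leaves whitespace alone so that the result matches the "Healing \- Live" form above. It also URL-encodes the result, because in a raw URL & separates parameters and + is decoded as a space, so both have to be percent-encoded before they even reach the query parser.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Solr or SolrJ.
final class SolrQueryEscaper {

    // Characters treated here as query-syntax characters (an assumed set).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    // Prefix each special character with a backslash; whitespace is kept as-is.
    static String escape(String input) {
        StringBuilder sb = new StringBuilder(input.length() * 2);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Escape for the query parser first, then percent-encode for the URL.
    static String toQueryParam(String rawUserInput) {
        return "q=" + URLEncoder.encode(escape(rawUserInput), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(escape("Healing - Live"));       // Healing \- Live
        System.out.println(toQueryParam("Healing - Live")); // q=Healing+%5C-+Live
        System.out.println(toQueryParam("Healing & Live")); // q=Healing+%5C%26+Live
    }
}
```

The order matters: the backslash escaping is for the query parser, while the percent-encoding is for the HTTP layer, which decodes the parameter before the parser ever sees it.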



Answer 2:

For a Solr search built on Solarium, you can strip the special characters from the query parameter before the search runs:

app\code\local\Module\Solarium\controllers\AjaxController.php

function suggestAction()
{

    //get comp from http://[MAGE-ROOT]/solarium/ajax/suggest/?q=comp
    $comp = $this->getRequest()->getParam('q',false);

    //remove special characters
    //(a literal backslash must be written as '\\' in a single-quoted PHP string)
    $special_characters = array('(',')','/','\\','&','!','.','-','+');
    $comp = str_replace($special_characters,'',$comp);

    //save q param
    $this->getRequest()->setParam('q',$comp);

    //existing code
    ...............

}


Answer 3:

StandardTokenizerFactory is the problem; you should use WhitespaceTokenizerFactory instead. This worked for me.



Answer 4:

Why don't you use AND, OR, and NOT instead of those special characters?

For example:

Healing NOT Live
Healing AND Live
Healing OR Live


Tags: solr