Manipulating query using Custom Query Parser in Solr

Posted 2019-11-04 16:57

I am trying to create a CustomQueryParser that uses the OpenNLP library.

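My goal is this: given a query such as "How many defective rims are causing failures in ABC tyres in China",

I want the final query to be "defective rims tyres failures China", which would then go to the analyzer for further processing.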

Here is my QueryParserPlugin code:

package com.mycompany.lucene.search;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.QParser;
import org.apache.solr.search.QParserPlugin;
import com.mycompany.lucene.search.QueryParser;

public class QueryParserPlugin extends QParserPlugin {
  @Override
  public QParser createParser(String qstr, SolrParams localParams,
      SolrParams params, SolrQueryRequest req) {
    // The last argument is passed to QueryParser as defaultFieldName
    return new QueryParser(qstr, localParams, params, req, "body_txt_str");
  }
}

And here is my QueryParser code:

package com.mycompany.lucene.search;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.QParser;
import org.apache.solr.search.SyntaxError;
import opennlp.tools.postag.POSModel;
import opennlp.tools.postag.POSTaggerME;
import opennlp.tools.tokenize.Tokenizer;
import opennlp.tools.tokenize.TokenizerME;
import opennlp.tools.tokenize.TokenizerModel;

public class QueryParser extends QParser {

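  private String fieldName;

  public QueryParser(String qstr, SolrParams localParams, SolrParams params,
      SolrQueryRequest req, String defaultFieldName) {
    super(qstr, localParams, params, req);

    // localParams is null when the query carries no {!parser ...} local params
    fieldName = (localParams != null) ? localParams.get("field") : null;
    if (fieldName == null) {
      // Fall back to the request's df, then to the default passed by the plugin
      fieldName = params.get("df", defaultFieldName);
    }
  }
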
  @Override
  public Query parse() throws SyntaxError {
    Analyzer analyzer = req.getSchema().getQueryAnalyzer();

    // Load the tokenizer model; fail the parse instead of continuing with a null model
    TokenizerModel tokenModel;
    try (InputStream tokenModelIn = new FileInputStream("/Files/en-token.bin")) {
      tokenModel = new TokenizerModel(tokenModelIn);
    } catch (IOException e) {
      throw new SyntaxError("Could not load tokenizer model", e);
    }
    Tokenizer tokenizer = new TokenizerME(tokenModel);
    String[] tokens = tokenizer.tokenize(qstr);

    // Load the parts-of-speech model from its stream
    POSModel posModel;
    try (InputStream posModelIn = new FileInputStream("/Files/en-pos-maxent.bin")) {
      posModel = new POSModel(posModelIn);
    } catch (IOException e) {
      throw new SyntaxError("Could not load POS model", e);
    }
    // Initialize the parts-of-speech tagger with the model and tag the tokens
    POSTaggerME posTagger = new POSTaggerME(posModel);
    String[] tags = posTagger.tag(tokens);
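    // Keep adjectives (JJ) and nouns (NN, NNS); strings are compared with equals(), not ==
    StringBuilder finalQuery = new StringBuilder();
    for (int i = 0; i < tokens.length; i++) {
      if ("JJ".equals(tags[i]) || "NNS".equals(tags[i]) || "NN".equals(tags[i])) {
        if (finalQuery.length() > 0) {
          finalQuery.append(' ');
        }
        finalQuery.append(tokens[i]);
      }
    }
    TermQuery tq = new TermQuery(new Term(fieldName, finalQuery.toString()));
    return tq;
  }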
}
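
The tag filter in parse() can be tried in isolation, without Solr or the model files. The following is a minimal sketch, assuming the tokens and tags are already available as arrays; the class name PosFilterSketch, the method keepTagged, and the hand-written tags are hypothetical and only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone sketch of the tag filter; not part of Solr or OpenNLP.
class PosFilterSketch {

  // Penn Treebank tags to keep: adjective, singular noun, plural noun
  static final Set<String> KEEP = new HashSet<>(Arrays.asList("JJ", "NN", "NNS"));

  // Returns the tokens whose tag is in KEEP, joined by single spaces
  static String keepTagged(String[] tokens, String[] tags) {
    List<String> kept = new ArrayList<>();
    for (int i = 0; i < tokens.length; i++) {
      // Set.contains compares by value, unlike == on String references
      if (KEEP.contains(tags[i])) {
        kept.add(tokens[i]);
      }
    }
    return String.join(" ", kept);
  }

  public static void main(String[] args) {
    // Hand-written tags (assumption); a real tagger may label these tokens differently
    String[] tokens = {"How", "many", "defective", "rims", "in", "China"};
    String[] tags = {"WRB", "JJ", "JJ", "NNS", "IN", "NNP"};
    System.out.println(keepTagged(tokens, tags));
  }
}
```

With these tags, "China" is dropped because it carries the proper-noun tag NNP, so a query that should keep place names would probably need NNP added to the set.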

I then exported this as a jar and added these jars in my solrconfig.xml:

<lib dir="${solr.install.dir:../../../..}/contrib/customparser/lib" regex=".*\.jar" />
<lib dir="${solr.install.dir:../../../..}/contrib/analysis-extras/lib" regex="opennlp-.*\.jar" />

However, I get the following error:

Caused by: java.lang.NoClassDefFoundError: opennlp/tools/tokenize/Tokenizer
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:541)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:488)
    at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:786)
    at org.apache.solr.core.PluginBag.createPlugin(PluginBag.java:135)
    at org.apache.solr.core.PluginBag.init(PluginBag.java:271)
    at org.apache.solr.core.PluginBag.init(PluginBag.java:260)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:957)
    ... 9 more

This is my first time creating a CustomQueryParser. Could you please help me out?

Thanks

Answer 1:

Most likely the path

${solr.install.dir:../../../..}/contrib/analysis-extras/lib

does not contain the relevant opennlp jars, or the regex is not appropriate. This is the first thing to check.
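
One way to run that check is to list the directory and see which file names the regex actually selects. This is a minimal sketch, assuming the regex of a lib directive is matched against the whole file name and is case-sensitive; LibDirCheck and matchingNames are hypothetical names, not part of Solr.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: shows which file names a lib regex would select.
class LibDirCheck {

  // Returns the names that match the regex in full (case-sensitive)
  static List<String> matchingNames(String[] fileNames, String regex) {
    Pattern pattern = Pattern.compile(regex);
    List<String> matches = new ArrayList<>();
    for (String name : fileNames) {
      if (pattern.matcher(name).matches()) {
        matches.add(name);
      }
    }
    return matches;
  }

  public static void main(String[] args) {
    // args[0]: directory to inspect, args[1]: regex from the lib directive
    File dir = new File(args.length > 0 ? args[0] : ".");
    String regex = args.length > 1 ? args[1] : "opennlp-.*\\.jar";
    String[] names = dir.list();
    if (names == null) {
      System.out.println("Not a readable directory: " + dir);
      return;
    }
    List<String> hits = matchingNames(names, regex);
    System.out.println(hits.isEmpty()
        ? "No file matches " + regex + " in " + dir
        : "Matched: " + hits);
  }
}
```

Run it with the resolved analysis-extras/lib path as the first argument; an empty result would mean that directive contributes no opennlp jar to the classpath.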

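You must either bundle the opennlp dependencies into your custom query parser jar (for example, if you build the project with Maven, by using maven-assembly-plugin, maven-shade-plugin, etc.) or make sure the opennlp jars are matched by the relevant directive in your solrconfig.xml.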
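
If you go the bundling route, it may help to confirm that the built jar really contains the class named in the error. This is a minimal sketch using only the JDK; JarClassCheck, entryName, and jarContainsClass are hypothetical names, and the jar path is whatever your build produces.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper: checks whether a jar file contains a given class.
class JarClassCheck {

  // Converts a class name such as a.b.C to its entry path inside a jar: a/b/C.class
  static String entryName(String className) {
    return className.replace('.', '/') + ".class";
  }

  // Returns true if the jar at jarPath has an entry for className
  static boolean jarContainsClass(String jarPath, String className) throws IOException {
    try (JarFile jar = new JarFile(jarPath)) {
      return jar.getEntry(entryName(className)) != null;
    }
  }

  public static void main(String[] args) throws IOException {
    if (args.length < 1) {
      System.out.println("usage: JarClassCheck <jar> [class]");
      return;
    }
    // Default to the class reported in the NoClassDefFoundError
    String cls = args.length > 1 ? args[1] : "opennlp.tools.tokenize.Tokenizer";
    boolean found = jarContainsClass(args[0], cls);
    System.out.println(cls + (found ? " is" : " is NOT") + " in " + args[0]);
  }
}
```

The same check can be pointed at each jar in the lib directories if you prefer to rely on the solrconfig.xml directives instead of bundling.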



Source: Manipulating query using Custom Query Parser in Solr