How to search fields with wildcard and spaces in H

2019-03-27 18:14发布

问题:

I have a search box that performs a search on title field based on the given input, so the user has recommended all available titles starting with the text inserted.It is based on Lucene and Hibernate Search. It works fine until space is entered. Then the result disapear. For example, I want "Learning H" to give me "Learning Hibernate" as the result. However, this doesn't happen. could you please advice me what should I use here instead.

Query Builder:

QueryBuilder qBuilder = fullTextSession.getSearchFactory()
        .buildQueryBuilder().forEntity(LearningGoal.class).get();
  Query query = qBuilder.keyword().wildcard().onField("title")
        .matching(searchString + "*").createQuery();

  BooleanQuery bQuery = new BooleanQuery();
  bQuery.add(query, BooleanClause.Occur.MUST);
  for (LearningGoal exGoal : existingGoals) {
     Term omittedTerm = new Term("id", String.valueOf(exGoal.getId()));
     bQuery.add(new TermQuery(omittedTerm), BooleanClause.Occur.MUST_NOT);
  }
  @SuppressWarnings("unused")
  org.hibernate.Query hibQuery = fullTextSession.createFullTextQuery(
        query, LearningGoal.class);

Hibernate class:

@AnalyzerDef(name = "searchtokenanalyzer",tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
filters = {
  @TokenFilterDef(factory = StandardFilterFactory.class),
  @TokenFilterDef(factory = LowerCaseFilterFactory.class),
  @TokenFilterDef(factory = StopFilterFactory.class,params = { 
      @Parameter(name = "ignoreCase", value = "true") }) })
      @Analyzer(definition = "searchtokenanalyzer")
public class LearningGoal extends Node {

回答1:

I found workaround for this problem. The idea is to tokenize input string and remove stop words. For the last token I created a query using keyword wildcard, and for the all previous words I created a TermQuery. Here is the full code

    BooleanQuery bQuery = new BooleanQuery();
    Session session = persistence.currentManager();
    FullTextSession fullTextSession = Search.getFullTextSession(session);
    Analyzer analyzer = fullTextSession.getSearchFactory().getAnalyzer("searchtokenanalyzer");
    QueryParser parser = new QueryParser(Version.LUCENE_35, "title", analyzer);
    String[] tokenized=null;
    try {
    Query query=    parser.parse(searchString);
    String cleanedText=query.toString("title");
     tokenized = cleanedText.split("\\s");

    } catch (ParseException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }

    QueryBuilder qBuilder = fullTextSession.getSearchFactory()
            .buildQueryBuilder().forEntity(LearningGoal.class).get();
    for(int i=0;i<tokenized.length;i++){
         if(i==(tokenized.length-1)){
            Query query = qBuilder.keyword().wildcard().onField("title")
                    .matching(tokenized[i] + "*").createQuery();
                bQuery.add(query, BooleanClause.Occur.MUST);
        }else{
            Term exactTerm = new Term("title", tokenized[i]);
            bQuery.add(new TermQuery(exactTerm), BooleanClause.Occur.MUST);
        }
    }
        for (LearningGoal exGoal : existingGoals) {
        Term omittedTerm = new Term("id", String.valueOf(exGoal.getId()));
        bQuery.add(new TermQuery(omittedTerm), BooleanClause.Occur.MUST_NOT);
    }
    org.hibernate.Query hibQuery = fullTextSession.createFullTextQuery(
            bQuery, LearningGoal.class);


回答2:

SQL uses different wildcards than any terminal. In SQL '%' replaces zero or more occurrences of any character (in the terminal you use '*' instead), and the underscore '_' replaces exactly one character (in the terminal you use '?' instead). Hibernate doesn't translate the wildcard characters.

So in the second line you have to replace matching(searchString + "*") with

  matching(searchString + "%")