How to perform a wildcard search in Lucene

Posted 2019-04-14 03:26

I know that Lucene has extensive support for wildcard searches and I know you can search for things like:

Stackover* (which will return Stackoverflow)

That said, my users aren't interested in learning a query syntax. Can Lucene perform this type of wildcard search using an out-of-the-box Analyzer? Or should I append "*" to every search query?

3 Answers
男人必须洒脱
#2 · 2019-04-14 03:42

If you are considering turning every query into a wildcard, I would ask these questions first:

  1. Is Lucene the best tool for the job? By default, wildcard queries are rewritten to constant-score queries, which means you throw away relevance ranking completely: you are no longer "searching" but "matching". Perhaps a search engine library is not the best fit for your application, and another tool (e.g. a database) would serve better.
  2. If the answer to #1 is still "yes", I would recommend looking at the exact relevance problem you are trying to solve. For example, if you want queries to match compound or stemmed words, consider adding a decompounder or stemmer to your analysis chain. An n-gram indexing technique is another alternative.
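
To make the n-gram suggestion concrete, here is a minimal sketch of edge n-gram generation in plain C#, without Lucene. The names EdgeNGrams, Generate, minGram and maxGram are made up for illustration. In a real index this work would be done by a token filter in the analysis chain (Lucene's analysis module has an EdgeNGramTokenFilter for this purpose), not by hand.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of Lucene or Lucene.Net.
public static class EdgeNGrams
{
    // Returns the prefixes of `token` whose lengths fall in [minGram, maxGram].
    // If these prefixes are indexed as terms, a plain term query for "stack"
    // can match a document containing "stackoverflow" without any wildcard.
    public static List<string> Generate(string token, int minGram, int maxGram)
    {
        if (token == null) throw new ArgumentNullException(nameof(token));
        if (minGram < 1 || maxGram < minGram)
            throw new ArgumentOutOfRangeException(nameof(minGram));

        var grams = new List<string>();
        // Never ask for a prefix longer than the token itself.
        int upper = Math.Min(maxGram, token.Length);
        for (int len = minGram; len <= upper; len++)
        {
            grams.Add(token.Substring(0, len));
        }
        return grams;
    }
}
```

With minGram = 3 and maxGram = 5, the token "stackoverflow" yields sta, stac and stack. The trade-off is a larger index in exchange for ordinary, scored term queries at search time.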
神经病院院长
#3 · 2019-04-14 03:48

Doing this with string manipulations is tricky to get right, especially since the QueryParser supports boosting, phrases, etc.

You could use a QueryVisitor that rewrites TermQuery into PrefixQuery.

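// Rewrites every plain term query into a prefix query, so "stackover" behaves like "stackover*".
public class PrefixRewriter : QueryVisitor {
    protected override Query VisitTermQuery(TermQuery query) {
        var term = query.GetTerm();
        var newQuery = new PrefixQuery(term);
        // Carry over any boost the user put on the original term.
        return CopyBoost(query, newQuery);
    }
}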

The QueryVisitor class can be found in the blog post "A QueryVisitor for Lucene".

Update a few years later:

The blog post has long been a 404, but the source still lives! It can now be found on GitHub.

狗以群分
#4 · 2019-04-14 03:54

When I want to do something like that, I normally format the term before searching, e.g.:

searchTerm = QueryParser.Escape(searchTerm);
if(!searchTerm.EndsWith(" "))
{
    searchTerm = string.Format("{0}*", searchTerm);
}

This escapes any special characters people have put in and, if the term doesn't end with a space, appends a * to the end. The check is there because a * on its own would cause a parsing exception.
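
The snippet above adds a single * to the end of the whole input, so with several words only the last one becomes a prefix search. One possible variation, sketched below in plain C# without Lucene, treats each whitespace-separated word separately. WildcardQueryText, Escape and AppendWildcards are hypothetical names, and this Escape is only a stand-in for QueryParser.Escape: its character list is an assumption, so in real code you would probably call the library method instead.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only; not part of Lucene or Lucene.Net.
public static class WildcardQueryText
{
    // Characters treated as query syntax here (assumed list; a stand-in
    // for whatever QueryParser.Escape handles in your Lucene version).
    private const string Special = "\\+-!():^[]\"{}~*?|&/";

    // Prefixes every special character with a backslash.
    public static string Escape(string s)
    {
        var sb = new StringBuilder(s.Length * 2);
        foreach (char c in s)
        {
            if (Special.IndexOf(c) >= 0) sb.Append('\\');
            sb.Append(c);
        }
        return sb.ToString();
    }

    // Escapes each word and appends "*" to it. Blank input returns an
    // empty string, so a lone "*" is never produced.
    public static string AppendWildcards(string input)
    {
        if (string.IsNullOrWhiteSpace(input)) return string.Empty;

        var parts = new List<string>();
        // A null separator array makes Split break on any whitespace.
        foreach (string word in input.Split((char[])null, StringSplitOptions.RemoveEmptyEntries))
        {
            parts.Add(Escape(word) + "*");
        }
        return string.Join(" ", parts);
    }
}
```

For example, AppendWildcards("stack over") gives "stack* over*", and input consisting only of spaces gives an empty string that the caller can treat as "no query" rather than handing it to the parser.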
