Sunspot — Boost records where matches occur early

Posted 2019-06-23 23:11

For example, let's say there is a record in my DB that has the text "Hormel Corporation" and my search term is something like "Hormel Corned Beef 16 Ounces". As my current configuration stands, the top results will be other records, even though "Hormel Corporation" is the one I'm looking for. I think the solution to my problem would be to give priority to records where a match comes earliest in the search term. I've read all the docs, but I have had trouble figuring out how this might work.

I only have one field: name. The name field for the record I want reads "Hormel Corporation". However, when I search for "Hormel Corned Beef 16 Ounces", the top result is not "Hormel Corporation" but something seemingly random, while the record I'm looking for is 3rd or 4th in the results.

Thanks a lot!

2 Answers
等我变得足够好
#2 · 2019-06-23 23:43

I had a similar problem to solve. So I stored my data in many fields:

title
keywords (up to 10 words)
abstract (a paragraph)
text (as long as you like)

For querying, I used the dismax query parser over the fields with different weights:

title^20
keywords^20
abstract^12
text^1

So if you

  1. define your data schema well
  2. use dismax
  3. determine per-field weights for your queries

when you search "Hormel Corned Beef 16 Ounces", a result whose title is "Hormel Corp" will score better than a document whose body contains "...For the dish, we recommend a can of Hormel Corned Beef 16 Ounces..."
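As a rough illustration of the weighting above, here is a minimal plain-Ruby sketch that assembles the request parameters for a dismax query. defType and qf are standard Solr parameters; the constant DISMAX_FIELD_WEIGHTS and the helper dismax_params are names made up for this example, and nothing here goes through the Sunspot DSL itself.

```ruby
require "uri"

# Per-field weights copied from the answer above.
# DISMAX_FIELD_WEIGHTS is a hypothetical name used only in this sketch.
DISMAX_FIELD_WEIGHTS = {
  "title"    => 20,
  "keywords" => 20,
  "abstract" => 12,
  "text"     => 1
}.freeze

# Build the parameter hash for a Solr dismax request.
# The whole user query goes into q; qf lists each field with its boost.
def dismax_params(query, weights = DISMAX_FIELD_WEIGHTS)
  {
    "defType" => "dismax",
    "q"       => query,
    "qf"      => weights.map { |field, boost| "#{field}^#{boost}" }.join(" ")
  }
end

params = dismax_params("Hormel Corned Beef 16 Ounces")
puts params["qf"]                   # title^20 keywords^20 abstract^12 text^1
puts URI.encode_www_form(params)    # query string to append to the select URL
```

The weights are just the ones from the answer; in practice they would need tuning against your own data.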


Edit in response to the OP's comments.

The OP's point is that, within a title, the first n words matter more than the rest.

I suggest a data model with two fields: title_first_words and title. The client application (sorry, you cannot directly use DIH, the DataImportHandler) will have to extract the first n words from the title and store them in title_first_words, while the full title is stored in title.

For searching, you can give the entire query to the dismax parser. The query parser is then biased toward title_first_words with weights such as title_first_words^4 title^1. Thus the first n words will have a bigger impact for a given search.
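The client-side extraction step could look something like the sketch below. It is plain Ruby and only shows the idea: first_words and build_document are hypothetical helpers, the choice of n is an assumption you would tune, and how the resulting hash gets indexed (Sunspot, a raw Solr update, etc.) is left out.

```ruby
# Return the first n whitespace-separated words of a title.
def first_words(title, n)
  title.split.first(n).join(" ")
end

# Build the document to index: the full title plus its leading words.
# The default n = 2 is an arbitrary choice for this sketch.
def build_document(title, n = 2)
  {
    "title"             => title,
    "title_first_words" => first_words(title, n)
  }
end

# qf value that biases matching toward the leading words,
# using the weights suggested in the answer.
def leading_words_qf
  "title_first_words^4 title^1"
end

doc = build_document("Hormel Corporation", 1)
puts doc["title_first_words"]   # Hormel
puts leading_words_qf           # title_first_words^4 title^1
```

With this layout, a query term that matches the start of a title hits both fields, so it contributes more to the score than a term that only matches later in the title.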

Fickle 薄情
#3 · 2019-06-23 23:56

Have you tried boosting the importance of each word in the search term, like this:

Hormel^100 Corned^20 Beef^5 16^2 Ounces^1
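If you wanted to generate that boosted string from the raw search term instead of typing it by hand, a small plain-Ruby sketch might look like this. boost_terms is a hypothetical helper, the boost values are simply the ones from the line above, and it assumes the query parser in use accepts per-term term^boost syntax (the standard Lucene parser does; check how your dismax setup treats it).

```ruby
# Pair each term with a boost and join them, e.g. ["Hormel", 100] -> "Hormel^100".
# Expects exactly one boost per term.
def boost_terms(terms, boosts)
  unless terms.length == boosts.length
    raise ArgumentError, "need one boost per term"
  end
  terms.zip(boosts).map { |term, boost| "#{term}^#{boost}" }.join(" ")
end

terms  = "Hormel Corned Beef 16 Ounces".split
boosts = [100, 20, 5, 2, 1]   # values taken from the answer above
puts boost_terms(terms, boosts)   # Hormel^100 Corned^20 Beef^5 16^2 Ounces^1
```

Earlier words get larger boosts, so documents matching the start of the search term rank higher.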