I know that Elasticsearch takes into account the length of a field when computing the score of the documents retrieved by a query. The shorter the field, the higher the weight (see The field-length norm).
I like this behaviour: when I search for "iphone", I am much more interested in "iphone 6" than in "Crappy accessories for: iphone 5 iphone 5s iphone 6".
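For reference, the field-length norm documented for the classic TF/IDF similarity is 1 / sqrt(number of terms in the field). Here is a rough sketch of that arithmetic in Python. The whitespace split is only an assumed stand-in for a real analyzer, and the stored norm is encoded with reduced precision, so the values Elasticsearch actually uses will be coarser than these.

```python
import math


def field_length_norm(text: str) -> float:
    """Approximate the classic norm: 1 / sqrt(number of terms in the field)."""
    # Assumption: splitting on whitespace approximates the analyzer's token count.
    num_terms = len(text.split())
    return 1.0 / math.sqrt(num_terms)


short_title = "iphone 6"
long_title = "Crappy accessories for: iphone 5 iphone 5s iphone 6"

print(round(field_length_norm(short_title), 3))  # 2 terms -> 0.707
print(round(field_length_norm(long_title), 3))   # 9 terms -> 0.333
```

So, all else being equal, the two-term title gets roughly twice the norm of the nine-term one.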
Now, I would like to boost this effect. Let's say that I want to double its importance.
I know that one can modify the score using a function_score query, and I guess that I can achieve what I want via script_score.
I tried to add another field-length norm to the score like this:
{
  "query": {
    "function_score": {
      "boost_mode": "replace",
      "query": {...},
      "script_score": {
        "script": "_score + norm(doc)"
      }
    }
  }
}
But I failed badly, getting this error: [No parser for element [function_score]]
EDIT:
My first error was that I hadn't wrapped the function score in a "query". I have now edited the code above. My new error says:
GroovyScriptExecutionException[MissingMethodException
[No signature of method: Script5.norm() is applicable for argument types:
(org.elasticsearch.search.lookup.DocLookup) values:
[<org.elasticsearch.search.lookup.DocLookup@2c935f6f>]
Possible solutions: notify(), wait(), run(), run(), dump(), any()]]
EDIT: I provided a first answer, but I'm hoping for a better one
It looks like you could achieve that using a field of type token_count together with a field_value_factor function score. So, something like this in the field mapping:
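A minimal sketch of such a mapping, written as a Python dict that serializes to the JSON body. The field name `name` and the sub-field name `length` are assumptions chosen for illustration, and `"string"` is the field type used by older Elasticsearch versions (newer ones use `"text"`).

```python
import json

# Hypothetical mapping: "name" is an assumed field, "length" an assumed sub-field.
mapping = {
    "properties": {
        "name": {
            "type": "string",  # use "text" on newer Elasticsearch versions
            "fields": {
                # Multi-field that indexes the number of tokens in "name".
                "length": {
                    "type": "token_count",
                    "analyzer": "standard",
                }
            },
        }
    }
}

print(json.dumps(mapping, indent=2))
```

The token count should then be addressable as `name.length` in queries and scripts.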
This will use the number of tokens in the field. If you want to use the number of characters, you can change the analyzer from standard to a custom one that tokenizes each character. Then in the query:
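A sketch of what such a query could look like, again as a Python dict. It assumes a token_count sub-field hypothetically named `name.length`, and the inner `match` query is only a placeholder for whatever query is actually being run. The `reciprocal` modifier turns the token count into 1 / count, and with the default `boost_mode` of `multiply` that factor is multiplied into the query score.

```python
import json

# Hypothetical request body: field names and the inner match query are assumptions.
length_factor_query = {
    "query": {
        "function_score": {
            "query": {"match": {"name": "iphone"}},
            "field_value_factor": {
                "field": "name.length",    # assumed token_count sub-field
                "modifier": "reciprocal",  # uses 1 / field value
            },
        }
    }
}

print(json.dumps(length_factor_query, indent=2))
```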
I have something that kind of works. With the following, I deduct the length of a field of my interest from the score.
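A sketch of a query along these lines, as a Python dict. It assumes the length is available as a numeric field, hypothetically called `name.length` (for instance a token_count sub-field), and the inner `match` query is a placeholder. With `boost_mode` set to `replace`, the script's result becomes the final score.

```python
import json

# Hypothetical request body: "name.length" and the inner match query are assumptions.
script_query = {
    "query": {
        "function_score": {
            "boost_mode": "replace",
            "query": {"match": {"name": "iphone"}},
            "script_score": {
                # Subtract the stored length from the original relevance score.
                "script": "_score - doc['name.length'].value"
            },
        }
    }
}

print(json.dumps(script_query, indent=2))
```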
However, I cannot control the relative weight of the number I am subtracting compared to the old score. That's why I am not accepting my own answer: I'll wait for better ones for a while. Ideally, I'd love to have a way to access the field-length norm function within the script_score, or to get an equivalent result.