Boost evenly across field of varying length

Posted 2019-05-28 13:03

Question:

I've got a text field that can potentially have multiple values.

doc 1: field a:"X Y"

doc 2: field a:"X"

I want to be able to do:

a:X^5

And have both doc 1 and 2 get an identical score. I've been messing around with all the field options, but I always end up with doc 2 getting double the score of doc 1.

I've tried setting multiValued="true", but get the same result.

Is there some way to set up my search or the field definition so that the boost depends only on the presence of the search term and is not affected by the rest of the field's contents?

Answer 1:

Disable norms by setting omitNorms=true in your schema and reindex - it should disable the length normalization for the field and give you the desired results.

For more details of what omitNorms does, see this.
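
As a minimal sketch of that change, assuming the field is named a as in the question and uses a field type called text (the type name and the indexed/stored settings are placeholders, not taken from the question), the field definition in the schema might look like this:

```xml
<!-- Hypothetical field definition; only omitNorms="true" is the relevant part. -->
<!-- The type name "text" and the indexed/stored values are assumptions. -->
<field name="a" type="text" indexed="true" stored="true" omitNorms="true"/>
```

Norms are written at index time, so the existing documents need to be reindexed before the scores change.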



Answer 2:

The field a of doc 2 has only one term as compared to doc 1 which has two.

Solr's DefaultSimilarity implementation takes the length norm, which depends on the number of terms in the field, into account when calculating the score.

lengthNorm is 1.0 / Math.sqrt(numTerms)

LengthNorm allows you to make shorter documents score higher.
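
To make the formula concrete for the two documents in the question, here is a small standalone sketch. It only evaluates the formula quoted above; it does not call Solr or Lucene, and the class and method names are made up for illustration.

```java
public class LengthNormDemo {
    // The formula quoted in this answer: 1 / sqrt(number of terms in the field).
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    public static void main(String[] args) {
        // doc 1: a:"X Y" has two terms; doc 2: a:"X" has one term.
        System.out.println("doc 1 lengthNorm = " + lengthNorm(2));
        System.out.println("doc 2 lengthNorm = " + lengthNorm(1));
    }
}
```

The one-term field gets a norm of 1.0 and the two-term field about 0.71, so doc 2 scores higher for the same query. The norm actually stored in the index is a compressed approximation, so real scores will not match these raw values exactly.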

You can provide your own implementation of the Similarity class that doesn't take lengthNorm into account.
Check the computeNorm method implementation.

Alternatively, you can turn off norms using omitNorms=true.
Norms allow for index-time boosts and field length normalization. This lets you add boosts to fields at index time and makes shorter documents score higher.
So you would lose both of the above if you omit norms.



Tags: solr