I use WhitespaceAnalyzer as the query analyzer.
If I have 2 documents:
| text | a b c |
| text | b a c |
text is a field.
Now the index structure is something like this:
|Term| in document |
| a | a b c / b a c|
| b | a b c / b a c|
| c | a b c / b a c|
And I have a query:
| text | a b c |
How can I get a higher score for a b c and a lower one for b a c?
Does Lucene support calculating score depending on relative position?
I found that this would help:
PhraseQuery phraseQuery = new PhraseQuery();
phraseQuery.setSlop(1);
In this way they would get different scores.
See more: http://www.blogjava.net/tangzurui/archive/2008/09/22/230357.html
I also came across a related question:
https://stackoverflow.com/questions/18394532/how-can-lucenes-scoring-depend-on-terms-relative-position-in-the-document
It depends on which type of query you use. Some queries give a higher score when the phrase you search for appears in the correct order (e.g. new york vs. york new). According to the Lucene documentation, you can use the score explanation to see why A B C gets a higher score than B A C.
http://lucene.apache.org/core/3_6_2/scoring.html
UPD. For storing term positions, look at this if you are using Lucene 3: http://lucene.apache.org/core/3_0_3/api/core/org/apache/lucene/document/Field.TermVector.html
The score contribution of a phrase match depends on the distance:
In your case, the query "a b c" matches the document "a b c" with distance 0, which gives the highest phrase score. For the document "b a c" the distance is greater than 0, so the score is lower. Note that "b a c" only matches at all if the slop is at least 2, because swapping two adjacent terms counts as two moves.
For more details, look at the source code of the org.apache.lucene.search.SloppyPhraseScorer class.
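To make the distance idea concrete, here is a minimal plain-Java sketch. It does not use Lucene and is not the real SloppyPhraseScorer. The class and method names (SloppyPhraseSketch, matchDistance, sloppyFreq) are made up for illustration. It assumes a non-empty query and that each query term occurs exactly once in the document. The 1 / (distance + 1) weight follows the classic similarity's sloppyFreq, and the real scorer then feeds that value into the usual tf/idf computation.

```java
import java.util.Arrays;
import java.util.List;

// Simplified illustration only; NOT Lucene's SloppyPhraseScorer.
final class SloppyPhraseSketch {

    // Shift each term's document position by its offset in the query and
    // return the spread (max - min) of the shifted positions.
    // Assumes a non-empty query whose terms each occur once in the document.
    // Returns -1 if a query term is missing from the document.
    static int matchDistance(List<String> query, List<String> doc) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int offset = 0; offset < query.size(); offset++) {
            int pos = doc.indexOf(query.get(offset));
            if (pos < 0) {
                return -1;
            }
            int shifted = pos - offset;
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        return max - min;
    }

    // Weight of a single sloppy match: closer matches contribute more.
    static float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("a", "b", "c");
        for (String text : new String[] {"a b c", "b a c"}) {
            int d = matchDistance(query, Arrays.asList(text.split(" ")));
            System.out.println(text + " -> distance " + d + ", weight " + sloppyFreq(d));
        }
    }
}
```

Under these assumptions, "a b c" comes out with distance 0 and "b a c" with distance 2, so the second document gets the smaller weight. Distance 2 is also why a slop of 1 is not enough to match "b a c".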