I'm currently using the highlighting feature that Elasticsearch offers in my query. However, the one thing I'm not quite clear on is how the results are ordered. I would prefer that the fragments come back in the order they appear in the paragraph rather than by importance/score, so that I can concatenate them with "..." separators in the same order as in the original document (similar to Google results). Currently they seem to be returned in some weighted order based on best match.
Is there a way to accomplish this without doing additional post-processing on the field after getting the highlight results?
I see there is an "order" : "score" option for a highlight, but there don't seem to be any other documented options to change the return order. (As an aside, I don't understand the difference between the default order and the scoring order.)
Here's a snippet of the highlight portion of my query:
    "highlight": {
        "fields": {
            "synopsis": {
                "fragment_size": 150,
                "number_of_fragments": 4
            }
        }
    }
So after doing a bit of playing around, I discovered that the fast-vector-highlighter will natively sort the fragments in order of appearance in the original document. To enable this, I needed to add "term_vector" : "with_positions_offsets" to my synopsis field mapping and then use the fast-vector-highlighter in my highlight query.
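For illustration, here is a rough sketch of what the mapping and highlight sections could look like, written as Python dicts so they can be dumped to JSON. The field type "text" and the explicit "type": "fvh" setting are my assumptions (older Elasticsearch versions use "string" fields and may pick the fast-vector-highlighter automatically once term vectors are stored), so adjust for your version.

```python
import json

# Mapping fragment for the highlighted field. The fast-vector-highlighter
# needs term vectors with positions and offsets.
# Assumption: the field type is "text" (older versions use "string").
synopsis_mapping = {
    "properties": {
        "synopsis": {
            "type": "text",
            "term_vector": "with_positions_offsets",
        }
    }
}

# Highlight section of the search request, same sizes as in the question.
# Assumption: "type": "fvh" is set explicitly to request the
# fast-vector-highlighter. Note there is no "order": "score" here.
highlight = {
    "fields": {
        "synopsis": {
            "type": "fvh",
            "fragment_size": 150,
            "number_of_fragments": 4,
        }
    }
}

print(json.dumps({"mappings": synopsis_mapping}, indent=2))
print(json.dumps({"highlight": highlight}, indent=2))
```

The field needs to be reindexed after the mapping change, since term vectors are stored at index time.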
NOTE: Using "order" : "score" would cause the ordering to follow the scoring scheme, which does not necessarily follow start-offset order. I believe the exact code for this comparator can be found here, which seems to base it on the fragment's boost and then its start offset.
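To make the difference concrete, here is a small Python sketch of the two orderings as I understand them. This is not the actual highlighter code: the Fragment class, its fields, and the sample values are made up for illustration, and the score ordering assumes "higher boost first, then lower start offset".

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    # Hypothetical stand-in for a highlighter fragment.
    text: str
    start_offset: int
    boost: float


def in_document_order(frags):
    # Order of appearance in the original text: sort by start offset only.
    return sorted(frags, key=lambda f: f.start_offset)


def in_score_order(frags):
    # Assumed "order": "score" behaviour: highest boost first,
    # ties broken by the smaller start offset.
    return sorted(frags, key=lambda f: (-f.boost, f.start_offset))


fragments = [
    Fragment("second hit", 120, 3.0),
    Fragment("first hit", 10, 1.0),
    Fragment("third hit", 300, 3.0),
]

print([f.text for f in in_document_order(fragments)])
# ['first hit', 'second hit', 'third hit']
print([f.text for f in in_score_order(fragments)])
# ['second hit', 'third hit', 'first hit']
```

With score ordering, the low-boost fragment at offset 10 ends up last even though it appears first in the document, which matches the behaviour I was seeing.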