This question already has an answer here:
- document length in lucene 4.0 1 answer
This is the second time I'm posting this question (the first one is here, but it got no answer, or only a partial one). I've been struggling with this issue and am lost in the Lucene API.
What I'm interested in is getting the document length from Lucene. When I use searcher.explain (with BM25), I can see that this value exists; I only need to fetch it.
I would highly appreciate an example, as I'm new to Lucene; just a pointer to the API won't help.
One naive way to do it is to store the length in a separate field, computed with string.length() in Java, and retrieve it at query time. However, this feature already exists (otherwise BM25 wouldn't work), so I don't want to store something redundantly.
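To make the naive approach concrete, here is a minimal sketch in plain Java of the length computation I have in mind. The class and method names are hypothetical, and the whitespace split is only my assumption standing in for whatever the analyzer really does; the resulting number would then have to be written to an extra stored field at index time and read back at query time.

```java
public class NaiveDocLength {

    // Length in characters: this is what String.length() gives.
    static int charLength(String text) {
        return text.length();
    }

    // Rough length in tokens, splitting on whitespace.
    // Assumption: a real analyzer may tokenize differently
    // (stop words, punctuation, stemming), so this is only an estimate.
    static int tokenLength(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        return trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        String body = "the quick brown fox";
        // Either value could be stored next to the document at index time.
        System.out.println("chars:  " + charLength(body));
        System.out.println("tokens: " + tokenLength(body));
    }
}
```

Note that string.length() counts characters, whereas a length measured in tokens is probably closer to what a ranking function would use, which is one more reason I would rather read the value Lucene already keeps.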
I would highly appreciate a more detailed explanation of how to achieve this with Lucene 4.0. If you're not able to provide an answer, please do not reply just for the sake of replying (others will then skip my post, thinking it is solved), and please don't just send me a pointer to the API (e.g. "See Similarity.computeNorm" by Robert Muir), because this won't help me. I need more details: how do I use FieldInvertState or Similarity.computeNorm? At query time or at index time? A small fragment of code would be helpful. Please consider that I'm not an expert here; otherwise I wouldn't be asking.
Thanks in advance.