Why doesn't Lucene support any type of update?

Posted 2020-03-15 05:25

My use case involves indexing a Lucene document and then, on multiple future occasions, adding terms that point to this existing doc, without deleting and re-adding the entire document for each new term (for performance reasons, and because I don't keep the original terms).

I do know that a document can not be truly updated. My question is why?

Or more precisely, why are all forms of updates (terms, stored fields) not supported?
Why isn't it possible to add another term that points to an existing document? Technically, isn't all that's needed to place the existing doc ID in the term's posting list? Why is that hard? Are there some immutable statistics in the way?

Are there any workarounds for supporting my use case of adding a term (indexed field) to an existing doc?

1 answer
我命由我不由天
#2 · 2020-03-15 06:02

I do know that a document can not be truly updated. My question is why?

Gili, editing a document changes the posting lists of the affected terms, and that is problematic because of how a posting list is laid out. Each posting list is sorted by doc ID and stored sequentially, in segments that are not modified once written. So to add a document to a term's posting list, the document has to be given a doc ID higher than any already in that list, which is done by deleting and re-indexing the entire document.
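To make the point about sorted, sequential posting lists concrete, here is a toy sketch in Python. It is not Lucene's code or on-disk format; ToyPostings and rebuild_with are hypothetical names, and the gap encoding is only an assumption chosen to mimic a sorted, sequentially stored list. It shows that appending a higher doc ID is cheap, while adding an older doc ID means decoding and rewriting the whole list.

```python
from bisect import bisect_left


class ToyPostings:
    """Toy append-only posting list: sorted doc IDs stored as gaps (hypothetical model)."""

    def __init__(self):
        self.gaps = []      # gap-encoded doc IDs, in increasing doc-ID order
        self.last_doc = -1  # highest doc ID written so far

    def append(self, doc_id):
        # Cheap path: only valid when doc_id is larger than everything stored.
        if doc_id <= self.last_doc:
            raise ValueError("doc IDs must be appended in increasing order")
        self.gaps.append(doc_id - max(self.last_doc, 0))
        self.last_doc = doc_id

    def doc_ids(self):
        # Decode the gaps back into absolute doc IDs with a running sum.
        out, current = [], 0
        for gap in self.gaps:
            current += gap
            out.append(current)
        return out


def rebuild_with(postings, doc_id):
    """Add an older doc ID by decoding and rewriting the entire list."""
    ids = postings.doc_ids()
    pos = bisect_left(ids, doc_id)
    if pos == len(ids) or ids[pos] != doc_id:
        ids.insert(pos, doc_id)
    rebuilt = ToyPostings()
    for d in ids:
        rebuilt.append(d)
    return rebuilt
```

In this model, appending doc 12 to a list holding 2, 5, 9 touches only the tail, whereas adding doc 4 changes the gaps that follow it, so the list has to be rewritten. That is roughly the trade-off that makes "delete and re-add under a new, higher doc ID" the practical route.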
