how lucene use skip list in inverted index?

2019-02-14 20:02发布

In some blogs and lucene website,I know lucene use data structure "skip list" in inverted index. But I have some puzzle about it.

1:In general,skip list maybe used in memory ,but inverted index is stored in disk. So how lucene use it when search on the index? just scanning it on disk or load it to memory?

2:skip list's insert operator often use random(0,1) to decide whether insert to next level,but in luncene introdution,it seems a fixed interval in every terms,so how lucene create the "skip list" different or not?

Please correct me if I am wrong.

标签： lucene inverted-index skip-lists

2条回答

太酷不给撩

2楼-- · 2019-02-14 20:44

Lucene uses memory in a couple different ways, even though the index is persisted on a disk when the IndexReader is created for searching and for operations like sorting (field cache):

http://blog.mikemccandless.com/2010/07/lucenes-ram-usage-for-searching.html

Basically those binary files get copied into RAM for much faster scanning and reducing I/O. You get a hint in the above link how searching with some parameters can force Lucene to "skip terms in searching" Hence, where that data structure can be used.

Lucene is open source, so you can see the code for yourself what is being used in Java or Lucene.NET for the C# implementation.

0人赞添加讨论(0) 举报

祖国的老花朵

3楼-- · 2019-02-14 20:54

see To accelerate posting list skips, Lucene uses skip lists

0人赞添加讨论(0) 举报