Find list of terms indexed by Lucene

Posted 2019-02-04 23:27

Is it possible to extract the list of all the terms in a Lucene index as a list of strings? I couldn't find that functionality in the doc. Thanks!

Tags: lucene
2 answers
我想做一个坏孩纸
Reply #2 · 2019-02-04 23:37

In Lucene 4 (and 5):

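// Returns a Terms handle for the field (null if the field has no terms); walk it with a TermsEnum to read each term.
Terms terms = SlowCompositeReaderWrapper.wrap(directoryReader).terms("field");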

Edit:

This seems to be the 'correct' way now (Lucene 6 and up):

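// LuceneDictionary is in the lucene-suggest module (org.apache.lucene.search.spell).
LuceneDictionary ld = new LuceneDictionary( indexReader, "field" );
// getEntryIterator() replaces the older getWordsIterator() name; it can throw IOException, as can next().
BytesRefIterator iterator = ld.getEntryIterator();
BytesRef byteRef = null;
while ( ( byteRef = iterator.next() ) != null )
{
    // Decode the term's UTF-8 bytes into a Java String.
    String term = byteRef.utf8ToString();
}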
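Since the question asks for the terms as a list of strings, here is a minimal sketch that wraps the same loop in a helper method. The class name `TermLister` and the method name `listTerms` are hypothetical and only for illustration. It assumes a recent Lucene with the lucene-suggest module on the classpath and an already opened `IndexReader`.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.spell.LuceneDictionary;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.BytesRefIterator;

// Hypothetical helper, not part of Lucene.
public final class TermLister {

    private TermLister() {}

    // Collects every indexed term of one field into a list of strings.
    public static List<String> listTerms(IndexReader reader, String field) throws IOException {
        List<String> result = new ArrayList<>();
        LuceneDictionary dictionary = new LuceneDictionary(reader, field);
        BytesRefIterator iterator = dictionary.getEntryIterator();
        BytesRef ref;
        // next() returns null once all terms have been consumed.
        while ((ref = iterator.next()) != null) {
            // The returned BytesRef may be reused, so convert it right away.
            result.add(ref.utf8ToString());
        }
        return result;
    }
}
```

Calling `TermLister.listTerms(indexReader, "field")` should give the terms of that field in index order. On a large index the list can get very big, so iterating directly, as in the loop above, may be the better choice.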
forever°为你锁心
Reply #3 · 2019-02-04 23:46

Lucene 3:
