How to use TermVector in Lucene 4.0

Posted 2019-02-02 14:56

In the indexing method I use the following line:

Field contentsField = new Field("contents", new FileReader(f), Field.TermVector.YES);

However, in Lucene 4.0 this constructor is deprecated and new TextField should be used instead of new Field.

But the problem with TextField is that none of its constructors accept a TermVector argument.

Is there a way to include the Term Vector in my indexing in Lucene 4.0 with the new constructors?

Thanks

2 answers
We Are One
Reply #2 · 2019-02-02 15:42

I had the same problem, so I simply created my own Field subclass:

import java.io.Reader;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;

/** A TextField-like field that also stores term vectors (with positions). */
public class VecTextField extends Field {

    /** Indexed, tokenized, not stored; term vectors with positions. */
    public static final FieldType TYPE_NOT_STORED = new FieldType();

    /** Indexed, tokenized, stored; term vectors with positions. */
    public static final FieldType TYPE_STORED = new FieldType();

    static {
        TYPE_NOT_STORED.setIndexed(true);
        TYPE_NOT_STORED.setTokenized(true);
        TYPE_NOT_STORED.setStoreTermVectors(true);
        TYPE_NOT_STORED.setStoreTermVectorPositions(true);
        TYPE_NOT_STORED.freeze(); // make the shared type immutable

        TYPE_STORED.setIndexed(true);
        TYPE_STORED.setTokenized(true);
        TYPE_STORED.setStored(true);
        TYPE_STORED.setStoreTermVectors(true);
        TYPE_STORED.setStoreTermVectorPositions(true);
        TYPE_STORED.freeze(); // make the shared type immutable
    }

    /**
     * Creates a new VecTextField with a Reader value.
     * Field throws IllegalArgumentException for a stored Reader value,
     * so pass Store.NO here.
     */
    public VecTextField(String name, Reader reader, Store store) {
        super(name, reader, store == Store.YES ? TYPE_STORED : TYPE_NOT_STORED);
    }

    /** Creates a new VecTextField with a String value. */
    public VecTextField(String name, String value, Store store) {
        super(name, value, store == Store.YES ? TYPE_STORED : TYPE_NOT_STORED);
    }

    /** Creates a new un-stored VecTextField with a TokenStream value. */
    public VecTextField(String name, TokenStream stream) {
        super(name, stream, TYPE_NOT_STORED);
    }

}

Hope this helps

淡お忘
Reply #3 · 2019-02-02 15:51

TextField is a convenience class for users who need indexed fields without term vectors. If you need term vectors, just use Field directly. It takes a few more lines of code: create a FieldType instance, set indexed, tokenized, and storeTermVectors to true on it, and then pass that FieldType instance to the Field constructor.
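
For illustration, here is a minimal sketch of that approach, assuming the Lucene 4.x API (where FieldType still has setIndexed). The class name TermVectorFieldExample and the helper buildDocument are made up for this example; only the "contents" field name comes from the question.

```java
import java.io.Reader;
import java.io.StringReader;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;

// Hypothetical example class; not part of Lucene.
public class TermVectorFieldExample {

    // Indexed, tokenized, not stored, with term vectors enabled.
    static final FieldType TERM_VECTOR_TYPE = new FieldType();

    static {
        TERM_VECTOR_TYPE.setIndexed(true);          // Lucene 4.x setter
        TERM_VECTOR_TYPE.setTokenized(true);
        TERM_VECTOR_TYPE.setStoreTermVectors(true);
        TERM_VECTOR_TYPE.freeze();                  // prevent later modification
    }

    // Hypothetical helper: wraps the reader in a "contents" field
    // that records term vectors, and adds it to a new document.
    static Document buildDocument(Reader contents) {
        Document doc = new Document();
        doc.add(new Field("contents", contents, TERM_VECTOR_TYPE));
        return doc;
    }

    public static void main(String[] args) {
        Document doc = buildDocument(new StringReader("some sample text"));
        // Reports whether the field's type has term vectors enabled.
        System.out.println(doc.getField("contents").fieldType().storeTermVectors());
    }
}
```

In the question's indexing method, new FileReader(f) would take the place of the StringReader. The type is left un-stored because Field does not accept a stored type together with a Reader value.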
