Solr will use Highlighter instead of FastVectorHig

Posted 2019-09-19 02:39

Question:

Hi, I'm developing a Rails app with the Solr 4.1 search engine.

When I add highlighting to search, Solr starts spamming the tomcat6 log with this warning:

Jan 29, 2015 12:13:38 PM org.apache.solr.highlight.DefaultSolrHighlighter useFastVectorHighlighter
WARNING: Solr will use Highlighter instead of FastVectorHighlighter because *Field_Name* field does not store TermPositions and TermOffsets.

Example of my field in schema.xml:

<field name="name" type="text" indexed="true" stored="true" multiValued="true"/>

What I found in the documentation:

The Standard Highlighter is the swiss-army knife of the highlighters. It has the most sophisticated and fine-grained query representation of the three highlighters. For example, this highlighter is capable of providing precise matches even for advanced queryparsers such as the surround parser. It does not require any special datastructures such as termVectors, although it will use them if they are present. If they are not, this highlighter will re-analyze the document on-the-fly to highlight it. This highlighter is a good choice for a wide variety of search use-cases.

FastVector Highlighter

The FastVector Highlighter requires term vector options (termVectors, termPositions, and termOffsets) on the field, and is optimized with that in mind. It tends to work better for more languages than the Standard Highlighter, because it supports Unicode breakiterators. On the other hand, its query-representation is less advanced than the Standard Highlighter: for example it will not work well with the surround parser. This highlighter is a good choice for large documents and highlighting text in a variety of languages.

FastVector highlighting also provides faster search: http://solr.pl/en/2011/06/13/solr-3-1-fastvectorhighlighting/.

But what is the difference in configuration between Highlighting and FastVectorHighlighting?

And will users see a difference in search results if I change Highlighting to FastVectorHighlighting?

Is all I need to do to turn on FastVectorHighlighting to add termVectors="on" termPositions="on" termOffsets="on" to each field in schema.xml? Like:

<field name="name" type="text" indexed="true" stored="true" multiValued="true" termVectors="on" termPositions="on" termOffsets="on"/>

I also found this problem in the Solr issue tracker: https://issues.apache.org/jira/browse/SOLR-5544

But I still don't know how to fix the WARNING, and my log file is growing by 500 MB each second! This is critical, because the search server will stop if there is no free space left on the volume.

Please, help.

Answer 1:

I found fields in my schema.xml that had the termVectors="true" attribute without termPositions="true" termOffsets="true".

That was the reason for the warnings.
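A check like this can be scripted rather than done by eye. The following is only a minimal sketch: it assumes schema.xml is well-formed XML with `<field>` elements, it treats both "true" and "on" as enabled because both spellings appear in this thread, and the function names are hypothetical. It reports only fields that enable termVectors without the other two attributes; fields that have none of the three must still be picked out from the warnings themselves.

```python
import xml.etree.ElementTree as ET

# Attributes the FastVectorHighlighter needs on a field.
REQUIRED = ("termVectors", "termPositions", "termOffsets")


def is_on(value):
    """Treat "true" and "on" as enabled (assumption: the two spellings used in this thread)."""
    return value is not None and value.strip().lower() in ("true", "on")


def find_incomplete_fields(schema_xml):
    """Map field name -> missing attributes, for fields that enable
    termVectors but not termPositions and termOffsets."""
    root = ET.fromstring(schema_xml)
    incomplete = {}
    for field in root.iter("field"):
        if not is_on(field.get("termVectors")):
            continue  # field does not use term vectors at all
        missing = [attr for attr in REQUIRED if not is_on(field.get(attr))]
        if missing:
            incomplete[field.get("name")] = missing
    return incomplete


if __name__ == "__main__":
    # Hypothetical schema fragment for illustration only.
    sample = """<schema><fields>
      <field name="name" type="text" indexed="true" stored="true" termVectors="true"/>
      <field name="phone" type="text" indexed="true" stored="true"/>
    </fields></schema>"""
    for name, missing in find_incomplete_fields(sample).items():
        print(name, "is missing", ", ".join(missing))
```

To run it against a real file, read the file contents and pass them in, for example `find_incomplete_fields(open("schema.xml").read())`.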

So, here is what I did:

  • added termPositions="true" termOffsets="true" to the fields in schema.xml which had only the termVectors="true" attribute
  • added termVectors="true" termPositions="true" termOffsets="true" to each field that I found in the warnings (e.g. "...field phone does not store positions and offsets...")

After that I ran reindexing, but it did not stop the warning spam in the logs.

The reason for this: Solr does not see schema.xml updates until Tomcat is restarted.

So I restarted Tomcat:

  • sudo /etc/init.d/tomcat6 restart

  • Then I kicked off reindexing again, because all highlighting had been lost.
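To confirm the change took effect after reindexing, one option is to send a search that explicitly asks for the FastVectorHighlighter and then watch the Tomcat log for the warning. The sketch below only builds the request URL and sends nothing. The host, port, core name, and field name are assumptions for illustration; `hl`, `hl.fl`, and `hl.useFastVectorHighlighter` are the standard Solr 4.x highlighting request parameters.

```python
from urllib.parse import urlencode


def highlight_query_url(base_url, query, field):
    """Build a /select URL that requests FastVectorHighlighter highlighting on one field."""
    params = {
        "q": query,
        "wt": "json",
        "hl": "true",                           # turn highlighting on
        "hl.fl": field,                         # field to highlight
        "hl.useFastVectorHighlighter": "true",  # ask for FVH instead of the standard highlighter
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical host and core name; adjust to the actual deployment.
    print(highlight_query_url("http://localhost:8080/solr/collection1", "name:solr", "name"))
```

If the log still shows the warning for the requested field, that field is most likely still indexed without positions and offsets.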

Many thanks to @chefe for the help!