Partial schema sharing - is it possible?

Posted 2019-07-30 02:56

Question:

We have a large number of Solr cores, which all share the same fields but have differing definitions for certain field types (e.g. for different languages).

For instance, in the following example, I have a full_text field of type text that exists in two different cores but I apply different filters to the text field in each core.

Core 1

<fields>
    <field name="full_text" type="text" indexed="true" />
</fields>
<types>
    <fieldType name="text" class="solr.TextField">
         <analyzer type="index">
             <tokenizer class="solr.StandardTokenizerFactory" />
             <filter class="solr.SnowballPorterFilterFactory" language="English"/>
         </analyzer>
    </fieldType>
</types>

Core 2

<fields>
    <field name="full_text" type="text" indexed="true" />
</fields>
<types>
    <fieldType name="text" class="solr.TextField">
         <analyzer type="index">
             <tokenizer class="solr.StandardTokenizerFactory" />
             <filter class="solr.SnowballPorterFilterFactory" language="Portuguese"/>
         </analyzer>
    </fieldType>
</types>

As we have 20+ cores, maintaining and updating the field set is difficult.

Is there any way to inherit the fields from some parent and override (or apply some kind of transform to) the field type definitions? My search has turned up nothing so far.

Answer 1:

1. Extract the common field types

You can use XInclude for this. Write the field types into their own XML file and include that file from each of the schema.xml files.

Paige Cook has answered this for a similar question, "How to include another XML file from within a Solr schema.xml?":

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="foo" version="1.5">
  <fields>
    <field name="full_text" type="text" indexed="true" />
  </fields>
  <xi:include href="/path/to/field_types.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>
</schema>

Then your new field_types.xml would contain the type definition:

<types>
    <fieldType name="text" class="solr.TextField">
         <analyzer type="index">
             <tokenizer class="solr.StandardTokenizerFactory" />
             <filter class="solr.SnowballPorterFilterFactory" language="Portuguese"/>
         </analyzer>
    </fieldType>
</types>

You will need Solr 4.x or 5.x, as this was fixed in SOLR-3087.
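
The same mechanism can also be turned around to match the question more literally: keep the field list in one shared file and leave the field types in each core's schema.xml. A minimal sketch, assuming a hypothetical shared file named shared_fields.xml (the file name and path are placeholders, not part of the linked answer):

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!-- schema.xml of one core; /path/to/shared_fields.xml is a placeholder -->
<schema name="foo" version="1.5">
  <!-- Pulls in the <fields> element that all cores share -->
  <xi:include href="/path/to/shared_fields.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>
  <!-- Field types stay local, so each core can use its own language -->
  <types>
    <fieldType name="text" class="solr.TextField">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory" />
        <filter class="solr.SnowballPorterFilterFactory" language="English"/>
      </analyzer>
    </fieldType>
  </types>
</schema>
```

Here shared_fields.xml would hold only the <fields> element from the question, since an included file needs a single root element.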

2. Use core.properties to define the variable part

As your sample shows, each core seems to be designed for one language. You can put this language into core.properties, the properties file that resides within each core when you follow the core configuration approach introduced with Solr 4.4.

Within core.properties you can introduce new properties, e.g. a property called typeLanguage:

name=core-1
loadOnStartup=false
typeLanguage=English

This property can then be used as a parameter in the previously defined field type via the ${xxx} notation, e.g. language="${typeLanguage}":

<fieldType name="text" class="solr.TextField">
     <analyzer type="index">
         <tokenizer class="solr.StandardTokenizerFactory" />
         <filter class="solr.SnowballPorterFilterFactory" language="${typeLanguage}" />
     </analyzer>
</fieldType>
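
If some cores might not define the property, Solr's property substitution also accepts a default value after a colon. A sketch, assuming English is an acceptable fallback (the choice of default is an assumption, not something the question specifies):

```xml
<fieldType name="text" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <!-- Uses typeLanguage from core.properties, or English if it is not set -->
    <filter class="solr.SnowballPorterFilterFactory" language="${typeLanguage:English}" />
  </analyzer>
</fieldType>
```

Whether a silent fallback is desirable depends on your setup. Leaving the default out means a core that lacks the property should fail to load instead of quietly indexing with the wrong stemmer.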


Tags: solr