SolrJ - Indexing multiple classes and ensuring doc

Posted 2019-09-06 07:58

Question:

I want to use SolrJ to index a set of Java classes. Each instance is identified by an id that is unique only within its class. However, when I use the SolrJ @Field annotation to build Solr documents from these classes, nothing guarantees that the resulting documents are unique in the Solr index, because the same id value can occur in more than one class.

I tried combining the annotation approach with the Solr UUID data type to generate unique id values in a dedicated field of the Solr schema, but without success.

As a result, I created a simple annotation mechanism, not very different from the SolrJ one, that guarantees uniqueness across multiple classes. It combines the object's class name with its id to form a sort of UUID, which is then stored in a field defined in the Solr schema.
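The class-name-plus-id idea can be sketched roughly as follows. This is not the actual annotation mechanism from the question, only a minimal illustration of how such a key might be built; `CompositeIdDemo`, `compositeId`, `Author` and `Book` are hypothetical names, and the `:` separator is an assumption.

```java
// Sketch only: all names here are hypothetical and not part of SolrJ.
final class CompositeIdDemo {

    // Joins the fully qualified class name and the per-class id into one key.
    static String compositeId(Class<?> type, Object id) {
        if (type == null || id == null) {
            throw new IllegalArgumentException("type and id must not be null");
        }
        return type.getName() + ":" + id;
    }

    // Two unrelated classes whose ids are unique only within each class.
    static final class Author {
        final long id;
        Author(long id) { this.id = id; }
    }

    static final class Book {
        final long id;
        Book(long id) { this.id = id; }
    }

    public static void main(String[] args) {
        Author author = new Author(1L);
        Book book = new Book(1L);

        // Same per-class id, but the class name keeps the keys apart.
        String authorKey = compositeId(Author.class, author.id);
        String bookKey = compositeId(Book.class, book.id);

        System.out.println(authorKey);
        System.out.println(bookKey);
        System.out.println(authorKey.equals(bookKey));
    }
}
```

The resulting string would go into whichever field the schema declares as its uniqueKey. One thing to keep in mind with this variant is that the key changes whenever a class is renamed or moved to another package, so documents already in the index would need to be reindexed.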

I'm not sure whether I'm missing something, so I would like to know whether the working solution described above is good enough for my case, or whether there are cleaner or better alternatives.

Answer 1:

I think this is a valid approach. We are doing something similar with multiple indexes at our location. For example, we have four different types of items in our database that we load into a common schema in the index, and we prefix each database table id with the first two unique letters of its type so that the resulting id is unique.

Also, IMO, indexing multiple distinct types in one index is a matter of preference rather than a rule of thumb, as discussed in the links below:

  • Single schema versus multiple schemas in solr for different document types
  • Running Multiple Indexes


Answer 2:

Typically one POJO will correspond to one schema and one Solr core. I am not sure why you would want to index different POJOs into one Solr core.

That said, your class-name approach should work fine. Alternatively, you can declare a static CLASS_ID field in each of your classes, give each class a different value, and form the Solr document ID by concatenation, e.g. id:CLASS_ID.
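A minimal sketch of the CLASS_ID variant might look like this. The class names, the CLASS_ID values and the `solrId()` method are assumptions made for illustration; only the CLASS_ID field and the id:CLASS_ID layout come from the suggestion above.

```java
// Sketch only: ClassIdDemo, Author, Book and solrId() are hypothetical names.
final class ClassIdDemo {

    static final class Author {
        // Must differ from the CLASS_ID of every other indexed class.
        static final String CLASS_ID = "author";
        final long id;
        Author(long id) { this.id = id; }

        // Builds the document ID in the id:CLASS_ID layout.
        String solrId() { return id + ":" + CLASS_ID; }
    }

    static final class Book {
        static final String CLASS_ID = "book";
        final long id;
        Book(long id) { this.id = id; }

        String solrId() { return id + ":" + CLASS_ID; }
    }

    public static void main(String[] args) {
        // Same per-class id, different document IDs.
        System.out.println(new Author(7L).solrId());
        System.out.println(new Book(7L).solrId());
    }
}
```

Compared with using the class name, a hand-picked CLASS_ID stays the same when a class is renamed or moved, at the cost of having to keep the values distinct manually.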