Under what conditions should I use a Solr core?

Posted 2019-04-13 21:23

Question:

I am using Solr version 3.0.1 and am about to move to Solr 4.6.0. Until now I have used Solr without defining a core (I think Solr 3.0.1 doesn't have cores yet). Now that I want to upgrade to version 4.6.0, which brings some new features, I have three questions:

  1. What exactly is a Solr core?
  2. When should I use a Solr core?
  3. Is it right that each Solr core is like a table in a (relational) database? That is, can I save different types of data in different cores?

Thanks in advance.

Answer 1:

A core is basically an index with a given schema and will hold a set of documents.

You should use different cores for different collections of documents. That does not mean you have to store each kind of document in its own index.

Some examples:

  • you could store the same documents in different languages on different cores and select the core based on the configured language;
  • you could store different types of documents in different cores to keep them physically separated;
  • but at the same time you could store different documents in the same index and differentiate them by a field value.

It really depends on your use case.
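To make the two layouts above concrete, here is a minimal sketch that only builds query URLs and does not contact a server. The base URL, the core names (`docs_en`, `docs`) and the field name `doc_type` are assumptions made up for illustration; only the `/select` handler and the `q`, `fq` and `wt` parameters are standard Solr.

```python
from urllib.parse import urlencode

# Assumed base URL; 8983 is Solr's default port, adjust for your setup.
SOLR_BASE = "http://localhost:8983/solr"


def select_url(core, query, filter_query=None):
    """Build a /select URL for one core, optionally narrowed by a filter query."""
    params = [("q", query), ("wt", "json")]
    if filter_query:
        params.append(("fq", filter_query))
    return f"{SOLR_BASE}/{core}/select?{urlencode(params)}"


def url_for_language(language, query):
    """Layout 1: one core per language (hypothetical names docs_en, docs_fr, ...)."""
    return select_url(f"docs_{language}", query)


def url_for_type(doc_type, query):
    """Layout 2: one shared core, document kinds told apart by a field value.

    The core name "docs" and the field "doc_type" are hypothetical.
    """
    return select_url("docs", query, filter_query=f"doc_type:{doc_type}")
```

With the first layout the application picks the core; with the second it always queries the same core and lets a filter query do the separation.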



Answer 2:

You have to think up front about what type of queries you are going to execute against your Solr index. You then lay out the schema of a core, or of several cores, accordingly.

If, for example, you execute JOIN queries on your relational DB, those won't be very efficient (if possible at all) with lots of documents in the Solr index, because this is the NoSQL world (read here as: non-relational). In such a case you might need to duplicate data from several DB tables into one core's schema.
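As a rough illustration of duplicating data from several tables into one core's schema, the sketch below flattens rows from two tables into one document per order. The table shapes and every field name (`customer_id`, `order_total`, `customer_name`, `customer_city`) are invented for the example; a real schema would follow your own queries.

```python
def denormalize(orders, customers):
    """Flatten order rows and their customer rows into one document per order."""
    # Index customers by primary key so each order can look up its customer.
    customers_by_id = {c["id"]: c for c in customers}
    docs = []
    for order in orders:
        customer = customers_by_id.get(order["customer_id"], {})
        docs.append({
            "id": f"order-{order['id']}",
            "order_total": order["total"],
            # Customer columns are copied into every order document,
            # so no join is needed at query time.
            "customer_name": customer.get("name"),
            "customer_city": customer.get("city"),
        })
    return docs


customers = [{"id": 1, "name": "Ada", "city": "London"}]
orders = [
    {"id": 10, "customer_id": 1, "total": 25.0},
    {"id": 11, "customer_id": 1, "total": 40.0},
]
docs = denormalize(orders, customers)
```

The price of this layout is that a change to a customer row means reindexing every document that carries a copy of it.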

As Francisco has already mentioned, a core is physically represented as an independent entity with its own schema, config and index data.

One caution with a multi-core setup: all the cores configured under the same container instance share the same JVM. This means you should be careful with the amount of data you store in those cores. Lucene, the indexing engine inside Solr, has really neat and fast (de)compression algorithms (in versions 4.x), so disk space will last longer, but the JVM heap is something to watch.

The benefits of cores, combined with the Solr admin UI, are things like:

  • core reload after schema / solrconfig changes;
  • core hot swap (if you have a live core serving queries, you can hot swap it with a new core holding the same data with some modifications);
  • core index optimization;
  • core renaming.
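The same operations can also be triggered over HTTP. The sketch below only builds the request URLs and sends nothing. It assumes Solr's default port and the default CoreAdmin path `/admin/cores` (this path is configurable in `solr.xml`); the core names used in the test calls would be your own.

```python
from urllib.parse import urlencode

# Assumed base URL; 8983 is Solr's default port, adjust for your setup.
SOLR_BASE = "http://localhost:8983/solr"


def core_admin_url(action, **params):
    """Build a CoreAdmin URL; parameters are sorted so the output is stable."""
    query = urlencode([("action", action)] + sorted(params.items()))
    return f"{SOLR_BASE}/admin/cores?{query}"


def reload_url(core):
    """Reload a core after schema / solrconfig changes."""
    return core_admin_url("RELOAD", core=core)


def swap_url(core, other):
    """Hot swap two cores, e.g. a live core with a freshly built one."""
    return core_admin_url("SWAP", core=core, other=other)


def rename_url(core, new_name):
    """Rename a core; CoreAdmin takes the new name in the 'other' parameter."""
    return core_admin_url("RENAME", core=core, other=new_name)


def optimize_url(core):
    """Optimize goes through the core's own update handler, not CoreAdmin."""
    return f"{SOLR_BASE}/{core}/update?optimize=true"
```

Any HTTP client can issue a GET against these URLs; the admin UI buttons do essentially the same thing.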