Solr Collection vs Cores

Posted 2019-01-30 18:09

I'm struggling to understand the difference between collections and cores. If I understand correctly, cores are separate indexes. A collection consists of cores, so essentially they share the same logic of separation, i.e. separate cores and separate collections each have their own endpoints.

I have the following scenario. I am building a backend for a cloud service for several online shops. Each shop has a set of products, to which customers can add reviews. I want to index the static data (product information) separately from the dynamic data (reviews) so that I can improve performance.

How can I best separate them in Solr?

Tags: solr lucene
6 answers
兄弟一词,经得起流年.
Reply #2 · 2019-01-30 18:45

As per my understanding:

In distributed search,

A collection is a logical index spread across multiple servers. A core is the part of a server that runs one piece of a collection.

In non-distributed search,

A single server running Solr can have multiple collections, and each of those collections is also a core. So a collection and a core are the same thing if search is not distributed.

Summary

  1. The part of a collection that lives on one server is called a core.
  2. A collection is the same as an index.
  3. One Solr server can have many cores.
  4. A collection is a logical index. (Example use of multiple collections: say two teams in the same group are not big enough to justify a full Solr server of their own, but they also do not want to mix their data in a single index. They can create separate collections/indexes, which keeps their data separate.)
  5. It's better to use a separate SolrCloud cluster rather than create more collections if the data for a collection is big enough (not sure, comments please?)
在下西门庆
Reply #3 · 2019-01-30 18:47

Core

In Solr, a core is composed of a set of configuration files, Lucene index files, and Solr’s transaction log.

A Solr core is a uniquely named, managed, and configured index running in a Solr server; a Solr server can host one or more cores. A core is typically used to separate documents that have different schemas.

Collection

Solr also uses the term collection, which only has meaning in the context of a Solr cluster in which a single index is distributed across multiple servers.

SolrCloud introduces the concept of a collection, which extends the concept of a uniquely named, managed, and configured index to one that is split into shards and distributed across multiple servers.
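
As a rough sketch of what "split into shards and distributed" looks like in practice, this is how a request to the Collections API could be assembled. The host, the collection name "reviews", and the shard/replica counts are assumptions for illustration only; depending on the Solr version, further parameters (such as a configset name) may also be needed.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor):
    """Build a Collections API CREATE request URL (the request is not sent)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,                  # how many pieces the index is split into
        "replicationFactor": replication_factor,  # copies kept of each piece
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hypothetical local SolrCloud node and collection name.
url = create_collection_url("http://localhost:8983/solr", "reviews", 2, 2)
print(url)
```

The function only builds the URL; sending it to a node running in SolrCloud mode is what would actually create the collection.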

Juvenile、少年°
Reply #4 · 2019-01-30 18:50

Single instance

On a single instance, Solr has something called a SolrCore that is essentially a single index. If you want multiple indexes, you create multiple SolrCores.

Solr Cloud

With SolrCloud, a single index can span multiple Solr instances. This means that a single index can be made up of multiple SolrCores on different machines. We call all of the SolrCores that make up one logical index a collection.

A collection is essentially a single index that spans many SolrCores, both for index scaling and for redundancy. If you wanted to move your 2-SolrCore Solr setup to SolrCloud, you would have 2 collections, each made up of multiple individual SolrCores.
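
To make the arithmetic concrete, here is a small sketch of how many SolrCores such a setup ends up with, assuming every shard of a collection has the same number of replicas and each replica is one core. The collection names and the shard/replica numbers are made up for the example.

```python
def cores_in_collection(num_shards: int, replicas_per_shard: int) -> int:
    """Each replica of each shard is one SolrCore."""
    return num_shards * replicas_per_shard


# Hypothetical layout: collection name -> (shards, replicas per shard)
layout = {
    "products": (2, 2),
    "reviews": (4, 2),
}

per_collection = {
    name: cores_in_collection(shards, replicas)
    for name, (shards, replicas) in layout.items()
}
total_cores = sum(per_collection.values())

print(per_collection)
print(total_cores)
```

So two collections here still means two logical indexes, but twelve physical SolrCores spread over the cluster.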

劫难
Reply #5 · 2019-01-30 18:53

This explains the use of cores and collections.

Single instance

When dealing with a single Solr instance, you query cores.

The admin UI of a single Solr instance has no collection selector:

[Screenshot: Single Solr Instance]

Solr Cloud

When dealing with SolrCloud, you query collections. The collections are organized into different cores (replicas, shards) on different Solr instances.

The admin UI of a SolrCloud instance has both a collection selector and a core selector, although here the cores are technically instances:

[Screenshot: Solr Cloud instance]
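
In both modes the query goes to a named endpoint, so the asker's products/reviews split could look roughly like this from the client side. This is only a sketch: the host, the names "products" and "reviews", and the field names in the queries are assumptions, not something taken from a real setup.

```python
from urllib.parse import urlencode


def select_url(base_url: str, index_name: str, query: str) -> str:
    """Build a /select URL for one core (single instance) or collection (SolrCloud)."""
    return f"{base_url}/{index_name}/select?{urlencode({'q': query})}"


SOLR = "http://localhost:8983/solr"  # assumed local address

# Hypothetical names: one index for static product data, one for reviews.
products_url = select_url(SOLR, "products", "name:lamp")
reviews_url = select_url(SOLR, "reviews", "product_id:42")

print(products_url)
print(reviews_url)
```

The client code is the same either way; what differs is whether the name resolves to one core on that server or to a collection spread over several.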

chillily
Reply #6 · 2019-01-30 19:06

From Solr Wiki:

Collections are made up of one or more shards. Shards have one or more replicas. Each replica is a core. A single collection represents a single logical index.

地球回转人心会变
Reply #7 · 2019-01-30 19:11

From the SolrCloud documentation:

Collection: A single search index.

Shard: A logical section of a single collection (also called a Slice). Sometimes people will talk about a "Shard" in a physical sense (a manifestation of a logical shard).

Replica: A physical manifestation of a logical Shard, implemented as a single Lucene index on a SolrCore.

Leader: One Replica of every Shard will be designated as a Leader to coordinate indexing for that Shard.

SolrCore: Encapsulates a single physical index. One or more make up logical shards (or slices) which make up a collection.

Node: A single instance of Solr. A single Solr instance can have multiple SolrCores that can be part of any number of collections.

Cluster: All of the nodes you are using to host SolrCores.

So, basically, a collection (a logical group) is made up of multiple cores (physical indexes).
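
The glossary above can be sketched as a small data model. This is not Solr code, just plain Python classes mirroring the terms; the node names and core names are invented for illustration, and real core names depend on the Solr version.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Replica:
    core_name: str           # the SolrCore that physically holds this replica
    node: str                # the Solr instance hosting that core
    is_leader: bool = False


@dataclass
class Shard:
    name: str
    replicas: List[Replica] = field(default_factory=list)

    def leader(self) -> Replica:
        """Return the replica designated as leader for this shard."""
        return next(r for r in self.replicas if r.is_leader)


@dataclass
class Collection:
    name: str
    shards: List[Shard] = field(default_factory=list)

    def cores_on(self, node: str) -> List[str]:
        """List the cores of this collection hosted on one node."""
        return [r.core_name for s in self.shards for r in s.replicas if r.node == node]


# Hypothetical two-shard, two-replica collection on two nodes.
reviews = Collection("reviews", [
    Shard("shard1", [Replica("reviews_shard1_replica1", "node-a", True),
                     Replica("reviews_shard1_replica2", "node-b")]),
    Shard("shard2", [Replica("reviews_shard2_replica1", "node-b", True),
                     Replica("reviews_shard2_replica2", "node-a")]),
])

print(reviews.cores_on("node-a"))
print(reviews.shards[1].leader().node)
```

Each node ends up hosting one core from each shard, which is the "one logical index, many physical cores" picture the definitions describe.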

Also, check the discussion
