I struggle to understand the difference between collections and cores. If I understand it correctly, each core is a separate index, and a collection consists of cores, so essentially they share the same separation logic, i.e. separate cores and separate collections each have their own endpoints.
I have the following scenario: I am building the backend of a cloud service for several online shops. Each shop has a set of products, to which customers can add reviews. I want to index static data (product information) separately from dynamic data (reviews) so I can improve performance.
How can I best separate them in Solr?
As per my understanding:
In distributed search,
A collection is a logical index spread across multiple servers. A core is the part of a server that runs one piece of a collection.
In non-distributed search,
A single server running Solr can have multiple collections, and each of those collections is also a core. So a collection and a core are the same when search is not distributed.
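To make the endpoint point concrete for the shop scenario, here is a minimal sketch that only builds query URLs and does not contact a server. It assumes Solr is reachable at `http://localhost:8983/solr` and that two indexes named `products` and `reviews` exist; those names, the field names in the queries, and the helper `select_url` are made up for illustration. The `/select` path looks the same whether the name refers to a core (single server) or a collection (SolrCloud).

```python
from urllib.parse import urlencode


def select_url(base_url: str, index_name: str, query: str, rows: int = 10) -> str:
    """Build a /select URL (hypothetical helper).

    index_name is a core name on a single server,
    or a collection name in SolrCloud.
    """
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/{index_name}/select?{params}"


BASE = "http://localhost:8983/solr"  # assumed local Solr address

# Static product data and dynamic review data live behind separate endpoints.
products_url = select_url(BASE, "products", "name:lamp")
reviews_url = select_url(BASE, "reviews", "product_id:42")

print(products_url)
print(reviews_url)
```

Keeping the two kinds of data behind separate names means reviews can be reindexed often without touching the product index.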
Summary
Core
In Solr, a core is composed of a set of configuration files, Lucene index files, and Solr’s transaction log. A Solr core is a uniquely named, managed, and configured index running in a Solr server; a Solr server can host one or more cores. A core is typically used to separate documents that have different schemas.
Collection
Solr also uses the term collection, which only has meaning in the context of a Solr cluster in which a single index is distributed across multiple servers. SolrCloud introduces the concept of a collection, which extends the concept of a uniquely named, managed, and configured index to one that is split into shards and distributed across multiple servers.
Single instance
On a single instance, Solr has something called a SolrCore that is essentially a single index. If you want multiple indexes, you create multiple SolrCores.
Solr Cloud
With SolrCloud, a single index can span multiple Solr instances. This means that a single index can be made up of multiple SolrCores on different machines. We call all of these SolrCores that make up one logical index a collection.
A collection is essentially a single index that spans many SolrCores, both for index scaling and for redundancy. If you wanted to move your two-SolrCore Solr setup to SolrCloud, you would have two collections, each made up of multiple individual SolrCores.
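As a rough sketch of that migration, the snippet below builds (but does not send) two requests for the Collections API `CREATE` action, one per former core. It assumes the standard `name`, `numShards` and `replicationFactor` parameters and a Solr node at `http://localhost:8983/solr`; the collection names, the shard and replica counts, and the helper `create_collection_url` are illustrative assumptions.

```python
from urllib.parse import urlencode


def create_collection_url(base_url: str, name: str,
                          num_shards: int, replication_factor: int) -> str:
    """Build a Collections API CREATE URL (hypothetical helper)."""
    params = urlencode({
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,                  # index scaling
        "replicationFactor": replication_factor,  # redundancy
    })
    return f"{base_url.rstrip('/')}/admin/collections?{params}"


BASE = "http://localhost:8983/solr"  # assumed SolrCloud node address

# Two former cores become two collections.
create_urls = [
    create_collection_url(BASE, "products", 2, 2),
    create_collection_url(BASE, "reviews", 2, 2),
]

for url in create_urls:
    print(url)
```

Sending such a request (for example with `urllib.request.urlopen`) would normally also require a configset to be available in the cluster, which is left out here.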
This explains the use of cores and collections.
Single instance
When dealing with a single Solr instance, you query cores. The admin UI of a single Solr instance has no collection selector.
Solr Cloud
When dealing with Solr Cloud, you query collections. The collections are organized into different cores (replicas, shards) on different Solr instances. The admin UI of a Solr Cloud instance has both a collection selector and a core selector. In that core selector, however, the cores are technically individual instances.
From Solr Wiki:
From the SolrCloud Documentation
So basically a Collection (Logical group) has multiple cores (physical indexes).
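The logical/physical split can be pictured with a small model. This is only a sketch, not how Solr assigns replicas: the core naming pattern, the round-robin placement, and the node names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Core:
    """One physical index (a replica of one shard) on one node."""
    name: str
    shard: int
    node: str


def layout(collection: str, num_shards: int,
           replicas_per_shard: int, nodes: List[str]) -> List[Core]:
    """Spread the cores of one logical collection over nodes, round-robin."""
    cores: List[Core] = []
    for shard in range(1, num_shards + 1):
        for replica in range(1, replicas_per_shard + 1):
            node = nodes[len(cores) % len(nodes)]
            # Illustrative naming pattern, not necessarily Solr's own.
            name = f"{collection}_shard{shard}_replica{replica}"
            cores.append(Core(name, shard, node))
    return cores


# One collection "reviews": 2 shards x 2 replicas = 4 cores on 2 nodes.
cores = layout("reviews", 2, 2, ["node1", "node2"])
for core in cores:
    print(core.name, "on", core.node)
```

Queries address the collection name; which of these cores actually serves a request is decided by the cluster.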
Also, check the discussion