SolrCloud multicore configuration

Posted 2019-05-17 18:49

Question:

I have a standalone Solr instance with 4 different cores working fine using the embedded Jetty server. I configured the cores for v4.10.3, but I have since moved to v5.1 and everything seems to work fine without any changes.

Before going into production, I need to set it up as a SolrCloud installation, initially with 2 nodes (two different machines) and 1 shard per node (to keep it simple). I have been trying to get it to work, but so far without success.

I tried to run it like this (I think using start.jar is not the preferred way), having read that Solr will look for multiple configured cores in any nested folders (which works for standalone Solr):

java -DzkRun -DnumShards=2 -Dbootstrap_confdir=solr/ -jar start.jar

but that did not work: it does not find the required solrconfig.xml file.

My Solr directory looks like this:

My solr.xml file is the standard one:

<solr>

  <solrcloud>
    <str name="host">${host:}</str>
    <int name="hostPort">${jetty.port:8983}</int>
    <str name="hostContext">${hostContext:solr}</str>
    <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
    <bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
  </solrcloud>

  <shardHandlerFactory name="shardHandlerFactory"
    class="HttpShardHandlerFactory">
    <int name="socketTimeout">${socketTimeout:0}</int>
    <int name="connTimeout">${connTimeout:0}</int>
  </shardHandlerFactory>

</solr>

Each core looks like this:

And the core.properties just has the name of the core:

name=users

My question is:

  • How do I start SolrCloud v5.1 so the 4 cores are picked up?

Answer 1:

In SolrCloud, each of your cores will become a collection.

Each Collection will have its own set of Config Files and data.

You might find this helpful: Moving multi-core SOLR instance to cloud

Solr 5.0 and later changed how you create a SolrCloud setup with shards, how you add collections, and so on.

Everything listed below is my understanding of the Solr Reference Guide. I highly recommend reading it thoroughly: https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide

I set up my servers on Linux (CentOS), but the same steps can be used to set up Solr on a Windows system. For example, Windows has a solr.cmd file in place of the bin/solr script.

Here are the steps I followed to create a simple two-shard SolrCloud setup.

  • Set up the ZooKeeper ensemble. I am assuming you are trying to use the embedded ZK in Solr. For a production system, an external ZK ensemble is highly recommended. You can find the steps to install an external ensemble in this section of the reference guide.

  • Download Solr to the /opt folder.

  • Extract ONLY the install script.

    tar xzf solr-5.0.0.tgz solr-5.0.0/bin/install_solr_service.sh --strip-components=2

  • This command will install Solr on your system:

    sudo bash ./install_solr_service.sh solr-5.0.0.tgz

  • The above command will create a new user called "solr" if it does not exist.

  • These are some of the defaults it will assume. You can view them in /var/solr/solr.in.sh. This is the include file where you can specify other options.

        * SOLR_PID_DIR=/var/solr
        * SOLR_HOME=/var/solr/data
        * LOG4J_PROPS=/var/solr/log4j.properties
        * SOLR_LOGS_DIR=/var/solr/logs
        * SOLR_PORT=8983
    
  • Running install_solr_service.sh in the above step also starts a Solr server. Stop the server using service solr stop before making any of the changes below.

  • Change Java heap value

    SOLR_HEAP="3g"

    (Optional.) This sets both Xmx and Xms to 3 GB. This variable is not mentioned in the solr.in.sh file in Solr 5.1. That is a bug which has been fixed, and the fix will be released in the next version.

  • SOLR_MODE="solrcloud" Required

    This is what you need to start Solr in cloud mode.

  • ZK_HOST=ZK1:2181,ZK2:2181,ZK3:2181 Required

    (Replace ZK1, ZK2, and ZK3 with your ZooKeeper host names.)

  • Running the install_solr_service.sh command also creates an init.d script at /etc/init.d/solr

  • This init.d script in turn calls the /opt/solr/bin/solr script and includes all the variables from /var/solr/solr.in.sh

  • Once you have made the above changes, start solr again using service solr start

  • You can check the status using service solr status
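Taken together, the edits described above might leave a fragment like the following in /var/solr/solr.in.sh. This is only a sketch: the heap size is the example value from this answer, and the ZooKeeper host names are placeholders to be replaced with your own.

```shell
# Fragment of /var/solr/solr.in.sh (sketch; values are examples, not defaults).

# Optional: sets both -Xms and -Xmx for the Solr JVM.
SOLR_HEAP="3g"

# Required: start Solr in cloud mode.
SOLR_MODE="solrcloud"

# Required: the ZooKeeper ensemble. Host names here are placeholders.
ZK_HOST="zk1:2181,zk2:2181,zk3:2181"
```

After saving the file, service solr start followed by service solr status should show whether the node came up in cloud mode.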

Creating Collections, Shards, and Replicas: all shard-, collection-, and replica-related commands now go through the Collections API.

  • Before creating a collection, a config folder must be uploaded to ZK. This can be done using the zkcli.sh script that ships in the Solr folder (not the one on the ZooKeeper servers). Folder: /opt/solr/server/scripts/cloud-scripts

  • The command to upload the config folder is

sh zkcli.sh -cmd upconfig -zkhost zk1:2181,zk2:2181,zk3:2181 -confname yourconfigname -confdir /var/solr/configs/conf

You will run this command four times, once for each of your 4 cores, each time changing the path of the conf folder and the config name.

  • This uploads all the config files in the conf folder to ZooKeeper under the name 'yourconfigname'.
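The four uploads could be scripted with a small loop along these lines. This is a sketch under assumptions that are not in the original post: the core names other than users are placeholders, each core's conf folder is assumed to sit under /var/solr/data/<core>/conf, and the <core>_conf naming scheme is made up for illustration. By default it only prints the commands (dry run); set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Sketch: upload one config set per core to ZooKeeper with zkcli.sh.
# Assumed (hypothetical) values: core names, conf folder layout, config names.
ZK_HOSTS="zk1:2181,zk2:2181,zk3:2181"
SOLR_HOME="/var/solr/data"
ZKCLI="/opt/solr/server/scripts/cloud-scripts/zkcli.sh"
CORES="users core2 core3 core4"   # only "users" appears in the question
DRY_RUN="${DRY_RUN:-1}"           # 1 = print the command, 0 = run it

# Build the upconfig command for one core, then print or execute it.
upconfig() {
  core="$1"
  set -- sh "$ZKCLI" -cmd upconfig -zkhost "$ZK_HOSTS" \
    -confname "${core}_conf" -confdir "$SOLR_HOME/$core/conf"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

for c in $CORES; do
  upconfig "$c"
done
```

Printing the commands first makes it easy to check the paths and config names before anything is written to ZooKeeper.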

Creating a collection: I used the following request to create a new collection.

http://1.1.1.1:8983/solr/admin/collections?action=CREATE&name=yourcollectionname&numShards=2&replicationFactor=1&maxShardsPerNode=1&createNodeSet=1.1.1.1:8983_solr,2.2.2.2:8983_solr&collection.configName=yourconfigname
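Since each of the four cores becomes its own collection, the same CREATE request has to be issued once per collection. A sketch of doing that from the command line with curl follows. The collection names other than users are placeholders, and the script assumes each config set was uploaded under a hypothetical name of the form <name>_conf. The node addresses are the example ones from the URL above. By default the script only prints the curl commands; set DRY_RUN=0 to send the requests.

```shell
#!/bin/sh
# Sketch: create one collection per former core via the Collections API.
# Assumed (hypothetical) values: collection names and the "<name>_conf" config names.
SOLR_NODE="1.1.1.1:8983"
NODE_SET="1.1.1.1:8983_solr,2.2.2.2:8983_solr"
COLLECTIONS="users core2 core3 core4"   # only "users" appears in the question
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print the command, 0 = send it

# Print the CREATE URL for one collection (2 shards, 1 replica each).
create_url() {
  printf 'http://%s/solr/admin/collections?action=CREATE&name=%s&numShards=2&replicationFactor=1&maxShardsPerNode=1&createNodeSet=%s&collection.configName=%s_conf\n' \
    "$SOLR_NODE" "$1" "$NODE_SET" "$1"
}

for name in $COLLECTIONS; do
  url="$(create_url "$name")"
  if [ "$DRY_RUN" = "1" ]; then
    echo "curl -s '$url'"
  else
    curl -s "$url"
  fi
done
```

The URL must be quoted when passed to curl, otherwise the shell treats each & as a background operator.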

Happy Searching!



Answer 2:

SolrCloud does not use the configuration files stored in a core's conf directory. To make your cores visible in the SolrCloud structure, you need to upload the configuration files to ZooKeeper and let it manage the files for you. Every time a Solr instance comes up, it gets the configuration files stored in ZooKeeper. This way your cores don't need a conf directory to work. To upload your core configuration files to ZooKeeper, follow the link below and take a look at "Upload a configuration directory":

https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities