Can we have Cassandra-only nodes and Solr-enabled nodes in the same data center?

Posted 2019-04-12 03:15

Question:

I just started with Solr and would like your suggestions on the scenario below. We have 2 data centers with 3 nodes in each (the data centers are in different AWS regions for locality). We have been asked whether we can have 2 Solr nodes in each data center, so that each data center would have 2 Solr nodes and 1 Cassandra-only node. I want to understand whether this kind of setup is fine, and I am a little confused about whether the Solr nodes will hold data along with the indexes. Do all 6 nodes share the data, with the 4 Solr nodes holding indexes in addition to data? Kindly provide some information on this. Thanks.

Answer 1:

The short answer is no, this will not work. If you turn on DSE Search on one node in a DC, you need to turn it on for all the nodes in that DC.

But why?

DSE Search builds Lucene indexes on the data that is stored locally on a node. Say you have a 3-node DC with RF=1, so each node holds only a third of the data, and you turn on Search on just one of the nodes. Only that node's third of the data gets indexed, so search queries cannot see the rest and will fail or return incomplete results.
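To make the coverage argument concrete, here is a toy model in Python. It is not DSE code: the modulo "partitioner", the node numbering, and the function names are all assumptions made for illustration. It only counts how many rows end up on a node that builds an index.

```python
# Toy model, not DSE code: 3 nodes, RF=1, Search enabled on a subset of nodes.
NUM_NODES = 3


def owner(row_key: int) -> int:
    """Pretend partitioner: with RF=1 each row lives on exactly one node."""
    return row_key % NUM_NODES


def indexed_fraction(row_keys, search_nodes):
    """Fraction of rows stored on a node that builds a local search index."""
    keys = list(row_keys)
    indexed = [k for k in keys if owner(k) in search_nodes]
    return len(indexed) / len(keys)


# Hypothetical cluster: Search on node 0 only, versus Search on every node.
partial = indexed_fraction(range(300), {0})
full = indexed_fraction(range(300), {0, 1, 2})

print(f"Search on one node:  {partial:.2f} of rows indexed")
print(f"Search on all nodes: {full:.2f} of rows indexed")
```

With Search on a single node, the model indexes only the rows that node owns. Covering every row requires Search on every node that holds data in the DC.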

So I should just turn search on everywhere?

If you have relatively small workloads with loose SLAs (both C* and Search) and/or you are over-provisioned, you may be fine turning on Search on your main Cassandra nodes. However, in many cases with heavy C* workloads and tight SLAs, Search queries will negatively affect Cassandra performance, because both workloads contend for the same hardware.

I need search nodes in both physical DCs

If you want Search enabled on only two of the three nodes in a physical DC, the only way to do this is to split the physical DC into two logical DCs. In your case you would have:

US - Cassandra

US - Search

Singapore - Cassandra

Singapore - Search

This gives you geographic locality for both your Search and C* queries. It also isolates the two workloads from each other, because they run on separate nodes and no longer contend for the same hardware.
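Since Search indexes only the data stored locally on a node, the Search DCs need their own replicas of any keyspace you want to query. Below is a minimal Python sketch that builds a CREATE KEYSPACE statement using NetworkTopologyStrategy for the four logical DCs. The keyspace name, the DC names, and the replication factors are assumptions for illustration. Real DC names must match what your snitch reports, and the factors here simply mirror the 1 Cassandra node plus 2 Search nodes per region described in the question.

```python
def create_keyspace_cql(keyspace: str, replicas_per_dc: dict) -> str:
    """Build a CREATE KEYSPACE statement with one replication factor per logical DC."""
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += [f"'{dc}': {rf}" for dc, rf in replicas_per_dc.items()]
    options = ", ".join(parts)
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        "WITH replication = {" + options + "};"
    )


# Hypothetical logical DC names and replication factors (assumptions).
logical_dcs = {
    "US-Cassandra": 1,
    "US-Search": 2,
    "Singapore-Cassandra": 1,
    "Singapore-Search": 2,
}

stmt = create_keyspace_cql("app_data", logical_dcs)
print(stmt)
```

The generated statement can be run through cqlsh or a driver. Clients that only need C* reads and writes would then be pointed at the Cassandra DC in their region, and search clients at the Search DC.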