In a production environment, should Solr be set up on every available server, including the ones running ZooKeeper? I am talking about an external ZooKeeper ensemble.
Total Servers : 5
Case 1:
Solr on all 5 servers. Zookeeper on 3 servers.
Case 2:
Solr on 2 servers. Zookeeper on 3 servers.
Case 3:
Solr on 5 servers. Zookeeper on 5 servers.
What is the best practice? What are the advantages of one case over another? I have read that it's better to run ZooKeeper on a separate server.
A ZooKeeper ensemble should always have an odd number of instances (2n+1), so that it can lose n of them and still keep a majority. In your case you can go up to 5, since you have 5 servers, i.e. Solr on all 5 servers with ZooKeeper running on the same 5. But the actual sizing can only be determined from the index size, the query complexity, the approximate number of queries per minute, and the response time you are willing to accept.
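To make the 2n+1 rule concrete, here is a minimal sketch of the majority arithmetic for a few ensemble sizes. The function names are my own, purely for illustration; nothing here comes from the ZooKeeper or Solr APIs.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many ZooKeeper nodes can be down while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # 3 and 5 are the sizes from the question; 4 is included to show
    # that an even-sized ensemble tolerates no more failures than the
    # odd size just below it.
    for n in (3, 4, 5):
        print(f"{n} ZK nodes: quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

So in terms of availability, 3 ZooKeeper nodes (Case 1 and Case 2) survive one ZooKeeper failure, and 5 nodes (Case 3) survive two.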
It is very common to run both Solr and ZooKeeper on the same node. ZK doesn't really require much in the way of resources.
If the ZK database and the Solr installation are on separate physical disks, ZK performance would be better. But unless the cloud is enormous, even that shouldn't really be necessary. With five machines, it is unlikely to be an enormous cloud. You're not planning on hundreds or thousands of collections, are you?
For REALLY optimal operation, ZK would be running on separate machines, but I personally would not do it that way unless I had a trio of really small servers that weren't needed for something else.
The smallest possible SolrCloud installation with high availability would be three machines, one of which is much smaller than the others. The two large machines would run both Solr and ZK (as separate processes), the third would run ZK only. If the third machine is the same as the others, it could run both as well.
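As a rough sketch of that three-machine layout, the snippet below describes which role runs where and builds the comma-separated ZooKeeper connection string (with an optional chroot suffix) that the Solr nodes would be pointed at. The hostnames, the `/solr` chroot, and the helper name are assumptions for illustration; 2181 is ZooKeeper's usual client port.

```python
# Hypothetical hostnames: two large machines and one small one.
layout = {
    "big1": {"solr", "zookeeper"},
    "big2": {"solr", "zookeeper"},
    "small1": {"zookeeper"},  # ZK only
}


def zk_connect_string(layout, port=2181, chroot="/solr"):
    """Build a host:port,host:port/chroot string from the ZK hosts."""
    zk_hosts = sorted(h for h, roles in layout.items() if "zookeeper" in roles)
    return ",".join(f"{h}:{port}" for h in zk_hosts) + chroot


zk_host = zk_connect_string(layout)
solr_nodes = sorted(h for h, roles in layout.items() if "solr" in roles)

if __name__ == "__main__":
    print("ZK connection string:", zk_host)
    print("Solr nodes:", solr_nodes)
```

Every Solr node gets the same connection string listing all three ZooKeeper hosts, whether or not it runs ZK itself.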
I would be more concerned about the total number of Solr servers that I need to support my search requirements than about whether to run ZK separately.