I am new to Elasticsearch. I have an Elasticsearch index of about 300,000 items. For each of the 60 million records in another table, I need to run a complex query against this index.
Right now it is extremely slow: 1000 queries take about 200 seconds. I need advice on how to configure my Elasticsearch cluster to handle a large volume of queries.
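To put those numbers in perspective, here is a rough back-of-the-envelope calculation using only the figures stated above (1000 queries in 200 seconds, 60 million records). The variable names are only for illustration, and it assumes throughput stays constant over the whole run.

```ruby
# Figures taken from the question; constant throughput is an assumption.
queries       = 1_000
elapsed_s     = 200.0
total_records = 60_000_000

throughput = queries / elapsed_s          # queries per second
total_s    = total_records / throughput   # seconds for all records
total_days = total_s / 86_400.0           # 86,400 seconds per day

puts "throughput: #{throughput} queries/s"          # 5.0 queries/s
puts "full run:   #{total_days.round(1)} days"      # 138.9 days
```

At the current rate the full job would take on the order of months, which is why throughput matters here more than the latency of any single query.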
My server:
8 cores
8 GB RAM
SSD storage
I want to configure Elasticsearch to handle 1000 concurrent search requests from Ruby (that is, I want to search for 1000 items in parallel).
I first tried the default configuration.
I think that by default Elasticsearch can only handle about 10-20 concurrent search requests. It uses very little CPU and RAM during the test, so I believe there is room for improvement.
I could only run 100 threads from Ruby to search for 1000 items, and that takes about 200 seconds. If I increase this to 1000 threads, ES returns a timeout error.
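A minimal sketch of the kind of client-side fan-out described above, assuming plain Ruby threads pulling work from a shared queue. The helper name `run_parallel` and the stubbed block are hypothetical and are not taken from the actual test code; a real Elasticsearch search call would replace the block body.

```ruby
# Run the given block once per item using a fixed number of worker threads.
# Results are returned in the same order as the input items.
def run_parallel(items, thread_count, &block)
  queue = Queue.new
  items.each_with_index { |item, index| queue << [item, index] }
  results = Array.new(items.size)

  workers = Array.new(thread_count) do
    Thread.new do
      loop do
        begin
          # Non-blocking pop raises ThreadError once the queue is empty.
          item, index = queue.pop(true)
        rescue ThreadError
          break
        end
        results[index] = block.call(item)
      end
    end
  end

  workers.each(&:join)
  results
end

# Usage with a stub in place of a real search request (hypothetical).
hits = run_parallel(%w[a b c], 2) { |term| "result for #{term}" }
```

With a bounded pool like this, the number of in-flight requests never exceeds `thread_count`, which makes it easier to see whether the client or the server is the limiting factor.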
I then ran a single master node with the following settings:
ES_HEAP_SIZE=2G
indices.fielddata.cache.size: 1g
threadpool:
  search:
    type: fixed
    size: 200
    queue_size: 400
shards: 5
replicas: 1
With these settings, running 100 threads from Ruby to search for 1000 items still takes 200 seconds.
I then added 3 more nodes as data nodes on the same server.
Running 100 threads from Ruby to search for 1000 items still takes 200 seconds or more.
I searched online and read several posts. People say that creating more shards makes searches slower.
How can I improve my search query?
Many thanks!