I am trying to figure out why Elasticsearch is so slow at indexing. I am unsure whether it is a limitation of Elasticsearch itself, but I will share what I have so far.
I have a single Elasticsearch node and a Logstash instance running on one box. My documents have about 15 fields, and I have an Elasticsearch mapping set up with the correct types (although I have tried without the mapping and get pretty much identical results).
I am indexing roughly 8 to 10 million events at a time and have taken the following approaches.
The bulk API, with the following format (I converted the CSV to JSON and placed it into a file, which I send in with curl; the request itself is sketched after the example):
{"create" : {}}
{"field1" : "value1", "field2" : "value2 .... }
{"create" : {}}
{"field1" : "value1", "field2" : "value2 .... }
{"create" : {}}
{"field1" : "value1", "field2" : "value2 .... }
I have also tried Logstash, both with a TCP input receiving the original CSV and with a file input, where I cat the CSV onto the end of a file Logstash is watching (a minimal config sketch is below).
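The file-input pipeline looks something like this (a minimal sketch assuming Logstash 2.x or newer, where the elasticsearch output takes hosts; the path, column names, and index name are all placeholders):

    input {
      file {
        path => "/data/events.csv"
        start_position => "beginning"
      }
    }
    filter {
      # Split each CSV line into named fields
      csv {
        columns => ["field1", "field2"]
      }
    }
    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "myindex"
      }
    }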
All three of these methods seem to top out around 10,000 events per second, which is very slow.
Am I doing something wrong? Should I be explicitly assigning an id in my bulk ingest rather than letting Elasticsearch auto-generate one?
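To be concrete, explicitly assigning an id would mean putting it in the action line of each pair (the id value here is just a placeholder):

    {"create" : {"_id" : "event-0001"}}
    {"field1" : "value1", "field2" : "value2"}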
When ingesting through the bulk API, I have also split the events up into 50,000- and 100,000-event files and ingested each file separately.
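Roughly like this (a sketch; split counts are in lines, and each event is two lines, so 100,000 lines is 50,000 events):

    # Split the bulk file into 50,000-event chunks (2 lines per event)
    split -l 100000 events.json chunk_
    # POST each chunk to the bulk endpoint
    for f in chunk_*; do
      curl -s -H "Content-Type: application/x-ndjson" \
        -XPOST "localhost:9200/myindex/_bulk" --data-binary "@$f"
    done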
I recommend this blog. Adjusting the following parameters should help during bulk indexing, but once you are done, restore refresh_interval to its normal value (e.g. 1s).
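As a sketch (the index name and the restored values are placeholders; adjust to your own defaults), the usual pattern is to disable refresh and drop replicas for the duration of the load, then restore them afterwards:

    # Before the bulk load: disable refresh and drop replicas
    curl -XPUT "localhost:9200/myindex/_settings" \
      -H "Content-Type: application/json" \
      -d '{"index" : {"refresh_interval" : "-1", "number_of_replicas" : 0}}'

    # After the bulk load: restore the defaults
    curl -XPUT "localhost:9200/myindex/_settings" \
      -H "Content-Type: application/json" \
      -d '{"index" : {"refresh_interval" : "1s", "number_of_replicas" : 1}}'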
You'll find I've done some research on this here; you can download the Indexing Scripts file, which has some useful scripts to maximise indexing performance. It really does vary with the hardware and with how Elasticsearch is tuned for indexing, e.g. removing replicas during the load.
Hope this helps you somewhat.