Import CSV into Elasticsearch

Posted 2019-02-02 13:03

Question:

I'm following the "Elasticsearch getting started" tutorial. Unfortunately, the tutorial doesn't cover the first step, which is importing a CSV database into Elasticsearch.

I googled for a solution, but unfortunately it doesn't work. Here is what I want to achieve and what I have:

I have a file with the data I want to import (simplified):

id,title
10,Homer's Night Out
12,Krusty Gets Busted

I would like to import it using Logstash. After researching on the internet, I ended up with the following config:

input {
    file {
        # note: Logstash's file input expects an absolute path
        path => ["simpsons_episodes.csv"]
        # read the file from the top instead of tailing for new lines
        start_position => "beginning"
    }
}

filter {
    csv {
        # parse each line as comma-separated values into named fields
        columns => [
            "id",
            "title"
        ]
    }
}

output {
    # print every event to the console for debugging
    stdout { codec => rubydebug }
    elasticsearch {
        action => "index"
        hosts => ["127.0.0.1:9200"]
        index => "simpsons"
        document_type => "episode"
        workers => 1
    }
}
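
To run the import, this config can be passed to Logstash with the -f flag (the file name csv-import.conf here is just a placeholder):

bin/logstash -f csv-import.conf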

My trouble is with specifying the document type and ID: once the data is imported, I expect to be able to navigate to http://localhost:9200/simpsons/episode/10 and see the result for episode 10.

Answer 1:

Good job, you're almost there; you're only missing the document ID. You need to modify your elasticsearch output like this:

elasticsearch {
    action => "index"
    hosts => ["127.0.0.1:9200"]
    index => "simpsons"
    document_type => "episode"
    document_id => "%{id}"             # <---- add this line
    workers => 1
}

After this, you'll be able to query the episode with id 10:

GET http://localhost:9200/simpsons/episode/10
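
or, equivalently, with curl:

curl -XGET 'http://localhost:9200/simpsons/episode/10'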


Answer 2:

I'm the author of moshe/elasticsearch_loader. I wrote ESL for this exact problem. You can install it with pip:

pip install elasticsearch-loader

Then you will be able to load CSV files into Elasticsearch by issuing:

elasticsearch_loader --index incidents --type incident csv file1.csv

Additionally, you can use a custom ID field by adding --id-field=document_id to the command line.
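
For example, applied to the CSV from the question, a command along these lines (assuming the flags combine as shown above) would index each row under its id column:

elasticsearch_loader --index simpsons --type episode --id-field=id csv simpsons_episodes.csv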