I have a lot of data in my PostgreSQL database (on a remote server). It covers the past year, and I now want to push it to Elasticsearch.
The data has a time field in this format: 2016-09-07 19:26:36.817039+00.
I want this to become the time field (@timestamp) in Elasticsearch, so that I can view it in Kibana and build visualizations over the last year.
I need help with pushing all of this data efficiently; I can't work out how to get it all out of Postgres.
I know we can ingest data via the JDBC plugin, but I think I cannot create my @timestamp field with that.
I also know about ZomboDB, but I'm not sure whether it lets me supply my own time field.
The data is also bulky, so I'm looking for an efficient solution.
Any suggestions on how to do this are welcome.
> I know we can ingest data via the JDBC plugin, but I think I cannot create my @timestamp field with that.
This should be doable with Logstash. A good starting point is probably this blog post. And remember that a Logstash pipeline always consists of 3 parts:
- Input: the JDBC input. If you only need to import once, skip the schedule setting; otherwise set the right timing in cron syntax.
- Filter: This one is not part of the blog post. You will need to use the date filter to set the right @timestamp value; an example is added at the end.
- Output: This is simply the Elasticsearch output (a sketch of the input and output parts follows right after this list).
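To make those parts concrete, here is a minimal sketch of the input and output sections. The connection string, credentials, driver path, table name, and index name are placeholders you will need to adapt to your setup:
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://your-host:5432/your_db"   # placeholder host and database
    jdbc_user => "your_user"
    jdbc_password => "your_password"
    jdbc_driver_library => "/path/to/postgresql-jdbc.jar"                  # PostgreSQL JDBC driver jar
    jdbc_driver_class => "org.postgresql.Driver"
    statement => "SELECT * FROM your_table"                                # one-off import, so no schedule
  }
}
# filter { date { ... } }   # see the date filter example below
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "your_index"
  }
}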
This will depend on the format and field name of the timestamp value in PostgreSQL, but the filter part should look something like this:
date {
  match => ["your_date_field", "dd-MM-yyyy HH:mm:ss"]   # note: MM = month, mm = minutes
  remove_field => "your_date_field"                     # Remove the now redundant field, since we're storing it in @timestamp (the default target of date)
}
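If you want to check that the pattern parses your values correctly before running the full import, you can feed a single hand-written event through a throwaway pipeline. your_date_field and the pattern are the same placeholders as above:
# test.conf – verify the date filter on one sample event
input {
  stdin { codec => json }
}
filter {
  date {
    match => ["your_date_field", "dd-MM-yyyy HH:mm:ss"]
    remove_field => "your_date_field"
  }
}
output {
  stdout { codec => rubydebug }   # prints the parsed event, including @timestamp
}
Then run something like echo '{"your_date_field":"07-09-2016 19:26:36"}' | bin/logstash -f test.conf and check the @timestamp value in the output.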
If you're concerned about performance:
- You will need to set the right jdbc_fetch_size (see the snippet after this list).
- The Elasticsearch output is batched by default.
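On the input side, that could look like the following; the numbers are only examples and should be tuned to your row size and available memory:
input {
  jdbc {
    # ... connection settings as in the sketch above ...
    jdbc_fetch_size => 10000        # rows fetched from PostgreSQL per round trip
    jdbc_paging_enabled => true     # run the statement in pages instead of one huge result set
    jdbc_page_size => 100000        # rows per page
  }
}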