Kinesis agent not parsing the file

Posted 2019-07-27 04:32

Question:

I have the following in my agent.json:

{
  "cloudwatch.emitMetrics": true,
  "kinesis.endpoint": "",
  "firehose.endpoint": "",
  "flows": [
    {
      "filePattern": "/home/ec2-user/ETLdata/contracts/Delta.csv",
      "kinesisStream": "ETL-rawdata-stream",
      "partitionKeyOption": "RANDOM",
      "dataProcessingOptions": [
        {
          "optionName": "CSVTOJSON",
          "customFieldNames": ["field1", "field2"],
          "delimiter": ","
        }
      ]
    }
  ]
}

When I add the specified file to the folder, nothing happens, and I only see the output below in the logs. Why is the agent not parsing the file at all? Does anyone have any idea?

Update: It works when I change the file pattern to /tmp/delta.csv. It looks like a permission issue, but there are no errors in the logs.

Tailer Progress: Tailer has parsed 0 records (0 bytes), transformed 0 records, skipped 0 records, and has successfully sent 0 records to destination.
2017-06-22 18:12:03.671+0000 (Agent.MetricsEmitter RUNNING) com.amazon.kinesis.streaming.agent.Agent [INFO] Agent: Progress: 0 records parsed (0 bytes), and 0 records sent successfully to destinations. Uptime: 300020ms

Answer 1:

I had a similar issue and was able to solve it by doing the following:

  1. Move the data to be sent to the Kinesis Firehose stream (a bunch of CSV files) from ~/ec2-user/out-data to another directory:

    mv *.csv /tmp/out-data
    
  2. Edit the agent.json file so that the agent starts reading at the beginning of the file. Here is my agent.json file:

    {
      "cloudwatch.emitMetrics": true,
      "firehose.endpoint": "firehose.eu-west-1.amazonaws.com",
      "flows": [
        {
          "filePattern": "/tmp/out-data/trx_headers_2017*",
          "deliveryStream": "TestDeliveryStream",
          "initialPosition": "START_OF_FILE"
        }
      ]
    }
    

My guess is that your Delta.csv file is being written to, so the Kinesis agent checks the end of the file and finds no new records. If you add "initialPosition": "START_OF_FILE" to the flow, it will start parsing at the beginning of the file.
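
As a minimal sketch of that change applied to the configuration from the question, the following Python snippet adds "initialPosition" to every flow and prints the result. The helper name with_initial_position is hypothetical and not part of the Kinesis agent; the config literal is copied from the question, including its blank endpoints.

```python
import copy
import json


def with_initial_position(config, position="START_OF_FILE"):
    """Return a copy of an agent config with initialPosition set on every flow.

    Hypothetical helper for illustration; the input dict is left unchanged.
    """
    patched = copy.deepcopy(config)
    for flow in patched.get("flows", []):
        flow["initialPosition"] = position
    return patched


# Configuration as posted in the question.
question_config = json.loads("""
{
  "cloudwatch.emitMetrics": true,
  "kinesis.endpoint": "",
  "firehose.endpoint": "",
  "flows": [
    {
      "filePattern": "/home/ec2-user/ETLdata/contracts/Delta.csv",
      "kinesisStream": "ETL-rawdata-stream",
      "partitionKeyOption": "RANDOM",
      "dataProcessingOptions": [
        {
          "optionName": "CSVTOJSON",
          "customFieldNames": ["field1", "field2"],
          "delimiter": ","
        }
      ]
    }
  ]
}
""")

patched_config = with_initial_position(question_config)
print(json.dumps(patched_config, indent=2))
```

The printed JSON can be saved as the new agent.json. Note that this only changes where the agent starts reading; it does not help if the agent cannot open the file in the first place.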



Answer 2:

Moving your data to a directory such as /tmp/logs or /var/logs will fix the issue. Do not leave data under /home/ec2-user.

Link to the issue: https://github.com/awslabs/amazon-kinesis-agent/issues/58
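
To check whether permissions are the likely cause, a rough sketch like the one below lists the parts of a path that users other than the owner cannot read or traverse. It assumes the agent runs under a different user than the one owning the files, and it only looks at the "other" permission bits, ignoring group membership and ACLs. The function name unreadable_for_others is made up for this illustration.

```python
import os
import stat


def unreadable_for_others(path):
    """Return the parts of `path` that lack the needed "other" permission bits.

    The file itself needs the "other" read bit; every parent directory
    needs the "other" execute (search) bit so it can be traversed.
    Simplification: group permissions and ACLs are not considered.
    """
    path = os.path.abspath(path)
    problems = []

    # The target file must be readable by other users.
    if not os.stat(path).st_mode & stat.S_IROTH:
        problems.append(path)

    # Walk up to the filesystem root, checking each directory on the way.
    parent = os.path.dirname(path)
    while True:
        if not os.stat(parent).st_mode & stat.S_IXOTH:
            problems.append(parent)
        next_parent = os.path.dirname(parent)
        if next_parent == parent:  # reached the root
            break
        parent = next_parent

    return problems
```

Calling unreadable_for_others("/home/ec2-user/ETLdata/contracts/Delta.csv") on the instance would show which directory, if any, blocks access. An empty list suggests the cause lies elsewhere.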