I am working on a NiFi data flow where my use case is to fetch MySQL table data and put it into HDFS or the local file system.
I have built a pipeline: QueryDatabaseTable processor --> ConvertRecord --> PutFile processor.
My table schema: id, name, city, Created_date
I am able to receive files in the destination even when I insert new records into the table.
But when I update existing rows, the processor does not fetch those records; it looks like it has some limitation.
My question is: how do I handle this scenario, either with another processor or by updating some property?
QueryDatabaseTable Processor needs to be informed which columns it can use to identify new data.
A serial `id` or a `created` timestamp is not sufficient. From the documentation:

Maximum-value Columns: A comma-separated list of column names. The processor will keep track of the maximum value for each column that has been returned since the processor started running.
Judging by the table schema, there is no SQL way of telling whether data was updated.
There are many ways to solve this. In your case, the easiest thing to do might be to rename the column `created` to `modified` and set it to now() on updates, or to work with a second timestamp column. So for instance:
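The original example appears to have been lost in formatting; here is a minimal sketch of the second-column approach in MySQL, using a hypothetical `users` table with the asker's columns, where `stamp_updated` is maintained automatically:

```sql
CREATE TABLE users (
    id            INT AUTO_INCREMENT PRIMARY KEY,
    name          VARCHAR(100),
    city          VARCHAR(100),
    -- set once, when the row is inserted
    stamp_created TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    -- MySQL bumps this automatically on every UPDATE,
    -- so changed rows become visible to QueryDatabaseTable
    stamp_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP
                  ON UPDATE CURRENT_TIMESTAMP
);
```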
`stamp_updated` is the new column added. In the processor you use the `stamp_updated` column to identify new data. Don't forget to set Maximum-value Columns to those columns.

So what I am basically saying is: let the database record when each row was last modified, and point QueryDatabaseTable's Maximum-value Columns property at that timestamp column, so that an update produces a new maximum value and the row is fetched again.
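To illustrate with the sketch above: an ordinary UPDATE now advances `stamp_updated`, so on its next run the processor sees a value above its stored maximum and re-fetches the row.

```sql
-- The ON UPDATE CURRENT_TIMESTAMP clause bumps stamp_updated automatically;
-- QueryDatabaseTable (with Maximum-value Columns = stamp_updated) picks this
-- row up again because its stamp_updated now exceeds the stored maximum.
UPDATE users SET city = 'Pune' WHERE id = 1;
```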