I was following step #7 (Use Kafka Connect to import/export data) at this link:
http://kafka.apache.org/documentation.html#quickstart
It was working well until I deleted the 'test.txt' file. Mainly because that's how log4j files would work. After certain time, the file will get rotated - I mean - it will be renamed & a new file with the same name will start getting written to.
But after, I deleted 'test.txt', the connector stopped working. I restarted connector, broker, zookeeper etc, but the new lines from 'test.txt' are not going to the 'connect-test' topic & therefore are not going to the 'test.sink.txt' file.
How can I fix this?
The connector keeps tabs of its "last location read from a file", so in case it crashes while reading the file, it can continue where it left off.
The problem is that you deleted the file without resetting the offsets to 0, so it basically doesn't see any new data since it waits for new data to show starting at a specific character count from the beginning...
The work-around if to reset the offsets. If you are using connect in stand-alone mode, the offsets are stored in /tmp/connect.offsets by default, just delete them from there.
In the long term, we need a better file connector :)