I have a Kafka topic in which each message carries a lat/lon and an event timestamp. I created a stream on that topic and would like to calculate the distance between two consecutive points using geo_distance. Example:
GpsDateTime, lat, lon
2016-11-30 22:38:36, 32.685757, -96.735942
2016-11-30 22:39:07, 32.687347, -96.732841
2016-11-30 22:39:37, 32.68805, -96.729726
I would like to create a new stream on top of the above stream and enrich it with the distance from the previous point.
GpsDateTime, lat, lon, Distance
2016-11-30 22:38:36, 32.685757, -96.735942, 0
2016-11-30 22:39:07, 32.687347, -96.732841, 0.340
2016-11-30 22:39:37, 32.68805, -96.729726, 0.302
Is it possible to achieve the desired result using KSQL? If not, how can I refer to the previous message while processing a new message?
First, do these readings come from some sort of device? If so, do you have a unique ID (UUID) for each device? I would put that into your stream, so it would look like
UUID, GpsDateTime, lat, lon
You will need to create a fairly basic Kafka Streams app. Within this app you store the most recent reading from your stream in a state store, which you create with a StoreBuilder. When a new message arrives from Kafka, you retrieve that latest value, do your computation, and then write the new lat/lon values back to the state store.
Of course, I'm not clear whether you want to keep only the first lat/lon reading and compute every subsequent distance from it, or whether you want a rolling computation that always compares the previous reading with the current one.
Anyway, you can see this pattern in practice at: https://github.com/confluentinc/kafka-streams-examples/blob/5.0.0-post/src/test/java/io/confluent/examples/streams/StateStoresInTheDSLIntegrationTest.java
The example is a word count, but it can be adapted quickly to your use case.
The static final class WordCountTransformerSupplier (line 78) would become your LatLongDistanceComputation.
You would create the StoreBuilder (line 154) with proper types (whatever you are storing your lat/lon as).
Line 165 is where the item is actually being read from the stream of values flowing in.
And of course you need to edit the inputTopic and outputTopic (lines 66-67), among a few other things.
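
To make the rolling variant concrete (each reading compared with the previous one from the same device), here is a minimal sketch in plain Java. It is not taken from the linked example: the class name DistanceEnricher and its methods are hypothetical, a HashMap stands in for the Kafka Streams state store, and the haversine formula with an Earth radius of 6371 km is my assumption for how the distance is computed. In a real app the same logic would live in your transformer, with the map replaced by the state store.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the linked example.
class DistanceEnricher {
    // Assumed mean Earth radius in kilometres.
    private static final double EARTH_RADIUS_KM = 6371.0;

    // Stand-in for a state store keyed by device UUID; value is {lat, lon}.
    private final Map<String, double[]> lastReading = new HashMap<>();

    // Great-circle distance in kilometres between two lat/lon points (haversine).
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Stores the new reading and returns the distance from the previous reading
    // of the same device, or 0 if this is the first reading for that device.
    double process(String deviceId, double lat, double lon) {
        double[] previous = lastReading.put(deviceId, new double[] {lat, lon});
        return previous == null ? 0.0 : haversineKm(previous[0], previous[1], lat, lon);
    }
}
```

Calling process("device-1", lat, lon) once per message, in timestamp order, returns 0 for the first reading and the distance to the preceding point for each later one. If you instead want every distance measured from the first reading, only write to the map when no entry exists for the device yet.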