Is it possible to process multiple streams in Apache Flink CEP?

Published 2020-07-27 03:08

My Question is that, if we have two raw event streams i.e Smoke and Temperature and we want to find out if complex event i.e Fire has happened by applying operators to raw streams, can we do this in Flink?

I am asking this question because all the examples that I have seen till now for Flink CEP include only one input stream. Please correct me if I am wrong.

2 Answers
Explosion°爆炸
2020-07-27 03:19

I wonder whether strict contiguity can be used (i.e. next instead of followedBy), because the stream may contain many events with the same timestamp. Say at time t1 the events a, b, c arrive, and at time t2 the events a2, b2, c2 arrive at the Flink engine. I wonder how event(a).next(a2) could ever match, since the actual series would look like: a b c a2 b2 c2.

However, if the CEP module treated all events sharing one timestamp as a single event, then this would make sense.
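The difference between the two contiguity modes can be sketched in plain Java, outside of Flink. This is only an illustration of the semantics: strict contiguity (Flink's next) requires the second event to come immediately after the first, while relaxed contiguity (followedBy) allows other events in between. The class and method names here are made up for the demo.

```java
import java.util.Arrays;
import java.util.List;

public class ContiguityDemo {
    // Strict contiguity: "second" must appear immediately after "first" (like Flink's next()).
    static boolean matchesNext(List<String> events, String first, String second) {
        for (int i = 0; i < events.size() - 1; i++) {
            if (events.get(i).equals(first) && events.get(i + 1).equals(second)) {
                return true;
            }
        }
        return false;
    }

    // Relaxed contiguity: "second" may appear anywhere after "first" (like Flink's followedBy()).
    static boolean matchesFollowedBy(List<String> events, String first, String second) {
        int firstPos = events.indexOf(first);
        return firstPos >= 0 && events.subList(firstPos + 1, events.size()).contains(second);
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList("a", "b", "c", "a2", "b2", "c2");
        System.out.println(matchesNext(events, "a", "a2"));       // false: b and c sit in between
        System.out.println(matchesFollowedBy(events, "a", "a2")); // true: gaps are allowed
    }
}
```

For the series in the question, event(a).next(a2) indeed never matches, while event(a).followedBy(a2) does.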

兄弟一词,经得起流年.
2020-07-27 03:27

Short answer: yes, you can read and process multiple streams and fire rules based on the event types coming from the different stream sources.

Long answer: I had a somewhat similar requirement, and my answer is based on the assumption that you are reading the different streams from different Kafka topics.

Read from the different topics, each of which streams different events, with a single source:

FlinkKafkaConsumer010<BAMEvent> kafkaSource = new FlinkKafkaConsumer010<>(
        Arrays.asList("topicStream1", "topicStream2", "topicStream3"),
        new StringSerializerToEvent(),
        props);

kafkaSource.assignTimestampsAndWatermarks(new TimestampAndWatermarkGenerator());
DataStream<BAMEvent> events = env.addSource(kafkaSource)
        .filter(Objects::nonNull);

The deserializer reads the records and parses them into a common format, e.g.:

@Data
public class BAMEvent {
 private String keyid;  //If key based partitioning is needed
 private String eventName; // For different types of events
 private String eventId;  // Any other field you need
 private long timestamp; // For event time based processing 

 @Override
 public String toString(){
   return eventName + " " + timestamp + " " + eventId + " " + keyid;
 }

}

After this, things are pretty straightforward: define the rules based on the event name, comparing event names inside the conditions (you can also define complex rules, as follows):

Pattern.<BAMEvent>begin("first")
        .where(new SimpleCondition<BAMEvent>() {
          private static final long serialVersionUID = 1390448281048961616L;

          @Override
          public boolean filter(BAMEvent event) throws Exception {
            return event.getEventName().equals("event1");
          }
        })
        .followedBy("second")
        .where(new IterativeCondition<BAMEvent>() {
          private static final long serialVersionUID = -9216505110246259082L;

          @Override
          public boolean filter(BAMEvent secondEvent, Context<BAMEvent> ctx) throws Exception {

            if (!secondEvent.getEventName().equals("event2")) {
              return false;
            }

            for (BAMEvent firstEvent : ctx.getEventsForPattern("first")) {
              if (secondEvent.getEventId().equals(firstEvent.getEventId())) {
                return true;
              }
            }
            return false;
          }
        })
        .within(withinTimeRule);

I hope this gives you an idea of how to integrate two or more different streams and match patterns across them.
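What the pattern above matches can also be simulated in plain Java: an "event1" followed (not necessarily immediately) by an "event2" carrying the same eventId, which is exactly what the IterativeCondition checks. This is only a sketch of the matching semantics, not Flink code; the class name and the two-element {eventName, eventId} representation are made up for the demo, and the within-time constraint is ignored.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FirePairFinder {
    // Each element is {eventName, eventId}; events arrive in stream order.
    static List<String> findCorrelatedIds(List<String[]> events) {
        Set<String> firstIds = new HashSet<>();
        List<String> matches = new ArrayList<>();
        for (String[] e : events) {
            if (e[0].equals("event1")) {
                firstIds.add(e[1]);               // remember ids seen for the "first" pattern
            } else if (e[0].equals("event2") && firstIds.contains(e[1])) {
                matches.add(e[1]);                // event2 correlated with an earlier event1
            }
        }
        return matches;
    }
}
```

An "event2" with an eventId that no earlier "event1" carried is simply ignored, just as the IterativeCondition returns false for it.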
