Apache Beam number of times a pane is fired with e

Posted 2019-07-25 12:01

In a streaming Beam pipeline, the window and trigger are configured as follows:

Window.into(FixedWindows.of(Duration.standardHours(1)))
              .triggering(AfterWatermark
                            .pastEndOfWindow()
                            .withEarlyFirings(AfterProcessingTime
                                    .pastFirstElementInPane()
                                    .plusDelayOf(Duration.standardMinutes(15))))
              .withAllowedLateness(Duration.standardHours(1))
              .accumulatingFiredPanes()
  1. If no new data arrives between the early firing (15 minutes after the first element of the current window) and the moment the watermark passes the end of the window, will there be another firing when the watermark passes the end of the window?

  2. If yes, under the same scenario, will there still be another firing when the watermark passes the end of the window if accumulatingFiredPanes is changed to discardingFiredPanes?

2 answers
萌系小妹纸
Reply #2 · 2019-07-25 12:07
  1. Yes. There should always be a firing when the watermark passes the end of the window. The early-firing panes are marked EARLY, and the pane fired at the watermark is marked ON_TIME.

  2. Yes. Currently we always guarantee an ON_TIME pane, which means there will still be a firing when the watermark passes the end of the window, even with discardingFiredPanes.
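
To make the two answers above concrete, here is a small plain-Java sketch of the pane sequence they describe for the scenario in the question. It is only a toy model and not Beam code: the class PaneModel, the Pane type, and the panes method are hypothetical names made up for illustration, and the model assumes at least one element arrived before the early firing and nothing arrived afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model (NOT Beam's implementation) of the panes described in the answers.
public class PaneModel {
    enum Timing { EARLY, ON_TIME }

    /** One emitted pane: its timing label and a copy of the elements it carries. */
    static final class Pane {
        final Timing timing;
        final List<String> elements;

        Pane(Timing timing, List<String> elements) {
            this.timing = timing;
            this.elements = new ArrayList<>(elements);
        }

        @Override
        public String toString() {
            return timing + elements.toString();
        }
    }

    /**
     * Assumption: all of `elements` arrive before the early firing and no
     * data arrives between that firing and the end of the window.
     */
    static List<Pane> panes(List<String> elements, boolean accumulating) {
        List<Pane> out = new ArrayList<>();
        List<String> buffer = new ArrayList<>(elements);

        // Early firing, 15 minutes after the first element of the pane.
        out.add(new Pane(Timing.EARLY, buffer));
        if (!accumulating) {
            buffer.clear(); // discarding mode drops what was already emitted
        }

        // The on-time pane is emitted even if nothing new arrived.
        out.add(new Pane(Timing.ON_TIME, buffer));
        return out;
    }

    public static void main(String[] args) {
        // accumulatingFiredPanes: prints [EARLY[a, b], ON_TIME[a, b]]
        System.out.println(panes(Arrays.asList("a", "b"), true));
        // discardingFiredPanes: prints [EARLY[a, b], ON_TIME[]]
        System.out.println(panes(Arrays.asList("a", "b"), false));
    }
}
```

In both modes the model yields two panes. The only difference is whether the ON_TIME pane repeats the earlier elements (accumulating) or is empty (discarding).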

爷、活的狠高调
Reply #3 · 2019-07-25 12:07

For #2, you can pass a Window.ClosingBehavior as the second parameter to withAllowedLateness. There are two variants:

  • FIRE_ALWAYS
  • FIRE_IF_NON_EMPTY

See https://beam.apache.org/releases/javadoc/2.6.0/org/apache/beam/sdk/transforms/windowing/Window.ClosingBehavior.html
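
A minimal sketch of how that second parameter could be passed, reusing the window and trigger from the question. The class name ClosingBehaviorExample, the method windowed, and the KV<String, Long> element type are assumptions chosen for illustration. As far as I understand the linked Javadoc, ClosingBehavior governs the final pane emitted when the window closes for good after the allowed lateness, so treat its effect on the on-time pane as something to verify against your Beam version.

```java
import org.apache.beam.sdk.transforms.windowing.AfterProcessingTime;
import org.apache.beam.sdk.transforms.windowing.AfterWatermark;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.joda.time.Duration;

public class ClosingBehaviorExample {

  // Hypothetical helper: applies the windowing from the question, but in
  // discarding mode and with an explicit ClosingBehavior.
  static PCollection<KV<String, Long>> windowed(PCollection<KV<String, Long>> input) {
    return input.apply(
        Window.<KV<String, Long>>into(FixedWindows.of(Duration.standardHours(1)))
            .triggering(
                AfterWatermark.pastEndOfWindow()
                    .withEarlyFirings(
                        AfterProcessingTime.pastFirstElementInPane()
                            .plusDelayOf(Duration.standardMinutes(15))))
            // Second argument: emit a final pane only if it has data.
            // Use Window.ClosingBehavior.FIRE_ALWAYS to always emit one.
            .withAllowedLateness(
                Duration.standardHours(1), Window.ClosingBehavior.FIRE_IF_NON_EMPTY)
            .discardingFiredPanes());
  }
}
```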
