What are the use cases for Apache Beam and Apache Nifi? It seems both of them are data flow engines. In case both have similar use case, which of the two is better?
相关问题
- Why do Dataflow steps not start?
- Apache beam DataFlow runner throwing setup error
- Apply Side input to BigQueryIO.read operation in A
- Merge two JSON flowfile together in NiFi
- Apache NiFi: How to compare multiple rows in a csv
相关文章
- Apply TensorFlow Transform to transform/scale feat
- How to run dynamic second query in google cloud da
- How do I use MapElements and KV in together in Apa
- Beam/Google Cloud Dataflow ReadFromPubsub Missing
- Difference between beam.ParDo and beam.Map in the
- KafkaIO checkpoint - how to commit offsets to Kafk
- Apache NiFi - OutOfMemory Error: GC overhead limit
- Creating a Proper avro schema for timestamp record
Apache Beam is an abstraction layer for stream processing systems like Apache Flink, Apache Spark (streaming), Apache Apex, and Apache Storm. It lets you write your code against a standard API, and then execute the code using any of the underlying platforms. So theoretically, if you wrote your code against the Beam API, that code could run on Flink or Spark Streaming without any code changes.
Apache NiFi is a data flow tool that is focused on moving data between systems, all the way from very small edge devices with the use of MiNiFi, back to the larger data centers with NiFi. NiFi's focus is on capabilities like visual command and control, filtering of data, enrichment of data, data provenance, and security, just to name a few. With NiFi, you aren't writing code and deploying it as a job, you are building a living data flow through the UI that is taking effect with each action.
Stream processing platforms are often focused on computations involving joins of streams and windowing operations. Where as a data flow tool is often complimentary and used to manage the flow of data from the sources to the processing platforms.
There are actually several integration points between NiFi and stream processing systems... there are components for Flink, Spark, Storm, and Apex that can pull data from NiFi, or push data back to NiFi. Another common pattern would be to use MiNiFi + NiFi to get data into Apache Kafka, and then have the stream processing systems consume from Kafka.