What is a dataflow programming language? Why would you use one, and what are its benefits?
Dataflow programming languages propose to isolate local behaviors in so-called "actors", which are supposed to run in parallel and exchange data through point-to-point channels. There is no notion of a central memory (for either code or data), unlike the von Neumann model of computers.
These actors consume data tokens on their inputs and produce new data on their outputs.
This definition does not impose a particular way to run it in practice. However, the production and consumption of data need to be analyzed with care: for example, if an actor B does not consume at the same speed as the actor A that produces the data, then a potentially unbounded memory (a FIFO) is required between them. Many other problems can arise, such as deadlocks.
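The producer/consumer issue above can be sketched in ordinary Python, using threads as actors and a bounded `queue.Queue` as the point-to-point channel. The actor names (`actor_a`, `actor_b`) and the workload are made up for illustration; the point is that the bounded FIFO gives back-pressure instead of unbounded memory growth, since `put` blocks when the channel is full.

```python
import threading
import queue

def actor_a(out_ch):
    # Producer actor: emits data tokens on its output channel.
    for token in range(5):
        out_ch.put(token)          # blocks when the channel is full (back-pressure)
    out_ch.put(None)               # end-of-stream marker

def actor_b(in_ch, results):
    # Consumer actor: fires whenever a token is available on its input.
    while True:
        token = in_ch.get()
        if token is None:
            break
        results.append(token * 2)

results = []
channel = queue.Queue(maxsize=2)   # bounded FIFO between the two actors
a = threading.Thread(target=actor_a, args=(channel,))
b = threading.Thread(target=actor_b, args=(channel, results))
a.start(); b.start()
a.join(); b.join()
print(results)                     # [0, 2, 4, 6, 8]
```

If the queue were unbounded and `actor_b` were slower than `actor_a`, tokens would pile up in memory; the `maxsize` is what keeps the channel's footprint bounded.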
In many cases, this analysis will fail because the interleaving of the internal behaviors is intractable (beyond the reach of today's formal methods).
Despite this, dataflow programming languages remain attractive in many domains.
Mozart has support for dataflow-like synchronization, and it does have some commercial applications. You could also argue that make is a dataflow programming language.
You could try Cameleon (www.shinoe.org/cameleon), which seems to be simple to use. It's a graphical language for functional programming that takes a data(work)-flow approach.
It's written in C++ but can call any kind of local or remote program written in any programming language.
It takes a multi-scale approach and seems to be Turing-complete (it is a Petri net extension).
Dataflow programming languages are ones that focus on the state of the program and trigger operations in response to any change in that state. They are inherently parallel, because each operation depends only on its inputs: once those are available, the operation can execute. This means that unlike a normal program, where one operation is followed by the next, operations in a dataflow program execute whenever their inputs are met, so there is no set order.
Often dataflow programming languages use a large hashtable where the keys are the data of the program and the values are pointers to the program's operations. This makes multicore programs easier to create in a dataflow language, since each core only needs the hashtable to do its work.
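The "fire whenever inputs are met" idea can be sketched with a small table mapping each result to its input keys and an operation, then repeatedly firing whatever is ready. The table contents (`sum`, `prod`, `both`) are invented for this example; note that `sum` and `prod` could fire in either order, which is exactly the point.

```python
values = {"x": 3, "y": 4}          # data keys -> currently available values

# Operation table: output key -> (input keys, function). An operation may
# fire as soon as all of its inputs are present -- in any order.
operations = {
    "sum":  (("x", "y"), lambda x, y: x + y),
    "prod": (("x", "y"), lambda x, y: x * y),
    "both": (("sum", "prod"), lambda s, p: (s, p)),
}

# Fire every operation whose inputs are available, until nothing changes.
changed = True
while changed:
    changed = False
    for out, (ins, fn) in operations.items():
        if out not in values and all(i in values for i in ins):
            values[out] = fn(*(values[i] for i in ins))
            changed = True

print(values["both"])              # (7, 12)
```

`both` cannot fire on the first pass because its inputs don't exist yet; the scheduling falls out of data availability rather than program order.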
A common example of a dataflow programming language is a spreadsheet, where columns of data depend on other columns. When the data in one column changes, data in other columns may change along with it. Although the spreadsheet is the most common example, most dataflow languages tend to be graphical.
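The spreadsheet behavior can be sketched as a tiny dependency graph of cells, where setting one cell recomputes everything downstream. The `Cell` class and its API are invented for illustration (a real spreadsheet engine would also handle cycles and recompute lazily).

```python
class Cell:
    """A spreadsheet-style cell: either a plain value or a formula over other cells."""
    def __init__(self, value=None, formula=None, inputs=()):
        self.formula, self.inputs = formula, inputs
        self.dependents = []
        for c in inputs:
            c.dependents.append(self)   # register for change notifications
        self._value = value
        if formula:
            self._recompute()

    def _recompute(self):
        self._value = self.formula(*(c._value for c in self.inputs))
        for d in self.dependents:       # the change ripples through the graph
            d._recompute()

    def set(self, value):
        self._value = value
        for d in self.dependents:
            d._recompute()

    @property
    def value(self):
        return self._value

a = Cell(value=2)
b = Cell(value=3)
total = Cell(formula=lambda x, y: x + y, inputs=(a, b))   # like "=A1+B1"
print(total.value)    # 5
a.set(10)             # changing one cell updates its dependents automatically
print(total.value)    # 13
```

This is the essence of the spreadsheet model: you never call `total.recompute()` yourself; the update is driven by the data changing.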
Many ETL tools are also in this realm; the dataflow tasks in MS SSIS (a graphical tool, in this case) are a good example.
Excel (and other spreadsheets) are essentially dataflow languages. Dataflow languages are a lot like functional programming languages, except that the values at the leaves of the whole program graph are not values at all, but variables (or value streams), so that when they change, the changes ripple and flow up the graph.