What is the difference between static computational graphs in TensorFlow and dynamic computational graphs in PyTorch?

Published 2020-05-18 16:53

Question:

When I was learning TensorFlow, one of its basic concepts was the computational graph, and the graph was said to be static. In PyTorch, I found that the graph is said to be dynamic. What is the difference between static computational graphs in TensorFlow and dynamic computational graphs in PyTorch?

Answer 1:

Both frameworks operate on tensors and view any model as a directed acyclic graph (DAG), but they differ drastically in how you can define that graph.

TensorFlow follows the "data as code and code is data" idiom. In TensorFlow you define the graph statically before a model can run. All communication with the outside world is performed via the tf.Session object and tf.placeholder tensors, which are substituted with external data at runtime.
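As a rough illustration (a minimal sketch assuming the TensorFlow 1.x API, not code from the original answer), the graph is described first and only later executed inside a session, with placeholders fed at runtime:

```python
import tensorflow as tf  # TensorFlow 1.x API assumed

# Graph definition: nothing is computed yet, we only describe the DAG.
x = tf.placeholder(tf.float32, shape=(None, 3))   # filled in at runtime
w = tf.Variable(tf.ones((3, 1)))
y = tf.matmul(x, w)

# Execution: all communication goes through a session.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0]]}))
```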

In PyTorch things are far more imperative and dynamic: you can define, change and execute nodes as you go, with no special session interfaces or placeholders. Overall, the framework is more tightly integrated with the Python language and feels more native most of the time. When you write in TensorFlow, it sometimes feels as if your model were behind a brick wall with several tiny holes to communicate through. To a large extent, though, this is still a matter of taste.
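For comparison, a minimal PyTorch sketch (again my own illustration, not from the original answer): the same computation is written as ordinary eager Python, and the graph is recorded as the code runs:

```python
import torch

# No separate definition step: each line executes immediately.
x = torch.tensor([[1.0, 2.0, 3.0]])
w = torch.ones(3, 1, requires_grad=True)
y = x @ w              # the graph is built on the fly as this line runs
y.sum().backward()     # gradients flow through whatever actually executed
print(w.grad)
```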

However, the two approaches differ in more than software-engineering ergonomics: several neural network architectures benefit directly from the dynamic approach. Recall RNNs: with static graphs, the input sequence length has to stay constant. This means that if you develop a sentiment analysis model for English sentences, you must fix the sentence length to some maximum value and pad all shorter sequences with zeros. Not too convenient. And you run into more problems in the domain of recursive RNNs and tree-RNNs. Currently TensorFlow has limited support for dynamic inputs via TensorFlow Fold; PyTorch has it by default.
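To make the RNN point concrete, here is a hedged sketch (my own illustration using torch.nn.RNNCell) of how a dynamic graph lets each sentence drive a different number of recurrent steps, with no fixed maximum length and no zero-padding:

```python
import torch
import torch.nn as nn

cell = nn.RNNCell(input_size=8, hidden_size=16)

# Two "sentences" of different lengths (5 and 12 tokens of size 8 each).
for sentence in [torch.randn(5, 8), torch.randn(12, 8)]:
    h = torch.zeros(1, 16)
    for token in sentence:               # plain Python loop defines the graph
        h = cell(token.unsqueeze(0), h)  # one RNN step per actual token
```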

Reference:

https://medium.com/towards-data-science/pytorch-vs-tensorflow-spotting-the-difference-25c75777377b

https://www.reddit.com/r/MachineLearning/comments/5w3q74/d_so_pytorch_vs_tensorflow_whats_the_verdict_on/



Answer 2:

Both TensorFlow and PyTorch allow specifying new computations at any point in time. However, TensorFlow has a "compilation" step which incurs a performance penalty every time you modify the graph. So TensorFlow's optimal performance is achieved when you specify the computation once and then flow new data through the same sequence of computations.

It's similar to interpreters vs. compilers -- the compilation step makes things faster, but also discourages people from modifying the program too often.

To make things concrete: when you modify the graph in TensorFlow (by appending new computations through the regular API, or removing some computation using tf.contrib.graph_editor), this line is triggered in session.py. It serializes the graph, and the underlying runtime then reruns some optimizations, which can take extra time, perhaps 200 µs. In contrast, running an op in a previously defined graph, or in numpy/PyTorch, can be as low as 1 µs.
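As an illustrative sketch (TensorFlow 1.x assumed, not part of the original answer), appending a new op on every iteration keeps changing the graph and re-triggering that cost, whereas defining the op once and only feeding new data does not:

```python
import tensorflow as tf  # TensorFlow 1.x API assumed

x = tf.placeholder(tf.float32, shape=())
sess = tf.Session()

# Anti-pattern: a fresh node is added each iteration, so the graph keeps changing.
for i in range(3):
    doubled = tf.multiply(x, 2.0)
    sess.run(doubled, feed_dict={x: float(i)})

# Preferred: define the computation once, then flow new data through it.
doubled = tf.multiply(x, 2.0)
for i in range(3):
    sess.run(doubled, feed_dict={x: float(i)})
```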



Answer 3:

In TensorFlow you first have to define the graph, then you execute it. Once defined, your graph is immutable: you can't add or remove nodes at runtime.

In PyTorch, by contrast, you can change the structure of the graph at runtime, adding and removing nodes on the fly.
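For example (a minimal sketch of my own, not from the original answer), ordinary Python control flow decides which nodes are created on each forward pass, so every call can produce a differently shaped graph:

```python
import torch

def forward(x):
    # The branch taken depends on the data, so the recorded graph
    # differs from call to call -- impossible with a frozen static graph.
    if x.sum() > 0:
        return (x * 2).relu()
    return x.pow(2)

out = forward(torch.randn(4, requires_grad=True))
out.sum().backward()
```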