An optimizer typically runs the same computation graph for many steps until convergence. Does TensorFlow set up the graph at the beginning and reuse it for every step? What if I change the batch size during training? What if I make a minor change to the graph, like changing the loss function? What if I make a major change to the graph? Does TensorFlow pre-generate all possible graphs? Does TensorFlow know how to optimize the entire computation when the graph changes?
TensorFlow exposes only one graph that is visible to the user, namely the one specified by the user. The user can run the graph with Session.run() or by calling Tensor.eval() on some tensor. A Session.run() call can specify some tensors to be fed and others to be fetched. Depending on what needs to be fetched, the TensorFlow runtime may internally construct and optimize various data structures, including a pruned version of the user-visible graph. However, this internal graph is not visible to the user in any way. No, TensorFlow doesn't 'pre-generate' all possible graphs. Yes, TensorFlow does perform extensive optimizations on the computation graph. And finally, changing the batch size of a tensor that is fed doesn't change the structure of the graph.
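To make that last point about batch size concrete, here is a minimal sketch (assuming the TF 1.x Session API used in these answers; the tensor and variable names are illustrative). The None batch dimension of the placeholder is what lets the same graph be fed with different batch sizes:

```python
import numpy as np
import tensorflow as tf  # assumes the TF 1.x Session API discussed in these answers

# The graph is built once; the None batch dimension leaves the batch size unspecified.
x = tf.placeholder(tf.float32, shape=[None, 4], name="x")
w = tf.Variable(tf.ones([4, 1]), name="w")
y = tf.matmul(x, w)  # shape: [None, 1]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Feeding 8 rows and then 32 rows reuses the same graph structure;
    # only the value bound to the placeholder changes between steps.
    print(sess.run(y, feed_dict={x: np.ones([8, 4])}).shape)   # (8, 1)
    print(sess.run(y, feed_dict={x: np.ones([32, 4])}).shape)  # (32, 1)
```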
As keveman says, from the client's perspective there is a single TensorFlow graph. In the runtime, there can be multiple pruned subgraphs that contain just the nodes that are necessary to compute the values t1, t2, etc. that you fetch when calling sess.run([t1, t2, ...]).

If you call sess.run([t1, t2]), the runtime will prune the overall graph (sess.graph) down to the subgraph required to compute those values: i.e. the operations that produce t1 and t2 and all of their antecedents. If you subsequently call sess.run([t3, t4]), the runtime will prune the graph down to the subgraph required to compute t3 and t4. Each time you pass a new combination of values to fetch, TensorFlow will compute a new pruned graph and cache it; this is why the first sess.run() with a given set of fetches can be somewhat slower than subsequent ones.
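A small sketch of that fetching behaviour (again assuming the TF 1.x Session API; the tensors t1 through t4 are illustrative, not from the original answer):

```python
import tensorflow as tf  # TF 1.x Session API, as above

a = tf.constant(1.0)
b = tf.constant(2.0)
t1 = a + b          # one branch of the graph
t2 = a * b
t3 = tf.sqrt(b)     # another branch; does not depend on t1 or t2
t4 = -a

with tf.Session() as sess:
    # First call with this fetch combination: the runtime prunes sess.graph
    # down to the ops needed for t1 and t2, then caches that pruned graph.
    sess.run([t1, t2])
    # A new fetch combination triggers pruning again (and a new cache entry).
    sess.run([t3, t4])
    # Repeating an earlier combination hits the cache, so it is typically faster.
    sess.run([t1, t2])
```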
If the pruned graphs overlap, TensorFlow will reuse the "kernel" for the ops that are shared. This is relevant because some ops (e.g. tf.Variable and tf.FIFOQueue) are stateful, and their contents can be used in both pruned graphs. This allows you, for example, to initialize your variables with one subgraph (e.g. sess.run(tf.initialize_all_variables())), train them with another (e.g. sess.run(train_op)), and evaluate your model with a third (e.g. sess.run(loss, feed_dict={x: ...})). It also lets you enqueue elements to a queue with one subgraph and dequeue them with another, which is the foundation of input pipelines.
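As a concrete sketch of that initialize/train/evaluate pattern (again assuming the TF 1.x Session API; the toy linear model and optimizer choice are illustrative, and tf.global_variables_initializer() is the newer name for the tf.initialize_all_variables() mentioned above):

```python
import numpy as np
import tensorflow as tf  # TF 1.x Session API, as above

x = tf.placeholder(tf.float32, shape=[None, 1])
y_true = tf.placeholder(tf.float32, shape=[None, 1])
w = tf.Variable(tf.zeros([1, 1]))  # stateful op shared by all three subgraphs
loss = tf.reduce_mean(tf.square(tf.matmul(x, w) - y_true))
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

data_x = np.array([[1.0], [2.0], [3.0]], dtype=np.float32)
data_y = 2.0 * data_x

with tf.Session() as sess:
    # Subgraph 1: initialize the variables.
    sess.run(tf.global_variables_initializer())
    # Subgraph 2: run training steps, updating the shared variable w.
    for _ in range(100):
        sess.run(train_op, feed_dict={x: data_x, y_true: data_y})
    # Subgraph 3: evaluate the loss using the trained value of w.
    print(sess.run(loss, feed_dict={x: data_x, y_true: data_y}))
```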