I know I can measure the execution time of a call to sess.run(), but is it possible to get a finer granularity and measure the execution time of individual operations?
Tags: tensorflow
There is not yet a way to do this in the public release. We are aware that it's an important feature and we are working on it.
Since this is high up when googling for "TensorFlow profiling", note that the current (late 2017, TensorFlow 1.4) way of getting the timeline is by using a ProfilerHook. This works with the MonitoredSession used by tf.Estimator, where tf.RunOptions are not available.
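A rough sketch of that approach; the save interval, output directory, and the estimator and train_input_fn objects are assumptions for illustration:

```python
import tensorflow as tf

# ProfilerHook periodically dumps Chrome-trace timeline files
# (timeline-<step>.json) into output_dir; open them in chrome://tracing.
profiler_hook = tf.train.ProfilerHook(
    save_steps=100,             # trace every 100 global steps
    output_dir='/tmp/profiles')

# An Estimator picks the hook up through the `hooks` argument of train().
estimator.train(input_fn=train_input_fn, hooks=[profiler_hook])
```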
Regarding fat-lobyte's comments under Olivier Moindrot's answer: if you want to gather the timeline over all sessions, you can change open('timeline.json', 'w') to open('timeline.json', 'a').
To update this answer: we do have some functionality for CPU profiling, focused on inference. If you look at https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/benchmark you'll see a program you can run on a model to get per-op timings.
To profile TensorFlow sessions automatically you can use the StackImpact profiler. No need to instrument sessions or add any options. You just need to initialize the profiler:
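A minimal sketch of that initialization, based on the StackImpact Python agent's documented start call; the agent key and app name are placeholders:

```python
import stackimpact

# Start the agent once at program startup; execution time and memory
# profiles are then reported to the StackImpact Dashboard automatically.
agent = stackimpact.start(
    agent_key='YOUR_AGENT_KEY',    # placeholder: key from your dashboard
    app_name='MyTensorFlowApp')    # placeholder: any application name
```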
Both execution time and memory profiles will be available in the Dashboard.
Detailed info in this article: TensorFlow Profiling in Development and Production Environments.
Disclaimer: I work for StackImpact.
You can extract this information using runtime statistics. You will need to do something like this (check the full example in the above-mentioned link):
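A sketch of what that looks like, assuming the merged summary op, train_step, feed_dict helper, train_writer, session sess, and step counter i from the TensorFlow summaries tutorial (those names are assumptions for illustration):

```python
import tensorflow as tf

# Request a full trace for this run and collect the metadata.
run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
run_metadata = tf.RunMetadata()

summary, _ = sess.run([merged, train_step],
                      feed_dict=feed_dict(True),
                      options=run_options,
                      run_metadata=run_metadata)

# Tag the metadata with the step so TensorBoard can show per-op
# compute time and memory for that step.
train_writer.add_run_metadata(run_metadata, 'step%03d' % i)
train_writer.add_summary(summary, i)
```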
Rather than just printing the timings, you can view them in TensorBoard: clicking on a node in the Graph tab displays its exact total memory, compute time, and tensor output sizes.