I am new to PyTorch. I want to understand why we can't call the backward function on a variable containing a tensor of, say, size [2,2]. And if we do want to call it on such a variable, we first have to define a gradient tensor and then call backward on the variable with respect to those gradients.
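For example, something like this minimal sketch of what I mean (using the tensor API; the exact error text may differ by version):

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x * 2                   # y holds a [2, 2] tensor, not a scalar

try:
    y.backward()            # no gradient argument given
except RuntimeError as e:
    # Something like: "grad can be implicitly created only for scalar outputs"
    print(e)
```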
From the tutorial on autograd:
Basically, to start the chain rule you need a gradient AT THE OUTPUT to get it going. If the output is a scalar loss function (which it usually is; normally you begin the backward pass at the loss variable), there is an implied value of 1.0.
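For instance, a minimal sketch of that usual scalar case (the shapes and values here are made up just for illustration):

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
loss = (x * 2).sum()        # scalar output, like a typical loss

loss.backward()             # same as loss.backward(torch.tensor(1.0))
print(x.grad)               # tensor([[2., 2.], [2., 2.]])
```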
From the tutorial:
but maybe you only want to update a subgraph (somewhere deep in the network), and the value of a Variable is a matrix of weights. Then you have to tell it where to begin. From one of their chief devs (somewhere in the links):

"Yes, that's correct. We only support differentiation of scalar functions, so if you want to start backward from a non-scalar value you need to provide dout / dy."
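Putting that together, here is a small sketch (my own example, not from the tutorial) of starting backward from a non-scalar value by supplying the dout / dy tensor yourself:

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x * 2                   # non-scalar output, shape [2, 2]

# dout/dy: gradient of some (hypothetical) downstream scalar w.r.t. y.
dout_dy = torch.tensor([[1.0, 0.1],
                        [0.01, 0.001]])

y.backward(dout_dy)         # chain rule: x.grad = dout_dy * dy/dx = dout_dy * 2
print(x.grad)               # tensor([[2.0000, 0.2000], [0.0200, 0.0020]])
```

The gradient argument you pass must have the same shape as the output you call backward on; it plays the role of the upstream gradient in the chain rule.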
The gradients argument:
- https://discuss.pytorch.org/t/how-the-backward-works-for-torch-variable/907/8 (ok explanation)
- "Pytorch, what are the gradient arguments" (good explanation)
- http://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html (tutorial)