I am trying to trace the gradient of a variable using PyTorch. I pass that variable to a first function that searches for the minimum value of some other variable, then feed the output of the first function into a second function, and the whole thing repeats multiple times.
Here is my code:
import torch

def myFirstFunction(parameter_current_here):
    optimalValue = 100000000000000
    Optimal = 100000000000000
    for j in range(2, 10):
        i = torch.ones(1, requires_grad=True)*j
        with torch.enable_grad():
            optimalValueNow = i*parameter_current_here.sum()
        if (optimalValueNow < optimalValue):
            optimalValue = optimalValueNow
            Optimal = i
    return optimalValue, Optimal

def mySecondFunction(Current):
    with torch.enable_grad():
        y = (20*Current)/2 + (Current**2)/10
    return y

counter = 0
while counter < 5:
    parameter_current = torch.randn(2, 2, requires_grad=True)
    outputMyFirstFunction = myFirstFunction(parameter_current)
    outputmySecondFunction = mySecondFunction(outputMyFirstFunction[1])
    outputmySecondFunction.backward()
    print("outputMyFirstFunction after backward:",
          outputMyFirstFunction)
    print("outputmySecondFunction after backward:",
          outputmySecondFunction)
    print("parameter_current Gradient after backward:",
          parameter_current.grad)
    counter = counter + 1
parameter_current.grad is None in every iteration, when it obviously shouldn't be. What am I doing wrong, and how can I fix it?
Your help on this would be highly appreciated. Thanks a lot!
Aly
I'm guessing the problem is the with torch.enable_grad(): statements. After you have exited the with statement, torch.enable_grad() no longer applies, and torch will clear the grads after the functions are run.

Since it is not really clear to me what you actually want to achieve, besides computing gradients for parameter_current, I will just focus on describing why it doesn't work and what you can do to actually compute gradients. I've added some comments in the code to make it clearer what the problem is.
But in short, the problem is that your parameter_current is not part of the computation of your loss, i.e. the tensor you call backward() on, which is outputmySecondFunction. So currently you are only computing gradients for i, as you have set requires_grad=True for it. Please check the comments for details:
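To make this concrete, here is a minimal sketch (not the annotated listing mentioned above) that reproduces the situation outside the loop. The names optimal_value_now, y, grad_from_y and grad_from_value are made up for illustration, and torch.autograd.grad with allow_unused=True is used only because it returns None, instead of raising, when an input is not in the graph of the output.

```python
import torch

torch.manual_seed(0)  # only to make the sketch repeatable

parameter_current = torch.randn(2, 2, requires_grad=True)

# Stand-in for `i` in the question: built from scratch, so it has no
# connection to parameter_current.
i = torch.ones(1, requires_grad=True) * 2

# This tensor is computed from parameter_current (optimalValueNow in the question).
optimal_value_now = i * parameter_current.sum()

# This tensor is computed from i only, which is what mySecondFunction
# receives when it is given outputMyFirstFunction[1].
y = (20 * i) / 2 + (i ** 2) / 10

# parameter_current is not in the graph of y, so its gradient is None.
(grad_from_y,) = torch.autograd.grad(
    y.sum(), parameter_current, allow_unused=True
)

# parameter_current is in the graph of optimal_value_now, so a gradient exists.
(grad_from_value,) = torch.autograd.grad(optimal_value_now.sum(), parameter_current)

print("grad via y:", grad_from_y)
print("grad via optimal_value_now:", grad_from_value)
```

The first result is None for the same reason parameter_current.grad is None in the question: backward() only reaches tensors that were used to compute the tensor it is called on.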
So if you want to compute gradients for parameter_current, you simply have to make sure it is part of the computation of the tensor you call backward() on. You can do so, for example, by changing:
to:
This will have the effect that, as soon as you change it, you will get gradients for parameter_current!
I hope it helps!
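One way to do this, offered as a minimal sketch and not necessarily the exact change described above, is to pass mySecondFunction the first return value of myFirstFunction (optimalValue, which is computed from parameter_current) instead of the second one (Optimal, which is just i). Whether that matches what the loop is meant to optimise is an assumption; the only point is that the tensor backward() is called on now depends on parameter_current.

```python
import torch

def myFirstFunction(parameter_current_here):
    optimalValue = 100000000000000
    Optimal = 100000000000000
    for j in range(2, 10):
        # Grad mode is enabled by default, so no torch.enable_grad() is needed here.
        i = torch.ones(1, requires_grad=True) * j
        optimalValueNow = i * parameter_current_here.sum()
        if optimalValueNow < optimalValue:
            optimalValue = optimalValueNow
            Optimal = i
    return optimalValue, Optimal

def mySecondFunction(Current):
    return (20 * Current) / 2 + (Current ** 2) / 10

for counter in range(5):
    parameter_current = torch.randn(2, 2, requires_grad=True)
    optimalValue, Optimal = myFirstFunction(parameter_current)
    # Assumed fix: use the tensor that was computed from parameter_current.
    outputmySecondFunction = mySecondFunction(optimalValue)
    outputmySecondFunction.backward()
    print("parameter_current gradient after backward:",
          parameter_current.grad)
```

With this change every iteration prints a 2x2 gradient tensor instead of None. If the second function really has to receive Optimal, then Optimal itself would need to be computed from parameter_current for any gradient to flow back.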
Full working code:
Output: