I'm attempting to extract the weights and biases from a simple network built in PyTorch. My entire network is composed of nn.Linear layers. When I create a layer by calling nn.Linear(in_dim, out_dim), I expect the parameters that I get from calling model.parameters() for that model to be of shape (in_dim, out_dim) for the weight and (out_dim) for the bias. However, the weights that come out of model.parameters() are instead of shape (out_dim, in_dim).
My goal is to perform a forward pass with plain matrix multiplication using only NumPy, without any PyTorch. Because of the shape mismatch, the matrix multiplications throw an error. How can I fix this?
Here is my exact code:
import torch.nn as nn

class RNN(nn.Module):
    def __init__(self, dim_input, dim_recurrent, dim_output):
        super(RNN, self).__init__()
        self.dim_input = dim_input
        self.dim_recurrent = dim_recurrent
        self.dim_output = dim_output
        self.dense1 = nn.Linear(self.dim_input, self.dim_recurrent)
        self.dense2 = nn.Linear(self.dim_recurrent, self.dim_recurrent, bias=False)
        self.dense3 = nn.Linear(self.dim_input, self.dim_recurrent)
        self.dense4 = nn.Linear(self.dim_recurrent, self.dim_recurrent, bias=False)
        self.dense5 = nn.Linear(self.dim_recurrent, self.dim_output)
    # There is a defined forward pass

model = RNN(12, 100, 6)
# Print the shape of every parameter tensor, in registration order
for i in model.parameters():
    print(i.shape)
The output is:
torch.Size([100, 12])
torch.Size([100])
torch.Size([100, 100])
torch.Size([100, 12])
torch.Size([100])
torch.Size([100, 100])
torch.Size([6, 100])
torch.Size([6])
The output should, if I'm correct, be:
torch.Size([12, 100])
torch.Size([100])
torch.Size([100, 100])
torch.Size([12, 100])
torch.Size([100])
torch.Size([100, 100])
torch.Size([100, 6])
torch.Size([6])
What is my issue?
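
For reference, here is a minimal sketch of the single-layer case. It assumes, as the nn.Linear documentation states, that the layer stores its weight with shape (out_features, in_features) and computes x @ W.T + b. The batch size of 4, the random input, and the variable names are illustrative assumptions, not part of the code above.

```python
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)

in_dim, out_dim = 12, 100
layer = nn.Linear(in_dim, out_dim)

# nn.Linear stores its weight as (out_features, in_features).
W = layer.weight.detach().numpy()  # shape (100, 12)
b = layer.bias.detach().numpy()    # shape (100,)

# Hypothetical batch of 4 input vectors, float32 to match the layer's dtype.
x = np.random.rand(4, in_dim).astype(np.float32)

# x @ W would fail: (4, 12) and (100, 12) are not conformable.
# Transposing the weight gives (4, 12) @ (12, 100) -> (4, 100).
y_numpy = x @ W.T + b

# Reference output from the PyTorch layer itself.
with torch.no_grad():
    y_torch = layer(torch.from_numpy(x)).numpy()

shapes_ok = W.shape == (out_dim, in_dim) and y_numpy.shape == (4, out_dim)
outputs_match = np.allclose(y_numpy, y_torch, atol=1e-5)
```

Under that assumption, the shapes printed by model.parameters() are the expected ones, and a NumPy-only forward pass would transpose each extracted weight (or multiply from the left with column vectors, W @ x) rather than expect (in_dim, out_dim).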