PyTorch loss function dimensions do not match

Posted 2019-06-14 19:45

I'm trying to run word embeddings using batch training, as shown below.

def forward(self, inputs):
    print(inputs.shape)
    embeds = self.embeddings(inputs)
    print(embeds.shape)
    out = self.linear1(embeds)
    print(out.shape)
    out = self.activation_function1(out)
    print(out.shape)
    out = self.linear2(out).cuda()
    print(out.shape)
    out = self.activation_function2(out)
    print(out.shape)
    return out.cuda()

Here, I'm using context size 4, batch size 32, embedding size 50, hidden layer size 64, and vocab size 9927.

The output of the shape prints is:

print(inputs.shape) ----> torch.Size([4, 32])

print(embeds.shape) ----> torch.Size([4, 32, 50])

print(out.shape) ----> torch.Size([4, 32, 64])

print(out.shape) ----> torch.Size([4, 32, 64])

print(out.shape) ----> torch.Size([4, 32, 9927])

print(out.shape) ----> torch.Size([4, 32, 9927])

Are these shapes correct? I'm quite confused.

Also, when I train, it returns an error:

def train(epoch):
  model.train()
  for batch_idx, (data, target) in enumerate(train_loader, 0):
    optimizer.zero_grad()
    output = model(torch.stack(data))
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()

I'm getting an error in the line "loss = criterion(output, target)". It says "Expected input batch_size (4) to match target batch_size (32)." Are my shapes for the "forward" function correct? I'm not that familiar with batch training. How do I make the dimensions match?
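
For reference, the error can be reproduced in isolation from the shapes alone (this sketch assumes an nn.NLLLoss criterion, the natural pairing with a LogSoftmax output; the criterion definition isn't shown above):

import torch
import torch.nn as nn

criterion = nn.NLLLoss()                  # assumption: matches the LogSoftmax output
output = torch.randn(4, 32, 9927)         # shape the model currently returns
target = torch.randint(0, 9927, (32,))    # one target word per batch element
loss = criterion(output, target)
# ValueError: Expected input batch_size (4) to match target batch_size (32).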

-------EDIT: Posting init code below -----

  def __init__(self, vocab_size, embedding_dim):
    super(CBOW, self).__init__()
    self.embeddings = nn.Embedding(vocab_size, embedding_dim)
    self.linear1 = nn.Linear(embedding_dim, 64)
    self.activation_function1 = nn.ReLU()
    self.linear2 = nn.Linear(64, vocab_size)
    self.activation_function2 = nn.LogSoftmax(dim = -1)

1 Answer

Answered 2019-06-14 19:51

The loss criterion expects the batch size as the first dimension of its input.

You are supplying it as the second (the first being the context/timestep dimension), so the criterion reads 4 as your batch size instead of 32. Use permute(1, 0, 2) to move the batch dimension to the front. (Note that nn.Linear itself is fine here: it is applied over the last dimension, so the shapes you printed are internally consistent.)
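
A minimal illustration of what permute(1, 0, 2) does, using the (4, 32, 50) embedding shape printed above:

import torch

embeds = torch.randn(4, 32, 50)    # context first, batch second (as printed above)
embeds = embeds.permute(1, 0, 2)   # swap the first two dimensions
print(embeds.shape)                # torch.Size([32, 4, 50]): batch first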

Furthermore, the criterion needs a 2-D (batch, classes) output to match your 1-D target, but yours is 3-D because of the context words (I assume). Either collapse the context dimension first (for CBOW, the usual approach is to sum or average the context embeddings before the linear layers), or, if you want to treat the context as a sequence, use a recurrent network such as torch.nn.LSTM. A sketch of the LSTM variant follows.
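
A minimal sketch of that LSTM variant, using the same hyperparameters as the question; the class and attribute names here are illustrative, not taken from the original code:

import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim=64):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, embedding_dim)
        # nn.LSTM defaults to (seq_len, batch, features) input, which matches
        # the (4, 32, 50) embeddings printed in the question.
        self.lstm = nn.LSTM(embedding_dim, hidden_dim)
        self.linear = nn.Linear(hidden_dim, vocab_size)
        self.log_softmax = nn.LogSoftmax(dim=-1)

    def forward(self, inputs):                # inputs: (seq_len=4, batch=32)
        embeds = self.embeddings(inputs)      # (4, 32, 50)
        _, (hidden, _) = self.lstm(embeds)    # hidden: (1, 32, 64)
        out = self.linear(hidden.squeeze(0))  # (32, vocab_size)
        return self.log_softmax(out)          # 2-D output, matches a (32,) target

model = LSTMClassifier(vocab_size=9927, embedding_dim=50)
print(model(torch.randint(0, 9927, (4, 32))).shape)   # torch.Size([32, 9927])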
