Why can the loss function be applied to tensors of different sizes?

Posted 2020-04-20 21:46

For example, I have a net that takes a tensor of shape [N, 7] (N is the number of samples) as input and produces a tensor of shape [N, 4] as output, where the "4" represents the probabilities of the four classes.

And the training data's labels are a tensor of shape [N], with values ranging from 0 to 3 (representing the ground-truth class).

Here's my question: I've seen some demos that directly apply the loss function to the output tensor and the label tensor. I wonder why this works, since they have different sizes, and their sizes don't seem to fit the "broadcasting semantics".

Here’s the minimal demo.

import torch
import torch.nn as nn
import torch.optim as optim

if __name__ == '__main__':
    features = torch.randn(2, 7)      # [N, 7] input features
    gt = torch.tensor([1, 1])         # [N] ground-truth class indices
    model = nn.Sequential(
        nn.Linear(7, 4),
        nn.ReLU(),
        nn.Linear(4, 4)
    )
    optimizer = optim.SGD(model.parameters(), lr=0.005)
    f = nn.CrossEntropyLoss()

    for epoch in range(1000):
        optimizer.zero_grad()
        output = model(features)      # [N, 4] raw scores (logits)
        loss = f(output, gt)          # [N, 4] output vs. [N] target
        loss.backward()
        optimizer.step()

Tags: pytorch
1 Answer
chillily · 2020-04-20 22:06

In PyTorch, CrossEntropyLoss computes, for a single sample with raw scores x and a class index class:

    loss(x, class) = -x[class] + log( Σ_j exp(x[j]) )

Link to the documentation: https://pytorch.org/docs/stable/nn.html#torch.nn.CrossEntropyLoss


So, implementing this formula in PyTorch, you get:

import torch
import torch.nn.functional as F

output = torch.tensor([ 0.1998, -0.2261, -0.0388,  0.1457])
target = torch.LongTensor([1])

# implementing the formula above
print('manual  cross-entropy:', (-output[target] + torch.log(torch.sum(torch.exp(output))))[0])

# calling the built-in cross-entropy function to check the result
print('pytorch cross-entropy:', F.cross_entropy(output.unsqueeze(0), target))

Output:

manual  cross-entropy: tensor(1.6462)
pytorch cross-entropy: tensor(1.6462)
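
The same formula extends row by row to a batch, which is why the shapes in the question work: CrossEntropyLoss does not broadcast the two tensors at all; the [N] target is read as one class index per row of the [N, C] output. Here is a minimal sketch of the batched manual computation (the variable names mirror the example above and are just for illustration):

import torch
import torch.nn.functional as F

output = torch.randn(2, 4)       # [N, C] raw scores, as in the question's demo
target = torch.tensor([1, 1])    # [N] class indices, not probabilities

# pick each row's score at its target class, apply the formula per row, then average
manual = (-output[torch.arange(2), target] + torch.logsumexp(output, dim=1)).mean()

print('manual  cross-entropy:', manual)
print('pytorch cross-entropy:', F.cross_entropy(output, target))  # default reduction='mean'

Both lines print the same value, confirming that the [N] target is only used to index into the [N, C] scores rather than being broadcast against them.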

I hope this helps and sorry for the confusion.
