OpenAI's REINFORCE and actor-critic examples for reinforcement learning contain the following code:
policy_loss = torch.cat(policy_loss).sum()
loss = torch.stack(policy_losses).sum() + torch.stack(value_losses).sum()
One uses torch.cat, the other uses torch.stack.
As far as I can tell, the documentation doesn't clearly distinguish between them.
I would be happy to know the differences between the functions.
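torch.stack joins a sequence of tensors along a new dimension.
torch.cat joins a sequence of tensors along an existing dimension.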
So if A and B are of shape (3, 4), torch.cat([A, B], dim=0) will be of shape (6, 4) and torch.stack([A, B], dim=0) will be of shape (2, 3, 4).
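
To check the shapes yourself, here is a minimal sketch. The tensor contents (zeros and ones) and the variable names are arbitrary choices for illustration; only the shapes matter. The last part shows one way to think about stack: it should behave like giving each tensor a new leading dimension with unsqueeze and then calling cat.

```python
import torch

# Two tensors with the same shape (3, 4); the values are arbitrary.
A = torch.zeros(3, 4)
B = torch.ones(3, 4)

# cat joins along an existing dimension: dim 0 grows from 3 to 6.
cat_result = torch.cat([A, B], dim=0)
print(cat_result.shape)    # torch.Size([6, 4])

# stack inserts a new dimension whose size is the number of tensors.
stack_result = torch.stack([A, B], dim=0)
print(stack_result.shape)  # torch.Size([2, 3, 4])

# stack is equivalent to unsqueezing each tensor and then using cat.
stack_via_cat = torch.cat([A.unsqueeze(0), B.unsqueeze(0)], dim=0)
print(torch.equal(stack_result, stack_via_cat))  # True
```

One practical consequence: cat needs the tensors to already have the dimension you are joining along, while stack requires all tensors to have exactly the same shape.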