What is the purpose of the following PyTorch function (doc):
torch.addmm(beta=1, mat, alpha=1, mat1, mat2, out=None)
More specifically, is there any reason to prefer this function instead of just using
beta * mat + alpha * (mat1 @ mat2)
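So both ways seem to compute the same thing, as this quick check shows (shapes and scalars chosen arbitrarily):

```python
import torch

mat = torch.randn(3, 5)   # matrix added to the product
mat1 = torch.randn(3, 4)  # left factor of the matrix product
mat2 = torch.randn(4, 5)  # right factor of the matrix product
beta, alpha = 0.5, 2.0

fused = torch.addmm(mat, mat1, mat2, beta=beta, alpha=alpha)
manual = beta * mat + alpha * (mat1 @ mat2)

print(torch.allclose(fused, manual))  # True (up to floating-point tolerance)
```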
The `addmm` function is an optimized version of the equation `beta*mat + alpha*(mat1 @ mat2)`. I ran some tests and timed their execution.
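For reference, here is a minimal sketch of the kind of timing comparison I ran (the sizes, the scalars, and the use of `torch.utils.benchmark` are my illustrative choices, not the only way to measure this):

```python
import torch
import torch.utils.benchmark as benchmark

# Illustrative size: ~10^6 total elements per matrix.
n = 1000
mat = torch.randn(n, n)
mat1 = torch.randn(n, n)
mat2 = torch.randn(n, n)
beta, alpha = 0.5, 2.0  # non-trivial scalars

common = {"torch": torch, "mat": mat, "mat1": mat1, "mat2": mat2,
          "beta": beta, "alpha": alpha}

# Fused call: one operation computing beta*mat + alpha*(mat1 @ mat2)
t_fused = benchmark.Timer(
    stmt="torch.addmm(mat, mat1, mat2, beta=beta, alpha=alpha)",
    globals=common,
)
# Manual equivalent built from separate scaling, matmul, and addition
t_manual = benchmark.Timer(
    stmt="beta * mat + alpha * (mat1 @ mat2)",
    globals=common,
)

print(t_fused.timeit(100))
print(t_manual.timeit(100))
```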
If `beta=1, alpha=1`, then both statements (`addmm` and the manual expression) execute in approximately the same time (`addmm` is just a little faster), regardless of matrix size.
If `beta` and `alpha` are not 1, then `addmm` is about twice as fast as the manual expression for smaller matrices (total elements on the order of 10^5). But if the matrices are large (on the order of 10^6 elements), the speedup is negligible (39 ms vs. 41 ms).
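My understanding of why (an assumption on my part, not something I verified in the PyTorch source): `addmm` dispatches to a single BLAS `gemm` call, which computes `beta*C + alpha*(A @ B)` in one fused kernel, whereas the manual expression allocates intermediate tensors for the matmul, the two scalings, and the sum. For large matrices the matrix multiplication itself dominates the runtime, so the elementwise work saved by fusion becomes negligible.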