I need the compute the element wise multiplication of two vectors (Hadamard product) of complex numbers with NVidia CUBLAS. Unfortunately, there is no HAD operation in CUBLAS. Apparently, you can do this with the SBMV operation, but it is not implemented for complex numbers in CUBLAS. I cannot believe there is no way to achieve this with CUBLAS. Is there any other way to achieve that with CUBLAS, for complex numbers ?
I cannot write my own kernel, I have to use CUBLAS (or another standard NVIDIA library if it is really not possible with CUBLAS).
CUBLAS is based on the reference BLAS, and the reference BLAS has never contained a Hadamard product (complex or real). Hence CUBLAS doesn't have one either. Intel have added
v?Mul
to MKL for doing this, but it is non-standard and not in most BLAS implementations. It is the kind of operation that an old school fortran programmer would just write a loop for, so I presume it really didn't warrant a dedicated routine in BLAS.There is no "standard" CUDA library I am aware of which implements a Hadamard product. There would be the possibility of using CUBLAS GEMM or SYMM to do this and extracting the diagonal of the resulting matrix, but that would be horribly inefficient, both from a computation and storage stand point.
The Thrust template library can do this trivially using
thrust::transform
, for example:would iterate over each pair of inputs from the device pointers x and y and calculate z[i] = x[i] * y[i] (there is probably a couple of casts you need to make to compile that, but you get the idea). But that effectively requires compilation of CUDA code within your project, and apparently you don't want that.