I'm writing a CUDA program that makes a huge number of calls to the sincos() function in double precision. I suspect this is one of the biggest bottlenecks in the code, and I cannot reduce the number of calls.
Is there a decent approximation to sincos in CUDA, or in a library I can import? Accuracy also matters a lot to me, so the better the approximation, the happier my code will be.
I've also thought about building a lookup table or approximating the values with their Taylor series, but I'd like some opinions before going down that road.
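To make the Taylor-series idea concrete, here is a minimal sketch of roughly what I have in mind. The name taylor_sincos and the HOST_DEVICE macro are hypothetical helpers of mine, not part of CUDA. It assumes moderate |x|: the range reduction uses a single pi/2 constant, so accuracy degrades as |x| grows, and I have not benchmarked it against the built-in sincos().

```cpp
#include <math.h>

// Lets the same function compile as plain C++ or as a CUDA device function.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical helper (not a CUDA API): sine and cosine of x from
// truncated Taylor series, after reducing x to |r| <= pi/4.
HOST_DEVICE inline void taylor_sincos(double x, double* s, double* c)
{
    const double half_pi = 1.57079632679489661923;

    // Range reduction: x = k * (pi/2) + r. Naive, loses accuracy for large |x|.
    double k = rint(x / half_pi);
    double r = x - k * half_pi;
    int q = (int)((long long)k & 3);  // quadrant index, 0..3

    double r2 = r * r;

    // sin(r) up to r^15, evaluated with Horner's scheme.
    double sp = r * (1.0 + r2 * (-1.0 / 6.0
              + r2 * (1.0 / 120.0
              + r2 * (-1.0 / 5040.0
              + r2 * (1.0 / 362880.0
              + r2 * (-1.0 / 39916800.0
              + r2 * (1.0 / 6227020800.0
              + r2 * (-1.0 / 1307674368000.0))))))));

    // cos(r) up to r^16, evaluated with Horner's scheme.
    double cp = 1.0 + r2 * (-1.0 / 2.0
              + r2 * (1.0 / 24.0
              + r2 * (-1.0 / 720.0
              + r2 * (1.0 / 40320.0
              + r2 * (-1.0 / 3628800.0
              + r2 * (1.0 / 479001600.0
              + r2 * (-1.0 / 87178291200.0
              + r2 * (1.0 / 20922789888000.0))))))));

    // Undo the quadrant shift: sin(r + q*pi/2), cos(r + q*pi/2).
    switch (q) {
        case 0:  *s =  sp; *c =  cp; break;
        case 1:  *s =  cp; *c = -sp; break;
        case 2:  *s = -sp; *c = -cp; break;
        default: *s = -cp; *c =  sp; break;
    }
}
```

In a kernel I would call it the same way as sincos(), e.g. taylor_sincos(x[i], &s[i], &c[i]). My worry is whether something like this can actually beat the built-in function at comparable accuracy.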