Getting wrong results when calculating on GPU (python3.5

Posted 2019-09-20 23:26

I want to compute the sums of different parts of an array. When I ran my code, the printed output revealed two problems.

Problem 1:

It is described in detail here. It has been solved, and it may not have been a real problem.

Problem 2:

In my code, I assigned different values to sbuf[0,2], sbuf[1,2], sbuf[2,2] than to sbuf[0,3], sbuf[1,3], sbuf[2,3].

But after cuda.syncthreads(), the values had become identical: sbuf[0,2] equals sbuf[0,3], sbuf[1,2] equals sbuf[1,3], and sbuf[2,2] equals sbuf[2,3].

This directly makes the values of Xi_s, Xi1_s and Yi_s wrong.

These are my guesses, based on what was printed inside the kernel.

@talonmies said that relying on print statements inside kernels like this is dangerous.

So I would like to know whether there is a more useful way to debug my code than printing from inside the kernel.

    ...

@cuda.jit
def calcu_T(D, T):
  ...

                    if bx==1 and tx==1:
                        print('5,c_x,c_y,L,c_index,bx,tx,ty,sbuf[0,ty],sbuf[1,ty],sbuf[2,ty],',c_x,',',c_y,',',L,',',c_index,',',bx,',',tx,',',ty,',',sbuf[0,ty],',',sbuf[1,ty],',',sbuf[2,ty])

                    cuda.syncthreads()

                    if bx==1 and tx==1:
                        print('1,c_x,c_y,L,c_index,bx,tx,ty,sbuf[0,ty],sbuf[1,ty],sbuf[2,ty],',c_x,',',c_y,',',L,',',c_index,',',bx,',',tx,',',ty,',',sbuf[0,ty],',',sbuf[1,ty],',',sbuf[2,ty])

                     ...

1 Answer
太酷不给撩
#2 · 2019-09-20 23:53

As @talonmies said, printing from inside kernels is not a good way to debug. If anyone runs into the same problem, this documentation is helpful. Beyond that, it is worth learning pdb, especially debugger commands such as 'p' and 'c'.
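To make the 'p' and 'c' commands concrete, here is a minimal sketch of a pdb session. It is not the kernel from the question: `partial_sum` and `run_scripted_session` are hypothetical names, and the function is a plain-Python stand-in for a kernel body so that the example runs without a GPU. The commands are fed from a string instead of being typed, only to keep the example non-interactive. The same commands should apply when a Numba kernel runs under the CUDA simulator (enabled with the `NUMBA_ENABLE_CUDASIM=1` environment variable), since kernels then execute as ordinary Python.

```python
import io
import pdb


def partial_sum(values, start, stop):
    """Hypothetical stand-in for a kernel body: sum values[start:stop]."""
    total = 0
    for i in range(start, stop):
        total += values[i]
    return total


def run_scripted_session(commands):
    """Run partial_sum under pdb, reading debugger commands from a string."""
    out = io.StringIO()
    # Passing stdin/stdout makes pdb read commands from the string and
    # write its output to the buffer instead of using the terminal.
    debugger = pdb.Pdb(stdin=io.StringIO(commands), stdout=out, readrc=False)
    # runcall stops at the first line of partial_sum and waits for commands.
    result = debugger.runcall(partial_sum, [1, 2, 3, 4], 1, 3)
    return result, out.getvalue()


# 'p <expr>' prints the value of an expression; 'c' continues execution.
result, transcript = run_scripted_session("p values\np start, stop\nc\n")
print(transcript)
print("result:", result)
```

In an interactive session you would type the same commands at the `(Pdb)` prompt, typically after placing `import pdb; pdb.set_trace()` at the line you want to inspect.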
