I have access to a Tesla K20c, and I am running ResNet50 on the CIFAR10 dataset...
Then I get the following error:
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/generated/../generic/THCTensorMathPointwise.cu line=265 error=59 : device-side assert triggered
Traceback (most recent call last):
  File "main.py", line 109, in <module>
    train(loader_train, model, criterion, optimizer)
  File "main.py", line 54, in train
    optimizer.step()
  File "/usr/local/anaconda35/lib/python3.6/site-packages/torch/optim/sgd.py", line 93, in step
    d_p.add_(weight_decay, p.data)
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/generated/../generic/THCTensorMathPointwise.cu:265
How can I resolve this error?
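In general, when you encounter a CUDA runtime error, it is advisable to run your program again with the CUDA_LAUNCH_BLOCKING=1 environment variable set, so that you get an accurate stack trace. CUDA kernels are launched asynchronously, so without it the traceback can point at a later, unrelated call (here, optimizer.step()).
In your specific case, the targets in your data were too high (or too low) for the specified number of classes: with N classes, every label has to lie in the range 0 to N-1.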
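As a rough illustration of both points, the sketch below sets the variable from inside the script and checks a batch of labels on the CPU before it reaches the loss. The helper name check_targets and the sample labels are made up for this example and are not part of the asker's main.py. Setting the variable on the command line (CUDA_LAUNCH_BLOCKING=1 python main.py) should work just as well.

```python
import os

# Set this before CUDA is initialised (safest: before importing torch), so
# kernel launches run synchronously and the traceback points at the call
# that actually failed.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def check_targets(targets, num_classes):
    """Raise ValueError if any label falls outside [0, num_classes - 1].

    Hypothetical helper: `targets` is any iterable of integer class labels.
    """
    bad = sorted({t for t in targets if not 0 <= t < num_classes})
    if bad:
        raise ValueError(
            f"labels {bad} are outside the valid range [0, {num_classes - 1}]"
        )


# CIFAR10 has 10 classes, so valid labels are 0..9.
check_targets([0, 3, 9], num_classes=10)
```

Running such a check on the CPU gives a readable Python exception instead of an opaque device-side assert.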
I have encountered this problem several times, and each time it turned out to be an indexing issue. For example, if your ground-truth labels start at 1, as in target = [1,2,3,4,5], then you should subtract 1 from every label so that they become [0,1,2,3,4]. This has solved the problem for me every time.
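A minimal sketch of that shift, using plain Python lists so it runs without a GPU. The function name shift_labels is hypothetical. If the labels are held in a PyTorch tensor, the equivalent would be target - 1.

```python
def shift_labels(targets, offset=1):
    """Convert labels that start at `offset` into 0-based labels.

    Hypothetical helper: assumes the smallest possible label in the
    dataset equals `offset` (1 in the example above).
    """
    shifted = [t - offset for t in targets]
    if min(shifted) < 0:
        raise ValueError("a label is smaller than the offset")
    return shifted


print(shift_labels([1, 2, 3, 4, 5]))  # prints [0, 1, 2, 3, 4]
```

A fixed offset is used rather than subtracting the smallest label in each batch, since a given batch may not contain the lowest class.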