batch size does not work for caffe with deploy.pro

Posted 2019-05-16 21:48

I'm trying to make my classification process a bit faster. I thought of increasing the first input_dim in my deploy.prototxt but that does not seem to work. It's even a little bit slower than classifying each image one by one.

deploy.prototxt

input: "data"  
input_dim: 128  
input_dim: 1  
input_dim: 120  
input_dim: 160  
... net description ...

python net initialization

net=caffe.Net( 'deploy.prototxt', 'model.caffemodel', caffe.TEST)
net.blobs['data'].reshape(128, 1, 120, 160)
transformer = caffe.io.Transformer({'data':net.blobs['data'].data.shape})
#transformer settings

python classification

images=[None]*128
for i in range(len(images)):
  images[i]=caffe.io.load_image('image_path', False)
for j in range(len(images)):
  net.blobs['data'].data[j,:,:,:] = transformer.preprocess('data',images[j])
out = net.forward()['prob']

I skipped some details, but the important parts should be there. I tried different batch sizes (32, 64, ..., 1024), but the timings are all nearly the same. Does anyone have an idea what I'm doing wrong or what needs to be changed? Thanks for any help!

EDIT:
Some timing results. The avg-times are just the total-times divided by the number of processed images (1044).

Batch size: 1

2016-05-04 10:51:20,721 - detector - INFO - data shape: (1, 1, 120, 160)
2016-05-04 10:51:35,149 - main - INFO - GPU timings:
2016-05-04 10:51:35,149 - main - INFO - processed images: 1044
2016-05-04 10:51:35,149 - main - INFO - total-time: 14.43s
2016-05-04 10:51:35,149 - main - INFO - avg-time: 13.82ms
2016-05-04 10:51:35,149 - main - INFO - load-time: 8.31s
2016-05-04 10:51:35,149 - main - INFO - avg-load-time: 7.96ms
2016-05-04 10:51:35,149 - main - INFO - classify-time: 5.99s
2016-05-04 10:51:35,149 - main - INFO - avg-classify-time: 5.74ms

Batch size: 32

2016-05-04 10:52:30,773 - detector - INFO - data shape: (32, 1, 120, 160)
2016-05-04 10:52:45,135 - main - INFO - GPU timings:
2016-05-04 10:52:45,135 - main - INFO - processed images: 1044
2016-05-04 10:52:45,135 - main - INFO - total-time: 14.36s
2016-05-04 10:52:45,136 - main - INFO - avg-time: 13.76ms
2016-05-04 10:52:45,136 - main - INFO - load-time: 7.13s
2016-05-04 10:52:45,136 - main - INFO - avg-load-time: 6.83ms
2016-05-04 10:52:45,136 - main - INFO - classify-time: 7.13s
2016-05-04 10:52:45,136 - main - INFO - avg-classify-time: 6.83ms

Batch size: 128

2016-05-04 10:53:17,478 - detector - INFO - data shape: (128, 1, 120, 160)
2016-05-04 10:53:31,299 - main - INFO - GPU timings:
2016-05-04 10:53:31,299 - main - INFO - processed images: 1044
2016-05-04 10:53:31,299 - main - INFO - total-time: 13.82s
2016-05-04 10:53:31,299 - main - INFO - avg-time: 13.24ms
2016-05-04 10:53:31,299 - main - INFO - load-time: 7.06s
2016-05-04 10:53:31,299 - main - INFO - avg-load-time: 6.77ms
2016-05-04 10:53:31,299 - main - INFO - classify-time: 6.66s
2016-05-04 10:53:31,299 - main - INFO - avg-classify-time: 6.38ms

Batch size: 1024

2016-05-04 10:54:11,546 - detector - INFO - data shape: (1024, 1, 120, 160)
2016-05-04 10:54:25,316 - main - INFO - GPU timings:
2016-05-04 10:54:25,316 - main - INFO - processed images: 1044
2016-05-04 10:54:25,316 - main - INFO - total-time: 13.77s
2016-05-04 10:54:25,316 - main - INFO - avg-time: 13.19ms
2016-05-04 10:54:25,316 - main - INFO - load-time: 7.04s
2016-05-04 10:54:25,316 - main - INFO - avg-load-time: 6.75ms
2016-05-04 10:54:25,316 - main - INFO - classify-time: 6.63s
2016-05-04 10:54:25,316 - main - INFO - avg-classify-time: 6.35ms
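
As a quick check on the logs above, the per-image averages can be recomputed from the printed totals. This is only a sketch: the dictionary layout and the helper name `per_image_ms` are illustrative, not part of the original script, and the totals are copied by hand from the log lines. Because the logged totals are themselves rounded, the recomputed values can differ from the logged averages by about 0.01 ms.

```python
# Totals copied from the log lines above: (total, load, classify) in seconds.
PROCESSED_IMAGES = 1044
totals_s = {
    1: (14.43, 8.31, 5.99),
    32: (14.36, 7.13, 7.13),
    128: (13.82, 7.06, 6.66),
    1024: (13.77, 7.04, 6.63),
}


def per_image_ms(total_seconds, n_images=PROCESSED_IMAGES):
    """Average time per image in milliseconds, rounded to two decimals."""
    return round(total_seconds / n_images * 1000, 2)


for batch_size, (total, load, classify) in totals_s.items():
    load_share = load / total  # fraction of the total spent loading images
    print(
        f"batch {batch_size:>4}: "
        f"avg {per_image_ms(total)} ms, "
        f"load {per_image_ms(load)} ms, "
        f"classify {per_image_ms(classify)} ms, "
        f"load share {load_share:.0%}"
    )
```

In these numbers, image loading accounts for roughly half of the total time at every batch size, so changing the batch size can only affect the remaining classify portion.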

1 answer
不美不萌又怎样
#2 · 2019-05-16 22:35

I'm pretty sure the problem is in these lines:

for j in range(len(images)):
  net.blobs['data'].data[j,:,:,:] = transformer.preprocess('data',images[j])
out = net.forward()['prob']

Doing this will simply set the image data from the last iteration of the for loop as the network's only input. Try stacking the N images beforehand (into, say, stackedimages) and assigning them to the blob only once, e.g.:

stackedimages = []
for j in range(len(images)):
  stackedimages.append(transformer.preprocess('data',images[j]))

Then call,

net.blobs['data'].data[...] =   stackedimages
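
A minimal NumPy-only sketch of the stacking step, so it runs without Caffe installed. It assumes that `transformer.preprocess` returns one C x H x W array per image and that `net.blobs['data'].data` is an N x C x H x W NumPy array. The names `preprocess_stub` and `blob_data` are hypothetical stand-ins for those two objects, and the shapes are taken from the deploy.prototxt in the question.

```python
import numpy as np

# Shapes taken from the question's deploy.prototxt: N x 1 x 120 x 160.
BATCH, CHANNELS, HEIGHT, WIDTH = 128, 1, 120, 160


def preprocess_stub(image):
    """Hypothetical stand-in for transformer.preprocess('data', image).

    It only moves the channel axis first (H x W x C -> C x H x W); the real
    transformer may also scale, subtract a mean, or swap channels.
    """
    return image.transpose(2, 0, 1).astype(np.float32)


# Stand-ins for grayscale images loaded as H x W x 1 arrays.
# Image i is filled with the value i so each batch slot is identifiable.
images = [np.full((HEIGHT, WIDTH, 1), i, dtype=np.float32) for i in range(BATCH)]

# Stack the preprocessed images along a new leading batch axis.
stackedimages = np.stack([preprocess_stub(img) for img in images])

# Stand-in for net.blobs['data'].data after reshape(128, 1, 120, 160).
blob_data = np.zeros((BATCH, CHANNELS, HEIGHT, WIDTH), dtype=np.float32)
blob_data[...] = stackedimages  # one assignment fills the whole batch

print(stackedimages.shape)
```

With a real network, the last assignment would target `net.blobs['data'].data[...]` and be followed by a single `net.forward()` call for the whole batch.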