Binary Image Classification with CNN - best practi

Closed. This question needs to be more focused. It is not currently accepting answers.

Closed 2 years ago.

Say, I want to train a CNN to detect whether an image is a car or not.

What are some best practices or methods for choosing the "Not-Car" dataset?

Because this dataset could potentially be infinite (basically anything that is not a car), is there a guideline on how big it needs to be? Should it contain objects that are very similar to cars but are not cars (planes, boats, etc.)?

Tags: machine-learning neural-network computer-vision deep-learning conv-neural-network

2 Answers

兄弟一词,经得起流年.

#2 · 2020-03-26 06:42

I think the negative samples should be chosen according to where your model will be used. If the model runs as a car detector on the street, reasonable negative samples are road backgrounds, trees, pedestrians, and other vehicles that are common on the street. So I don't think there is a universal rule for selecting negative samples; it depends on your needs.

再贱就再见

#3 · 2020-03-26 07:00

As in all of supervised machine learning, the training set should reflect the real distribution that the model is going to work with. A neural network is basically a function approximator. Your actual goal is to approximate the real-world distribution, but in practice you can only get a sample from it, and this sample is the only thing the neural network will see. For any input far outside the training manifold, the output will be just a guess (see also this discussion on AI.SE).

So when choosing a negative dataset, the first question you should answer is: what is the likely use case of this model? E.g., if you're building an app for a smartphone, then the negative samples should probably include street views, pictures of buildings and stores, people, indoor environments, etc. It's unlikely that an image from a smartphone camera will show a wild animal or an abstract painting, i.e., that is an improbable input in your real distribution.
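One way to act on this is to write down the expected deployment mix and draw negatives in those proportions. The following is only a rough sketch: the category names, the percentage weights, and the file names are made-up assumptions for illustration, not measured values.

```python
import random

def sample_negatives(pools, weights, n_total, seed=0):
    """Draw about n_total negative image paths, mixing categories by weight.

    pools:   dict of category name -> list of image paths (hypothetical)
    weights: dict of category name -> expected share in deployment
             (any non-negative numbers; they are normalised here)
    """
    rng = random.Random(seed)
    total_weight = sum(weights.values())
    chosen = []
    for category, paths in pools.items():
        share = weights.get(category, 0) / total_weight
        # Never ask for more images than the pool actually holds.
        k = min(len(paths), round(n_total * share))
        chosen.extend(rng.sample(paths, k))
    rng.shuffle(chosen)
    return chosen

# Hypothetical pools of negative images, grouped by what they show.
pools = {
    name: [f"{name}_{i}.jpg" for i in range(500)]
    for name in ["street", "buildings", "people", "indoor", "wildlife"]
}

# Assumed mix for a smartphone app: wildlife is an improbable input,
# so it gets no share of the negative set.
weights = {"street": 40, "buildings": 25, "people": 20, "indoor": 15, "wildlife": 0}

negatives = sample_negatives(pools, weights, n_total=200)
```

If the real usage turns out to differ from the assumed mix, the weights are the only thing that needs to change.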

Including images that look like the positive class (trucks, airplanes, boats, etc.) is a good idea, because the low-level convolutional features (edges, corners) will be very similar, and it's important that the neural network learns the distinguishing high-level features correctly.
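A simple check of whether look-alike negatives are handled well is to compare false-positive rates per negative category on a held-out set. This diagnostic is my own addition rather than part of the advice above; the categories, scores, and the 0.5 threshold are illustrative assumptions.

```python
from collections import defaultdict

def false_positive_rate_by_category(categories, car_scores, threshold=0.5):
    """For negative images only: fraction of each category predicted as 'car'.

    categories: category label of each negative image (e.g. "truck")
    car_scores: the model's car probability for the same images
    """
    flagged = defaultdict(int)
    totals = defaultdict(int)
    for category, score in zip(categories, car_scores):
        totals[category] += 1
        if score >= threshold:
            flagged[category] += 1
    return {c: flagged[c] / totals[c] for c in totals}

# Made-up scores for eight held-out negative images.
categories = ["truck", "truck", "truck", "truck", "tree", "tree", "tree", "tree"]
car_scores = [0.9, 0.7, 0.4, 0.2, 0.1, 0.05, 0.6, 0.2]

rates = false_positive_rate_by_category(categories, car_scores)
# rates == {"truck": 0.5, "tree": 0.25}
```

A much higher rate on look-alike categories than on unrelated ones would suggest adding more of those look-alikes to the negative set.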

In general, I'd use 5-10x more negative images than positive ones. CIFAR-10 is a good starting point: out of 50000 training images, 5000 are cars, 5000 are planes, etc. In fact, building a 10-class classifier is not a bad idea. In that case, you turn the CNN into a binary classifier by thresholding its certainty that the inferred class is a car. Anything the CNN isn't certain about is interpreted as not a car.
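A minimal sketch of that thresholding step, assuming the network outputs one row of 10 logits per image in the standard CIFAR-10 label order (where index 1 is "automobile"). The 0.8 threshold and the example logits are arbitrary choices for illustration; in practice the threshold would be tuned on a validation set.

```python
import numpy as np

CAR_INDEX = 1  # "automobile" in the standard CIFAR-10 label order

def softmax(logits):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def is_car(logits, threshold=0.8):
    """True only where 'car' is the top class AND its probability >= threshold."""
    probs = softmax(np.asarray(logits, dtype=float))
    top_is_car = probs.argmax(axis=1) == CAR_INDEX
    confident = probs[:, CAR_INDEX] >= threshold
    return top_is_car & confident

# Made-up logits for three images (10 classes each).
logits = np.array([
    [0.0, 6.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # confidently a car
    [0.0, 1.0, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # car on top, but unsure
    [5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # confidently an airplane
])

print(is_car(logits))  # [ True False False]
```

The second image shows the point of the threshold: "car" is the most likely class, but with low certainty, so it is treated as not a car.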
