For a university project, I want to develop a neural network that recognizes the text on license plates. To that end, I found the ReId dataset at https://medusa.fit.vutbr.cz/traffic/research-topics/general-traffic-analysis/holistic-recognition-of-low-quality-license-plates-by-cnn-using-track-annotated-data-iwt4s-avss-2017/. It contains a large number of license plate images together with the text of each plate, and it was used by Spanhel et al. for an approach similar to the one I have in mind.
Example of a license plate there:
In this project I only want to recognize the plate number itself, i.e. "9B5 2145", and not the country code "CZ" or any advertisement text.
I downloaded the dataset and the labels CSV file to my local disk. My folder structure is as follows: one parent directory for the whole project, which contains my data directory, where I stored the ReId dataset. The dataset consists of several subdirectories, 4 with training data and 4 with test data, and each of these subdirectories contains a number of license plate images. The ReId dataset also includes the trainVal CSV file, which is structured as follows (snippet of the actual sheet):
track_id corresponds to the subdirectory within the ReId dataset. image_path is the path to the image; in this case the image's name is 1_1. lp is the label, i.e. the actual license plate text. train is a dummy variable that equals 1 if the image is used for training and 0 if it is used for validation.
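To make the column layout concrete, here is a minimal sketch of how such a file could be read and split on the train column using only the standard library. The sample rows, the plate "1A23456", and the "1/1_1.png" path format (including the file extension) are assumptions for illustration; only the four column names come from the sheet described above.

```python
import csv
import io

# Hypothetical sample in the layout described above; the real rows come from
# the trainVal CSV file that ships with the dataset.
SAMPLE_CSV = """track_id,image_path,lp,train
1,1/1_1.png,9B52145,1
1,1/1_2.png,9B52145,1
2,2/2_1.png,1A23456,0
"""

def read_label_rows(csv_file):
    """Return (train_rows, val_rows) as lists of dicts, split on the 'train' column."""
    train_rows, val_rows = [], []
    for row in csv.DictReader(csv_file):
        # csv reads every field as text, so compare against the string "1".
        if row["train"].strip() == "1":
            train_rows.append(row)
        else:
            val_rows.append(row)
    return train_rows, val_rows

train_rows, val_rows = read_label_rows(io.StringIO(SAMPLE_CSV))
print(len(train_rows), "training rows,", len(val_rows), "validation rows")
```

For the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an opened file handle, e.g. `open(path_to_csv, newline="")`, where `path_to_csv` is wherever the trainVal file was saved.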
Regarding this dataset, I have three main questions:
How do I read in these images properly? I tried something like this:
    from keras.preprocessing.image import ImageDataGenerator

    # create generator
    datagen = ImageDataGenerator()

    # prepare an iterator for each dataset
    train_it = datagen.flow_from_directory('data/train/', class_mode='binary')
    val_it = datagen.flow_from_directory('data/validation/', class_mode='binary')
    test_it = datagen.flow_from_directory('data/test/', class_mode='binary')

    # confirm the iterator works
    batchX, batchy = train_it.next()
    print('Batch shape=%s, min=%.3f, max=%.3f' % (batchX.shape, batchX.min(), batchX.max()))
But the generator did not find any images belonging to any classes (side note: I used the correct paths). That makes sense to me, because I have not assigned any classes to my data yet. So my first question is: do I have to do that? I don't think so.
How do I then read these images properly? I think I need NumPy arrays to work with this data.
How do I bring my images and the labels together? I think I have to merge the two datasets, don't I?
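To illustrate what I mean by bringing the two together, here is a minimal sketch that turns CSV rows into (image file, plate text) pairs. It assumes that image_path is relative to the dataset's root directory; the helper name `pair_images_with_labels`, the root "data/ReId", and the sample rows are hypothetical.

```python
from pathlib import Path

def pair_images_with_labels(rows, data_root):
    """Map each CSV row to an (image file path, plate text) tuple.

    Assumes row["image_path"] is relative to data_root (an assumption,
    not something the dataset documentation confirms here).
    """
    root = Path(data_root)
    return [(root / row["image_path"], row["lp"]) for row in rows]

# Hypothetical rows in the column layout of the trainVal CSV file.
rows = [
    {"track_id": "1", "image_path": "1/1_1.png", "lp": "9B52145", "train": "1"},
    {"track_id": "2", "image_path": "2/2_1.png", "lp": "1A23456", "train": "0"},
]

pairs = pair_images_with_labels(rows, "data/ReId")
for path, plate in pairs:
    print(path.as_posix(), "->", plate)
```

If this reading is right, no separate merge step would be needed, because every CSV row already links one image to its label; each path could then be loaded into an array and paired with its plate text.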
Thank you very much!