I'm trying to use the Python dlib.train_shape_predictor
function to train on a very large set of images (~50,000).
I've created an XML file containing the necessary data, but it seems like train_shape_predictor loads all the referenced images into RAM before it starts training. This leads to the process getting terminated because it uses over 100 GB of RAM. Even a trimmed-down data set uses over 20 GB (the machine only has 16 GB of physical memory).
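For reference, this is roughly how I'm calling it (a minimal sketch; the file names and option values are placeholders, not my exact setup):

```python
import dlib

# Placeholder options; my real settings differ but the call shape is the same.
options = dlib.shape_predictor_training_options()
options.be_verbose = True
options.num_threads = 4

# training_data.xml references all ~50,000 images on disk.
dlib.train_shape_predictor("training_data.xml", "predictor.dat", options)
```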
Is there some way to get train_shape_predictor to load images on demand, instead of all at once?
I'm using Python 3.7.2 and dlib 19.16.0 installed via pip on macOS.