Mallet converts training cases to a binary format using the import-file command, e.g.
bin/mallet import-file --input cases.txt --output cases.mallet
How is this binary ".mallet" file then used? Is it streamed, or is the whole file loaded into memory? If it is loaded in full, then available memory places a limit on the number of training cases.
Is it possible to characterize the size of the .mallet file based on the size of the input cases file or the number of input cases?
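For the size question, one thing I could do is measure the relationship empirically. Below is a minimal sketch that compares an input file with the .mallet file produced from it. It assumes one training case per non-empty input line. The helper name size_stats is my own and is not part of Mallet, and the script only reads file sizes; it does not parse the binary format.

```python
import os


def size_stats(input_path, mallet_path):
    """Compare an import-file input with the .mallet file produced from it.

    Hypothetical helper, not part of Mallet. Assumes one training case
    per non-empty line of the input file.
    """
    input_bytes = os.path.getsize(input_path)
    mallet_bytes = os.path.getsize(mallet_path)

    # Count non-empty lines as training cases (an assumption about the input).
    with open(input_path, "rb") as f:
        n_cases = sum(1 for line in f if line.strip())

    return {
        "input_bytes": input_bytes,
        "mallet_bytes": mallet_bytes,
        "n_cases": n_cases,
        # Output size relative to input size; None if the input is empty.
        "size_ratio": mallet_bytes / input_bytes if input_bytes else None,
        # Average output bytes per training case; None if there are no cases.
        "bytes_per_case": mallet_bytes / n_cases if n_cases else None,
    }
```

Calling size_stats("cases.txt", "cases.mallet") on inputs of several different sizes would show whether the ratio stays roughly constant, but that would only be an observation for one data set and one set of import options, not a general rule.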