Consider this code (wrapped inside a function):
    $manager = $this->manager; // local ref
    $q = $manager->createQuery('select c from VendorFeedBundle:Category c');
    $iterableResult = $q->iterate();
    $i = 0;
    $batchSize = 500;
    foreach ($iterableResult as $row) {
        $category = $row[0];

        // derive the parent's path by dropping the last path segment
        $struct = explode(' ' . $this->separator . ' ', $category->getPath());
        unset($struct[count($struct) - 1]);
        $path = implode(' ' . $this->separator . ' ', $struct);

        if (!$parent = $this->repo->findOneBy(['path' => $path])) {
            continue;
        }
        $category->setParent($parent);

        // flush every x entities
        if (($i % $batchSize) == 0) {
            echo 'Flushing batch...' . "\n";
            echo 'Memory: ' . $this->getReadableSize(memory_get_usage()) . "\n";
            $manager->flush();
            $manager->clear();
            echo 'After batch...' . "\n";
            echo 'Memory: ' . $this->getReadableSize(memory_get_usage()) . "\n";
        }
        ++$i;
    }
    // flush the remaining
    $manager->flush();
    $manager->clear();
It logs the following in my terminal:
    Creating tree...
    Memory: 14.91 MB
    Flushing batch...
    Memory: 18.46 MB
    After batch...
    Memory: 18.79 MB
    Flushing batch...
    Memory: 21.01 MB
    After batch...
    Memory: 23.29 MB
    Flushing batch...
    Memory: 25.36 MB
    After batch...
    Memory: 27.87 MB
    Flushing batch...
    Memory: 29.88 MB
    .... etc
The getReadableSize method only formats a byte count; it does not leak any variables to the global scope or anything like that. I've read and followed the Doctrine 2 advice on bulk inserts/updates (batch processing): http://docs.doctrine-project.org/en/latest/reference/batch-processing.html
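The helper itself isn't included in the snippet above. As a rough idea of what such a formatter does, here is a minimal sketch; it is a hypothetical stand-in written as a standalone function, not my exact method, and the unit labels and two-decimal precision are assumptions chosen to match the output shown:

```php
<?php
// Hypothetical stand-in for the getReadableSize() helper used in the loop.
// It only converts a byte count into a human-readable string and keeps no state.
function getReadableSize(int $bytes, int $precision = 2): string
{
    $units = ['B', 'KB', 'MB', 'GB', 'TB'];
    $value = (float) max($bytes, 0);
    $unit = 0;

    // Divide by 1024 until the value fits the current unit
    while ($value >= 1024 && $unit < count($units) - 1) {
        $value /= 1024;
        ++$unit;
    }

    return number_format($value, $precision, '.', '') . ' ' . $units[$unit];
}

echo 'Memory: ' . getReadableSize(memory_get_usage()) . "\n";
```

Since it works purely on local variables, a helper of this shape should not account for the growth between batches.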
What am I doing wrong? Memory keeps growing by roughly 3-4 MB per batch of 500 items, even though I flush and clear the entity manager each time, which looks like a (small) leak to me.
Sidenote: I need to update the items this way because my system is split into two steps: first I insert the categories, and then I update the parent relationships.
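To make the second step concrete: the parent is looked up by taking the category's path and dropping its last segment. A standalone sketch of that derivation, assuming a separator of '>' (the real value of $this->separator isn't shown here) and a hypothetical function name parentPath:

```php
<?php
// Sketch of the parent-path derivation from the loop above.
// parentPath() is a hypothetical name; '>' is an assumed separator value.
function parentPath(string $path, string $separator): string
{
    $glue = ' ' . $separator . ' ';
    $segments = explode($glue, $path);

    // Drop the last segment (the category's own name)
    array_pop($segments);

    // A root category has no parent, so this yields an empty string
    return implode($glue, $segments);
}

echo parentPath('Electronics > Phones > Accessories', '>') . "\n";
```

For a root category the result is an empty string, so findOneBy finds no parent and the loop skips that row via continue.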
My category class is a basic Doctrine2 entity with a few Gedmo extensions added to it (tree, translatable, timestampable). See: http://pastie.org/private/oiiyf54zjuouhiqjsjislg
My complete script (which is iterating and updating categories): http://pastie.org/private/k5x240vr4taepczhqa4tva