I've had this doubt for a while, but now it's time to ask about it. Look at the code below and suppose $someVar holds a huge number of items, for example 200:
// First approach
foreach ($someVar as $item) {
    $item = $em->getRepository('someEntity')->find($item['value']);
    $em->remove($item);
    $em->flush();
}

// Second approach
foreach ($someVar as $item) {
    $item = $em->getRepository('someEntity')->find($item['value']);
    $em->remove($item);
}
$em->flush();
- Will both calls do the same thing, i.e. delete the records from the DB?
- Performance-wise, which one is better to use? (Doctrine behaves as a memory killer sometimes.)
- If both approaches are fine, can I use the same pattern for UPDATE queries too?
- If any of the queries fails for some reason, how do I catch which one failed, and possibly the error given by Doctrine?
Testing with a real case
Given the good information provided in the answer, a few doubts remain in my mind. Take a look at this piece of code:
$itemsRemoved = [];
$itemsNonRemoved = [];

foreach ($request->request->get('items') as $item) {
    $items = $em->getRepository('AppBundle:FacturaProductoSolicitud')->find($item['value']);

    try {
        $em->remove($items);
        $em->flush();

        array_push($itemsRemoved, $item['value']);
        $response['itemsRemoved'] = $itemsRemoved;
        $response['success'] = true;
    } catch (\Exception $e) {
        dump($e->getMessage());

        array_push($itemsNonRemoved, $item['value']);
        $response['itemsNonRemoved'] = $itemsNonRemoved;
        $response['success'] = false;
    }
}
I'm using a try {} catch() {} block here because I need to know whether each $item['value'] was deleted or not, so I can add it to the proper array (see $itemsRemoved and $itemsNonRemoved). The flush() is also executed on every iteration, which I know is bad practice. But if I take the flush() out of the foreach loop and execute it inside a try-catch, is there any way to know which $item['value'] was deleted and which was not? How?
Actually, running flush() after every remove is an anti-pattern. Ideally, you should run it once after all your queries.
For the most part, Doctrine 2 already takes care of proper transaction demarcation for you: All the write operations (INSERT/UPDATE/DELETE) are queued until EntityManager#flush() is invoked which wraps all of these changes in a single transaction.
However, you can wrap your queries in transactions if you want more consistency. In fact, it is encouraged by Doctrine, as you can read in its best practices.
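For instance, a minimal sketch of explicit transaction demarcation around the second approach might look like this (assuming the same $em and $someVar as in the question):

// Explicit transaction demarcation: queue all removals, flush once,
// commit on success and roll back on failure.
$em->getConnection()->beginTransaction();
try {
    foreach ($someVar as $item) {
        $entity = $em->getRepository('someEntity')->find($item['value']);
        $em->remove($entity);
    }
    $em->flush();
    $em->getConnection()->commit();
} catch (\Exception $e) {
    $em->getConnection()->rollBack();
    throw $e;
}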
Will both calls do the same thing, i.e. delete the records from the DB?
Yes, though invoking remove on an entity does NOT cause an immediate SQL DELETE to be issued against the database. The entity will be deleted on the next invocation of EntityManager#flush() that involves that entity. This means that entities scheduled for removal can still be queried for and appear in query and collection results.
Therefore, flushing inside the loop means lots of SQL queries and accesses to your database, and the entities are deleted immediately.
Flushing outside the loop means one efficient transaction (a single access to your DB) performed by Doctrine, but the entities won't actually be deleted until flush is invoked; until then they are only marked as deleted.
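To illustrate that point, here is a small sketch (the id 42 is purely hypothetical):

$entity = $em->getRepository('someEntity')->find(42); // 42 is a made-up id
$em->remove($entity);   // the entity is now only scheduled for removal
// At this point no DELETE has been sent, so the row still exists in the database.
$em->flush();           // the DELETE statement is actually executed here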
Performance-wise, which one is better to use? (Doctrine behaves as a memory killer sometimes.)
Without a doubt, flushing outside the loop. It's best practice too. There might be cases in which you really need to flush every time you persist/remove/update an entity, but they are very few.
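If memory is a concern with very large collections, one common mitigation is to flush and clear in batches. A hedged sketch follows; the batch size of 50 is an arbitrary assumption:

$batchSize = 50; // arbitrary batch size
$processed = 0;
foreach ($someVar as $item) {
    $entity = $em->getRepository('someEntity')->find($item['value']);
    $em->remove($entity);

    if (++$processed % $batchSize === 0) {
        $em->flush();  // executes the queued DELETEs for this batch
        $em->clear();  // detaches all managed entities so memory can be freed
    }
}
$em->flush();  // flush whatever is left in the last partial batch
$em->clear();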
If both approaches are fine, can I use the same pattern for UPDATE queries too?
The same applies to update/persist. Avoid flushing inside a loop at all costs.
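As a rough illustration of the same pattern for updates/inserts ($rows, 'AppBundle:SomeEntity' and setName() are made-up names, not part of the question):

foreach ($rows as $row) {
    $entity = $em->getRepository('AppBundle:SomeEntity')->find($row['id']);
    $entity->setName($row['name']); // changes to managed entities are tracked automatically
    // for brand new entities you would call $em->persist($newEntity) here instead
}
$em->flush(); // a single transaction writes all queued UPDATEs/INSERTs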
To wrap up, have a look at the documentation. It is really well explained.
If any of the queries fails for some reason, how do I catch which one failed, and possibly the error given by Doctrine?
You can always wrap your flush() in a try/catch block and elegantly capture any exception thrown when a query fails. When using implicit transaction demarcation and an exception occurs during EntityManager#flush(), the transaction is automatically rolled back and the EntityManager closed.
More info about the topic here.
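For example, a minimal sketch reusing the variables from your second snippet, with one flush outside the loop wrapped in try/catch:

try {
    foreach ($request->request->get('items') as $item) {
        $entity = $em->getRepository('AppBundle:FacturaProductoSolicitud')->find($item['value']);
        $em->remove($entity);
    }
    $em->flush(); // one transaction; if anything fails an exception is thrown here
    $response['success'] = true;
} catch (\Exception $e) {
    // the whole transaction was rolled back and the EntityManager is now closed
    $response['success'] = false;
    $response['error'] = $e->getMessage();
}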
Update
In the code you present, if you flush outside the loop, all remove operations will belong to the same transaction. This means that if any of them fails, an exception is thrown and a rollback is issued (all the remove operations already performed would be rolled back and therefore not persisted to the DB).
For instance: imagine we have six items with ids 1, 2, 3, 4, 5, 6 and suppose that removing item 4 fails.
First option -> flush inside the loop: 1, 2, 3 are deleted; removing 4 fails, an exception is thrown, and execution ends.
Second option -> flush outside the loop: removing 4 fails, a rollback is issued, none of the items are deleted, and execution ends.
If the behaviour you want to achieve is the one shown in case 1, one option could be the one you're using. However, it's really expensive in terms of performance.
Nevertheless, there are better solutions: for instance, you could use a combination of the preRemove/postRemove events to store the ids (or whatever value you need) of the entities that were successfully removed during the flush (though not persisted because of the rollback). You could store them, for instance, in a static array that belongs to the class (or use a singleton, or whatever). Then, in the catch clause of the exception, you could iterate over that array and perform a delete operation on those items (with the flush outside the loop, of course).
You could then return that array to let the user know which entities were actually removed, along with success = false because there was a problem during the deletion process.
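As a rough sketch of that idea (the RemovalTracker class name, the getId() call and the listener registration are assumptions; postRemove and LifecycleEventArgs are part of Doctrine's event API):

use Doctrine\ORM\Event\LifecycleEventArgs;

class RemovalTracker
{
    /** @var array ids for which a DELETE was executed during flush() */
    public static $removedIds = [];

    public function postRemove(LifecycleEventArgs $args)
    {
        $entity = $args->getEntity();
        // getId() is assumed to exist on the removed entity
        self::$removedIds[] = $entity->getId();
    }
}

// Registration (e.g. tagging the service with doctrine.event_listener in Symfony)
// is omitted here. In the catch block of the failed flush you could then loop over
// RemovalTracker::$removedIds, remove those entities again, flush once outside the
// loop, and return that array together with success = false.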