Memory leak with large Core Data batch insert in S

2019-01-14 13:38发布

问题:

I am inserting tens of thousands of objects into my Core Data entity. I have a single NSManagedObjectContext and I am calling save() on the managed object context every time I add an object. It works but while it is running, the memory keeps increasing from about 27M to 400M. And it stays at 400M even after the import is finished.

There are a number of SO questions about batch insert and everyone says to read Efficiently Importing Data, but it's in Objective-C and I am having trouble finding real examples in Swift that solve this problem.

回答1:

There are a few things you should change:

  • Create a separate NSPrivateQueueConcurrencyType managed object context and do your inserts asynchronously in it.
  • Don't save after inserting every single entity object. Insert your objects in batches and then save each batch. A batch size might be something like 1000 objects.
  • Use autoreleasepool and reset to empty the objects in memory after each batch insert and save.

Here is how this might work:

let managedObjectContext = NSManagedObjectContext(concurrencyType: NSManagedObjectContextConcurrencyType.PrivateQueueConcurrencyType)
managedObjectContext.persistentStoreCoordinator = (UIApplication.sharedApplication().delegate as! AppDelegate).persistentStoreCoordinator // or wherever your coordinator is

managedObjectContext.performBlock { // runs asynchronously

    while(true) { // loop through each batch of inserts

        autoreleasepool {
            let array: Array<MyManagedObject>? = getNextBatchOfObjects()
            if array == nil { break }
            for item in array! {
                let newObject = NSEntityDescription.insertNewObjectForEntityForName("MyEntity", inManagedObjectContext: managedObjectContext) as! MyManagedObject
                newObject.attribute1 = item.whatever
                newObject.attribute2 = item.whoever
                newObject.attribute3 = item.whenever
            }
        }

        // only save once per batch insert
        do {
            try managedObjectContext.save()
        } catch {
            print(error)
        }

        managedObjectContext.reset()
    }
}

Applying these principles kept my memory usage low and also made the mass insert faster.

Further reading

  • Efficiently Importing Data (old Apple docs link is broken. If you can find it, please help me add it.)
  • Core Data Performance
  • Core Data (General Assembly post)

Update

The above answer is completely rewritten. Thanks to @Mundi and @MartinR in the comments for pointing out a mistake in my original answer. And thanks to @JodyHagins in this answer for helping me understand and solve the problem.