What is an efficient way to Merge two iOS Core Dat

Posted 2019-01-22 23:02

Question:

In our app under development we are using Core Data with a sqlite backing store to store our data. The object model for our app is complex. Also, the total amount of data served by our app is too large to fit into an iOS (iPhone/iPad/iPod Touch) app bundle. Because our users are typically interested in only a subset of the data, we've partitioned it so that the app ships with a subset (albeit ~100 MB) of the data objects in the app bundle. Our users have the option of downloading additional data objects (~5 MB to 100 MB in size) from our server after they pay for the additional content through iTunes in-app purchases.

The incremental data files (existing in sqlite backing stores) use the same xcdatamodel version as the data that ships with the bundle; there are no changes to the object model. The incremental data files are downloaded from our server as gzipped sqlite files. We don't want to bloat our app bundle by shipping the incremental content with the app. Also, we don't want to rely on queries over a web service (because of the complex data model).

We've tested the download of the incremental sqlite data from our server. We have been able to add the downloaded data store to the app's shared persistentStoreCoordinator.

{
       NSError *error = nil;
       NSDictionary *options = [NSDictionary dictionaryWithObjectsAndKeys:
                                [NSNumber numberWithBool:YES], NSMigratePersistentStoresAutomaticallyOption, 
                                [NSNumber numberWithBool:YES], NSInferMappingModelAutomaticallyOption, nil];

       if (![__persistentStoreCoordinator addPersistentStoreWithType:NSSQLiteStoreType configuration:nil URL:defaultStoreURL options:options error:&error])
       {            
           NSLog(@"Failed with error:  %@", [error localizedDescription]);
           abort();
       }    

       // Check for the existence of incrementalStore
       // Add incrementalStore
       if (incrementalStoreExists) {
           if (![__persistentStoreCoordinator addPersistentStoreWithType:NSSQLiteStoreType configuration:nil URL:incrementalStoreURL options:options error:&error])
           {            
               NSLog(@"Add of incrementalStore failed with error:  %@", [error localizedDescription]);
               abort();
           }    
       }
 }

  However, there are two problems with doing it this way.

  1. Data fetch results (e.g., with NSFetchedResultsController) show up with the data from the incrementalStoreURL appended to the end of the data from the defaultStoreURL.
  2. Some of the objects are duplicated. There are many entities with read-only data in our data model; these get duplicated when we add the second persistentStore to the persistentStoreCoordinator.

Ideally, we would like Core Data to merge the object graphs from the two persistent stores into one (there are no shared relationships between data from the two stores at the time of the data download). Also, we would like to remove the duplicate objects. Searching the web, we saw a couple of questions by people attempting to do the same thing we are doing--such as this answer and this answer. We've read Marcus Zarra's blog on importing large data sets in Core Data. However, none of the solutions we've seen worked for us. We don't want to manually read and save the data from the incremental store to the default store as we think this will be very slow and error prone on the phone. Is there a more efficient way of doing the merge?

We've attempted to solve the problem by implementing a manual migration as follows. However, we haven't been able to successfully get the merge to happen. We are not really clear on the solution suggested by answers 1 and 2 referenced above. Marcus Zarra's blog addressed some of the issues we had at the outset of our project importing our large dataset into iOS.

{
       NSError *error = nil;
       NSDictionary *options = [NSDictionary dictionaryWithObjectsAndKeys:
                                [NSNumber numberWithBool:YES], NSMigratePersistentStoresAutomaticallyOption, 
                                [NSNumber numberWithBool:YES], NSInferMappingModelAutomaticallyOption, nil];        

       NSMigrationManager *migrator = [[NSMigrationManager alloc] initWithSourceModel:__managedObjectModel destinationModel:__managedObjectModel];
       if (![migrator migrateStoreFromURL:stateStoreURL
                                type:NSSQLiteStoreType 
                             options:options 
                    withMappingModel:nil
                    toDestinationURL:destinationStoreURL 
                     destinationType:NSSQLiteStoreType 
                  destinationOptions:nil 
                               error:&error])
       {
           NSLog(@"%@", [error userInfo]);
           abort();
       }
}

  It seems that the author of answer 1 ended up reading his data from the incremental store and saving to the default store. Perhaps, we've misunderstood the solution suggested by both articles 1 & 2. The size of our data may preclude us from manually reading and re-inserting our incremental data into the default store. My question is: what is the most efficient way to get the object graphs from two persistentStores (that have the same objectModel) to merge into one persistentStore?

Automatic migration works pretty well when we add new entity attributes to object graphs or modify relationships. Is there a simple solution for merging similar data into the same persistent store that is resilient enough to stop and resume, the way automatic migration is?

Answer 1:

After several attempts, I've figured out how to make this work. The secret is to create the incremental stores without any data for the read-only entities. If the read-only data is left in the incremental stores, the instances of those entities get duplicated after the data migration and merge. The default store should be the only store that has them.

For example, I had entities "Country" and "State" in my data model. I needed to have only one instance of each Country and State in my object graph. I kept these entities out of the incremental stores and created them only in the default store. I used Fetched Properties to loosely link my main object graph to these entities. I created the default store with all the entity instances in my model. The incremental stores either didn't have the read-only entities (i.e., Country and State in my case) to start with or had them deleted after data creation was completed.
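Stripping the read-only entities out of an incremental store after it has been generated could look roughly like the sketch below. It is only an illustration, not the code I actually used: the helper name DeleteAllInstances is made up, and it assumes you already have a managed object context attached to the incremental store and that the read-only entity names (e.g., "Country" and "State") are passed in by the caller.

```objectivec
#import <CoreData/CoreData.h>

// Hypothetical helper: deletes every instance of the named entities from the
// store(s) behind `context`, then saves. Returns NO and fills `error` on failure.
static BOOL DeleteAllInstances(NSManagedObjectContext *context,
                               NSArray *entityNames,
                               NSError **error)
{
    for (NSString *entityName in entityNames) {
        NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:entityName];
        // Attribute values are not needed just to delete the objects.
        request.includesPropertyValues = NO;

        NSArray *objects = [context executeFetchRequest:request error:error];
        if (!objects) {
            return NO; // the fetch failed
        }
        for (NSManagedObject *object in objects) {
            [context deleteObject:object];
        }
    }
    // Persist the deletions to the incremental store.
    return [context save:error];
}
```

This would be run once, when the incremental store is built (for example, DeleteAllInstances(context, @[@"Country", @"State"], &error)), not on the device.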

The next step is to add the incremental store to its own persistentStoreCoordinator (not the coordinator for the default store that we want to migrate all contents to) during application startup.

The final step is to call the migratePersistentStore:toURL:options:withType:error: method on the incremental store's coordinator to merge the incremental data into the main (i.e., default) store. Presto!

The following code fragment illustrates the last two steps mentioned above. These are the steps that got my setup merging incremental data into the main data store.

{
    NSError *error = nil;
    NSDictionary *options = [NSDictionary dictionaryWithObjectsAndKeys:
    [NSNumber numberWithBool:YES], NSMigratePersistentStoresAutomaticallyOption, 
    [NSNumber numberWithBool:YES], NSInferMappingModelAutomaticallyOption, nil];

    if (![__persistentStoreCoordinator addPersistentStoreWithType:NSSQLiteStoreType configuration:nil URL:defaultStoreURL options:options error:&error])
    {            
        NSLog(@"Failed with error:  %@", [error localizedDescription]);
        abort();
    }    

    // Check for the existence of incrementalStore
    // Add incrementalStore
    if (incrementalStoreExists) {

        NSPersistentStore *incrementalStore = [_incrementalPersistentStoreCoordinator addPersistentStoreWithType:NSSQLiteStoreType configuration:nil URL:incrementalStoreURL options:options error:&error];
        if (!incrementalStore)
        {
            NSLog(@"Unresolved error %@, %@", error, [error userInfo]);
            abort();
        }    

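        // migratePersistentStore: removes incrementalStore from the coordinator and
        // returns a new store at the destination URL, so keep the returned reference.
        NSPersistentStore *migratedStore = [_incrementalPersistentStoreCoordinator migratePersistentStore:incrementalStore
            toURL:_defaultStoreURL
            options:options
            withType:NSSQLiteStoreType
            error:&error];
        if (!migratedStore)
        {
            NSLog(@"%@", [error userInfo]);
            abort();
        }

        // Detach the migrated store and release the coordinator for the incremental store
        [_incrementalPersistentStoreCoordinator removePersistentStore:migratedStore error:&error];
        _incrementalPersistentStoreCoordinator = nil;
        // Should probably delete the incremental store file from the file system as well
        //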
    }
}
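The comment near the end of the fragment notes that the incremental store should probably be deleted from the file system once it has been merged. A minimal sketch of that cleanup follows. The helper name RemoveStoreFiles is made up for illustration, and the "-wal" and "-shm" suffixes are included on the assumption that the SQLite store may be running in write-ahead logging mode, in which case those sidecar files sit next to the main file.

```objectivec
#import <Foundation/Foundation.h>

// Hypothetical helper: removes a SQLite store file and, if present, its
// "-wal" and "-shm" sidecar files. Only call this after the store has been
// removed from every persistent store coordinator.
static void RemoveStoreFiles(NSURL *storeURL)
{
    NSFileManager *fileManager = [NSFileManager defaultManager];
    for (NSString *suffix in @[@"", @"-wal", @"-shm"]) {
        NSString *path = [storeURL.path stringByAppendingString:suffix];
        if (![fileManager fileExistsAtPath:path]) {
            continue; // sidecar files only exist in some journal modes
        }
        NSError *error = nil;
        if (![fileManager removeItemAtPath:path error:&error]) {
            NSLog(@"Could not remove %@: %@", path, [error localizedDescription]);
        }
    }
}
```

In the fragment above it would be called as RemoveStoreFiles(incrementalStoreURL) right after the incremental coordinator is released.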


Answer 2:

Your migration isn't working because the managed object model is identical.

Technically, you're talking about "data migration", not "schema migration". Core Data's migration API is designed for schema migration, that is, handling changes to the managed object model.

As far as transferring data from one store to another goes, you're kind of on your own. Core Data can help you be efficient through batching and fetch limits on your fetch requests, but you need to implement the logic yourself.

It sounds like you have two persistent stores, a big one and a small one. It would be most efficient to load the small one and analyze it, discovering the set of primary keys or unique identifiers you need to query for in the larger store.

You could then de-dupe easily by simply querying the larger store for those identifiers.

The documentation for NSFetchRequest has the API for scoping your queries:

https://developer.apple.com/library/mac/#documentation/Cocoa/Reference/CoreDataFramework/Classes/NSFetchRequest_Class/NSFetchRequest.html
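As a rough sketch of the lookup described above: collect the unique identifiers from the small store, then ask the large store which of them it already contains. Everything named here is an assumption for illustration: the helper ExistingIdentifiers, the idea that each entity has a single unique-identifier attribute whose name is passed as keyName, and the chunk size of 500, which is arbitrary and only there to keep each IN predicate reasonably small.

```objectivec
#import <CoreData/CoreData.h>

// Hypothetical helper: returns the subset of `identifiers` that already exist
// for `entityName` in the store(s) behind `context`, or nil if a fetch fails.
static NSSet *ExistingIdentifiers(NSManagedObjectContext *context,
                                  NSString *entityName,
                                  NSString *keyName,
                                  NSArray *identifiers,
                                  NSError **error)
{
    NSMutableSet *found = [NSMutableSet set];
    const NSUInteger chunkSize = 500; // arbitrary; tune for your data

    for (NSUInteger start = 0; start < identifiers.count; start += chunkSize) {
        NSUInteger length = MIN(chunkSize, identifiers.count - start);
        NSArray *chunk = [identifiers subarrayWithRange:NSMakeRange(start, length)];

        // Scope the query to objects whose identifier is in this chunk.
        NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:entityName];
        request.predicate = [NSPredicate predicateWithFormat:@"%K IN %@", keyName, chunk];

        NSArray *matches = [context executeFetchRequest:request error:error];
        if (!matches) {
            return nil; // the fetch failed
        }
        [found addObjectsFromArray:[matches valueForKey:keyName]];
    }
    return found;
}
```

Any object from the small store whose identifier is in the returned set is a duplicate and can be skipped when you copy the rest into the large store.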



Answer 3:

You don't need any migration. Migration is designed to handle changes in the NSManagedObjectModel, not in the data itself.

What you really need is a Persistent Store Coordinator managing two Persistent Stores. That's kind of tricky, but not too difficult, really.

There's a similar question that explains what you really need to do: Can multiple (two) persistent stores be used with one object model, while maintaining relations from one to the other?

Here's a good article by Marcus Zarra:

http://www.cimgf.com/2009/05/03/core-data-and-plug-ins/