How to Sync iPhone Core Data with web server, and

Posted 2019-01-01 07:23

Question:

I have been working on a method to sync Core Data stored in an iPhone application between multiple devices, such as an iPad or a Mac. There are not many (if any at all) sync frameworks for use with Core Data on iOS. However, I have been thinking about the following concept:

  1. A change is made to the local Core Data store, and the change is saved. (a) If the device is online, it tries to send the changeset to the server, including the device ID of the device which sent the changeset. (b) If the changeset does not reach the server, or if the device is not online, the app adds the changeset to a queue to send when it does come online.
  2. The server, sitting in the cloud, merges the specific changesets it receives with its master database.
  3. After a changeset (or a queue of changesets) is merged on the cloud server, the server pushes all of those changesets to the other devices registered with the server using some sort of polling system. (I thought of using Apple's push services, but according to the comments this is apparently not a workable system.)
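
A minimal sketch of step 1 above (send when online, otherwise queue and retry later). It is written in Python only to keep it short; on the device the same logic would live in the app itself. The `ChangeQueue` class and the `send` callback are hypothetical names, and the queue here is in memory, whereas a real app would presumably persist it so that pending changesets survive a restart.

```python
import json
from collections import deque
from typing import Callable, Deque, Dict


class ChangeQueue:
    """Holds changesets that have not been delivered yet (step 1b)."""

    def __init__(self, device_id: str, send: Callable[[str], bool]):
        # `send` is a hypothetical transport: it returns True only when the
        # server confirmed receipt of the serialized changeset.
        self.device_id = device_id
        self.send = send
        self.pending: Deque[Dict] = deque()

    def submit(self, change: Dict) -> None:
        """Queue a saved change, then try to deliver everything pending."""
        self.pending.append({"device_id": self.device_id, "change": change})
        self.flush()

    def flush(self) -> int:
        """Send pending changesets in order; stop at the first failure."""
        sent = 0
        while self.pending:
            payload = json.dumps(self.pending[0])
            if not self.send(payload):
                break  # offline or server error: keep it queued for later
            self.pending.popleft()
            sent += 1
        return sent
```

Calling `flush()` again when connectivity returns drains the queue in the original order, so the server sees changes in the order they were saved.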

Is there anything fancy that I need to be thinking about? I have looked at REST frameworks such as ObjectiveResource, Core Resource, and RestfulCoreData. Of course, these all work with Ruby on Rails, which I am not tied to, but it's a place to start. The main requirements I have for my solution are:

  1. Any changes should be sent in the background without pausing the main thread.
  2. It should use as little bandwidth as possible.

I have thought about a number of the challenges:

  1. Making sure that the object IDs for the different data stores on different devices are linked on the server. That is to say, I will have a table of object IDs and device IDs, which are tied via a reference to the object stored in the database. I will have a record (DatabaseId [unique to this table], ObjectId [unique to the item in the whole database], Datafield1, Datafield2), and the ObjectId field will reference another table, AllObjects: (ObjectId, DeviceId, DeviceObjectId). Then, when a device pushes up a changeset, it will pass along its device ID and the object ID from the Core Data object in the local data store. My cloud server will then look up that object ID and device ID in the AllObjects table and find the record to change in the initial table.
  2. All changes should be timestamped, so that they can be merged.
  3. The device will have to poll the server, without using up too much battery.
  4. The local devices will also need to update anything held in memory if/when changes are received from the server.
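
The ID mapping from challenge 1 can be sketched with SQLite. This is only an illustration under assumptions: the main table is called `Records` here (the question does not name it), the column types are guesses, and the foreign key points from `AllObjects` to `Records` because an object has one row per device in `AllObjects`.

```python
import sqlite3


def create_schema(conn):
    """Create the two tables described in challenge 1 (names partly assumed)."""
    conn.executescript("""
        CREATE TABLE Records (
            DatabaseId INTEGER PRIMARY KEY,      -- unique to this table
            ObjectId   INTEGER UNIQUE NOT NULL,  -- unique in the whole database
            Datafield1 TEXT,
            Datafield2 TEXT
        );
        CREATE TABLE AllObjects (
            ObjectId       INTEGER NOT NULL REFERENCES Records(ObjectId),
            DeviceId       TEXT NOT NULL,
            DeviceObjectId TEXT NOT NULL,        -- the device's local object ID
            PRIMARY KEY (DeviceId, DeviceObjectId)
        );
    """)


def find_record(conn, device_id, device_object_id):
    """Resolve a device-local object ID to the shared record, or None."""
    return conn.execute(
        """SELECT r.DatabaseId, r.ObjectId, r.Datafield1, r.Datafield2
           FROM AllObjects a JOIN Records r ON r.ObjectId = a.ObjectId
           WHERE a.DeviceId = ? AND a.DeviceObjectId = ?""",
        (device_id, device_object_id),
    ).fetchone()


def apply_change(conn, device_id, device_object_id, field1, field2):
    """Apply a pushed change; return False if the mapping is unknown."""
    row = find_record(conn, device_id, device_object_id)
    if row is None:
        return False
    conn.execute(
        "UPDATE Records SET Datafield1 = ?, Datafield2 = ? WHERE ObjectId = ?",
        (field1, field2, row[1]),
    )
    return True
```

A change pushed by one device is then visible through every other device's mapping, because all of them resolve to the same `ObjectId`.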

Is there anything else I am missing here? What kinds of frameworks should I look at to make this possible?

Answer 1:

I suggest carefully reading and implementing the sync strategy discussed by Dan Grover at the iPhone 2009 conference, available here as a PDF document.

This is a viable solution and is not that difficult to implement (Dan implemented it in several of his applications), and it overlaps with the solution described by Chris. For an in-depth, theoretical discussion of syncing, see the paper from Russ Cox (MIT) and William Josephson (Princeton):

File Synchronization with Vector Time Pairs

which applies equally well to Core Data with some obvious modifications. This provides an overall much more robust and reliable sync strategy, but requires more effort to implement correctly.

EDIT:

It seems that Grover's PDF file is no longer available (broken link, March 2015). UPDATE: the link is available through the Wayback Machine here

The Objective-C framework called ZSync and developed by Marcus Zarra has been deprecated, given that iCloud finally seems to support correct core data synchronization.



Answer 2:

I've done something similar to what you're trying to do. Let me tell you what I've learned and how I did it.

I assume you have a one-to-one relationship between your Core Data object and the model (or db schema) on the server. You simply want to keep the server contents in sync with the clients, but clients can also modify and add data. If I got that right, then keep reading.

I added four fields to assist with synchronization:

  1. sync_status - Add this field to your Core Data model only. It's used by the app to determine if you have a pending change on the item. I use the following codes: 0 means no changes, 1 means it's queued to be synchronized to the server, and 2 means it's a temporary object and can be purged.
  2. is_deleted - Add this to the server and Core Data model. A delete event shouldn't actually delete a row from the database or from your client model, because that leaves you with nothing to synchronize back. With this simple boolean flag, you can set is_deleted to 1, synchronize it, and everyone will be happy. You must also modify the code on the server and client to query only non-deleted items with "is_deleted=0".
  3. last_modified - Add this to the server and core data model. This field should automatically be updated with the current date and time by the server whenever anything changes on that record. It should never be modified by the client.
  4. guid - Add a globally unique id (see http://en.wikipedia.org/wiki/Globally_unique_identifier) field to the server and core data model. This field becomes the primary key and becomes important when creating new records on the client. Normally your primary key is an incrementing integer on the server, but we have to keep in mind that content could be created offline and synchronized later. The GUID allows us to create a key while being offline.

On the client, add code to set sync_status to 1 on your model object whenever something changes and needs to be synchronized to the server. New model objects must generate a GUID.
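
As an illustration only, the four fields and the "mark as queued on every change" rule could look like the sketch below. It is Python rather than a Core Data model, the `name` attribute is a made-up payload field, and starting new objects with sync_status 1 is my assumption (a new object has not been sent to the server yet).

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional

# sync_status codes as described in the list above
SYNCED, QUEUED, TEMPORARY = 0, 1, 2


@dataclass
class Record:
    name: str  # hypothetical payload attribute
    # New objects generate their own GUID so they can be created offline.
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))
    sync_status: int = QUEUED            # assumption: new objects are pending
    is_deleted: bool = False
    last_modified: Optional[int] = None  # set by the server, never by the client

    def update(self, name: str) -> None:
        """Change the payload and queue the record for the next sync."""
        self.name = name
        self.sync_status = QUEUED

    def soft_delete(self) -> None:
        """Flag the record as deleted instead of removing it."""
        self.is_deleted = True
        self.sync_status = QUEUED
```

Note that neither method touches `last_modified`; in this scheme only the server writes that field.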

Synchronization is a single request. The request contains:

  • The MAX last_modified time stamp of your model objects. This tells the server you only want changes after this time stamp.
  • A JSON array containing all items with sync_status=1.
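
A rough sketch of building that request, assuming records are plain dicts and that last_modified is a server-assigned integer (for example epoch seconds). The JSON keys `since` and `changes` are names I made up for the example. The client-only sync_status and the server-owned last_modified are left out of the payload.

```python
import json

CLIENT_ONLY_FIELDS = {"sync_status", "last_modified"}


def build_sync_request(records):
    """Return the JSON body for one sync request.

    `records` is a list of dicts with at least guid, sync_status and
    last_modified (an integer set by the server, or None if never synced).
    """
    stamps = [r["last_modified"] for r in records
              if r["last_modified"] is not None]
    pending = [
        {k: v for k, v in r.items() if k not in CLIENT_ONLY_FIELDS}
        for r in records
        if r["sync_status"] == 1
    ]
    return json.dumps({
        "since": max(stamps) if stamps else None,  # MAX last_modified
        "changes": pending,                        # items with sync_status=1
    })
```

An empty local store yields `since: null`, which a server could treat as "send everything".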

The server gets the request and does this:

  • It takes the contents from the JSON array and modifies or adds the records it contains. The last_modified field is automatically updated.
  • The server returns a JSON array containing all objects with a last_modified time stamp greater than the time stamp sent in the request. This will include the objects it just received, which serves as an acknowledgment that the record was successfully synchronized to the server.
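
The server side of the exchange might be sketched like this. It is an in-memory stand-in, not a real web service: `SyncServer` is a hypothetical name, and a simple counter replaces the server's date and time so that the example is deterministic.

```python
import itertools


class SyncServer:
    """In-memory stand-in for the server's table, keyed by guid."""

    def __init__(self):
        self.rows = {}
        # Assumption: a monotonically increasing counter plays the role of
        # the server clock that stamps last_modified.
        self._clock = itertools.count(1)

    def handle_sync(self, since, changes):
        """Apply the client's changes, then return everything newer than `since`."""
        for change in changes:
            row = dict(change)
            # The server owns last_modified; any client value is overwritten.
            row["last_modified"] = next(self._clock)
            self.rows[row["guid"]] = row  # modifies or adds the record
        newer = [dict(r) for r in self.rows.values()
                 if since is None or r["last_modified"] > since]
        return sorted(newer, key=lambda r: r["last_modified"])
```

Because the records just received get a fresh last_modified, they always appear in the response, which is what makes the response double as an acknowledgment.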

The app receives the response and does this:

  • It takes the contents from the JSON array and modifies or adds the records it contains. Each record gets a sync_status of 0.
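
A small sketch of that client step, assuming the local store is a dict keyed by guid (a stand-in for the Core Data store) and the response is a list of server rows:

```python
def apply_response(local, response):
    """Merge server rows into the local store and mark them as synced.

    `local` maps guid -> record dict; `response` is the decoded JSON array
    returned by the server.
    """
    for row in response:
        record = dict(row)
        record["sync_status"] = 0  # acknowledged by the server
        local[record["guid"]] = record  # modifies or adds the record
    return local
```

One limitation of this sketch: a record edited locally while the request was in flight would be overwritten by the server's copy, so a real client would need to detect that case.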

I hope that helps. I used the word record and model interchangeably, but I think you get the idea. Good luck.



Answer 3:

If you are still looking for a way to go, look into Couchbase Mobile. It basically does everything you want. (http://www.couchbase.com/nosql-databases/couchbase-mobile)



Answer 4:

Similar to @Cris, I've implemented a class for synchronization between client and server and have solved all known problems so far (sending/receiving data to/from the server, merging conflicts based on timestamps, removing duplicate entries under unreliable network conditions, synchronizing nested data and files, etc.).

You just tell the class which entity and which columns it should sync and where your server is.

M3Synchronization * syncEntity = [[M3Synchronization alloc] initForClass: @"Car"
                                                              andContext: context
                                                            andServerUrl: kWebsiteUrl
                                             andServerReceiverScriptName: kServerReceiverScript
                                              andServerFetcherScriptName: kServerFetcherScript
                                                    ansSyncedTableFields:@[@"licenceNumber", @"manufacturer", @"model"]
                                                    andUniqueTableFields:@[@"licenceNumber"]];


syncEntity.delegate = self; // delegate should implement onComplete and onError methods
syncEntity.additionalPostParamsDictionary = ... // add some POST params to authenticate current user

[syncEntity sync];

You can find the source, a working example, and more instructions here: github.com/knagode/M3Synchronization.



Answer 5:

Notify the user to update data via push notification. Use a background thread in the app to compare the local data with the data on the cloud server: when a change happens on the server, update the local data, and vice versa.

So I think the most difficult part is determining which side's data is out of date.

Hope this helps.



Answer 6:

I have just posted the first version of my new Core Data cloud syncing API, known as SynCloud. SynCloud differs from iCloud in many ways because it allows for a multi-user sync interface. It also differs from other syncing APIs because it allows for multi-table, relational data.

Please find out more at http://www.syncloudapi.com

Built with the iOS 6 SDK, it is very up to date as of 9/27/2012.



Answer 7:

I think a good solution to the GUID issue is a "distributed ID system". I'm not sure what the correct term is, but I think that's what the MS SQL Server docs used to call it (SQL Server uses/used this method for distributed/synced databases). It's pretty simple:

The server assigns all IDs. Each time a sync is done, the first thing that is checked is "How many IDs do I have left on this client?" If the client is running low, it asks the server for a new block of IDs. The client then uses IDs in that range for new records. This works great for most needs, provided you can assign a block large enough that it should "never" run out before the next sync, but not so large that the server runs out over time. If the client ever does run out, the handling can be pretty simple: just tell the user "sorry, you cannot add more items until you sync". If they are adding that many items, shouldn't they sync to avoid stale data issues anyway?
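
Roughly, the block scheme could be sketched as follows. The class names, the block size, and the low-water mark are all illustrative choices of mine and are not taken from SQL Server or any other product.

```python
from collections import deque


class IdServer:
    """Hands out non-overlapping blocks of integer IDs."""

    def __init__(self, block_size=1000):
        self.block_size = block_size
        self.next_free = 1

    def allocate_block(self):
        start = self.next_free
        self.next_free += self.block_size
        return range(start, self.next_free)


class IdClient:
    """Uses IDs from its blocks and asks for more when running low."""

    def __init__(self, low_water_mark=100):
        self.low_water_mark = low_water_mark
        self.blocks = deque()  # each entry is [next_id, end_exclusive]

    def remaining(self):
        return sum(end - nxt for nxt, end in self.blocks)

    def sync(self, server):
        """First step of every sync: top up if few IDs are left."""
        if self.remaining() < self.low_water_mark:
            block = server.allocate_block()
            self.blocks.append([block.start, block.stop])

    def new_id(self):
        """Return the next unused ID for a new local record."""
        if not self.blocks:
            raise RuntimeError("Out of IDs: sync before adding more items")
        block = self.blocks[0]
        value = block[0]
        block[0] += 1
        if block[0] == block[1]:
            self.blocks.popleft()  # block exhausted
        return value
```

Two clients that sync against the same server never receive overlapping ranges, so IDs created offline cannot collide.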

I think this is superior to using random GUIDs because random GUIDs are not 100% safe, and usually need to be much longer than a standard ID (128 bits vs. 32 bits). You usually have indexes by ID and often keep ID numbers in memory, so it is important to keep them small.

I didn't really want to post this as an answer, but I don't know that anyone would see it as a comment, and I think it's important to this topic and not included in the other answers.



Answer 8:

First, consider how much data and how many tables and relations you will have. In my solution I've implemented syncing through Dropbox files. I observe changes in the main MOC and save the changed data to files (each row is saved as gzipped JSON). If there is a working internet connection, I check whether there are any changes on Dropbox (Dropbox gives me delta changes), download and merge them (latest wins), and finally upload the changed files. Before syncing I put a lock file on Dropbox to prevent other clients from syncing incomplete data.

It is safe for a download to be only partial (e.g. after a lost internet connection). When downloading finishes (fully or partially), the app starts loading the files into Core Data. When there are unresolved relations (not all files have been downloaded), it stops loading files and tries to finish downloading later. Relations are stored only as GUIDs, so I can easily check which files to load to maintain full data integrity.

Syncing starts after changes are made to Core Data. If there are no changes, the app checks for changes on Dropbox every few minutes and on app startup. Additionally, when changes are sent to the server I send a broadcast to the other devices to inform them about the changes, so they can sync sooner.

Each synced entity has a GUID property (the GUID is also used as the filename for the exchange files). I also have a sync database where I store the Dropbox revision of each file (I can compare it when the Dropbox delta resets its state). The files also contain the entity name, state (deleted/not deleted), GUID (same as the filename), database revision (to detect data migrations or to avoid syncing with newer app versions) and of course the data (if the row is not deleted).
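
To make the file format concrete, here is a rough sketch of one row per gzipped JSON file plus a "latest wins" merge. The JSON key names are illustrative, and the `modified_at` field is my assumption: the description above does not say how "latest" is decided, so the real implementation may rely on something else, such as Dropbox revisions.

```python
import gzip
import json


def encode_row(entity, guid, data, modified_at, deleted=False, db_revision=1):
    """Serialize one row as gzipped JSON; the file would be named after guid."""
    doc = {
        "entity": entity,
        "guid": guid,
        "deleted": deleted,
        "db_revision": db_revision,
        "modified_at": modified_at,  # assumed field used for "latest wins"
    }
    if not deleted:
        doc["data"] = data  # deleted rows carry no data
    return gzip.compress(json.dumps(doc).encode("utf-8"))


def decode_row(blob):
    """Inverse of encode_row."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


def merge_latest_wins(local, remote):
    """Merge two {guid: blob} maps, keeping the newer copy of each row."""
    merged = dict(local)
    for guid, blob in remote.items():
        if guid not in merged:
            merged[guid] = blob
        elif decode_row(blob)["modified_at"] > decode_row(merged[guid])["modified_at"]:
            merged[guid] = blob
    return merged
```

Deletions travel as ordinary files with `deleted` set, so they take part in the same merge as updates.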

This solution works for thousands of files and about 30 entities. Instead of Dropbox I could use a key/value store behind a REST web service, which I want to do later but have no time for :) For now, in my opinion, my solution is more reliable than iCloud and, which is very important, I have full control over how it works (mainly because it's my own code).

Another solution is to save MOC changes as transactions: far fewer files are exchanged with the server, but it's harder to do the initial load in the proper order into an empty Core Data store. iCloud works this way, and other syncing solutions take a similar approach, e.g. TICoreDataSync.

-- UPDATE

After a while, I migrated to Ensembles - I recommend this solution over reinventing the wheel.



Answer 9:

2017

Regarding this incredibly old question.

It would be very much like asking

"I want to buy a device which is a phone that I can carry with me - but also use for many computing tasks, even browsing the WWW!"

Obviously, the answer to that one is: if you've been on Mars, one of the main technologies realized on this planet recently was "smart phones"; buy one.

These days, creating an OCC system from scratch would be as insane as creating an SQL database from scratch.

Obviously, for OCC, which is the base paradigm of all non-trivial apps now, you use

  • Firebase
  • PubNub
  • Couchbase

and so on, which are, quite simply, the major advance in human technology of the last few years.

Today, you would no more create OCC from scratch than you would

  • write your own operating system from scratch

  • write your own SQL database from scratch

  • write your own font-rendering from scratch

Note that indeed, in a professional sense you can't be "an iOS programmer" or "an Android programmer" any more.

Who cares about knowing how to lay out tables and buttons?

You're a Firebase/whatever expert, and as an incidental side issue you know how to lay out buttons, etc. on iOS or Android.

The only issue is which BAAS to use - for example, maybe PlayFab if it is game oriented, maybe PubNub if it is really message driven, maybe ably.io, maybe kinvey if you're corporate - whatever.