I have activities that are stored in a graph database. In some circumstances, multiple activities are grouped and aggregated into a single activity.
A processed activity feed could look like this:
Activity 1
Activity 2
Grouped Activity
Activity 3
Activity 4
Activity 5
Activities have an updated timestamp and a unique id.
Activities are ordered by their updated time; for a grouped activity, the most recent updated time among its child activities is used.
Activities can be inserted anywhere in the list (for example, if we start following someone, their past activities would be inserted into the list).
Activities can be removed from anywhere in the list.
Because of the volume of data, even microsecond-resolution timestamps can collide (two items can have the same timestamp).
Cursor identifiers should be unique and stable. Adding and removing feed items should not change the identifier.
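To make the model above concrete, here is a minimal Python sketch of the feed ordering. The class and function names (Activity, GroupedActivity, ordered_feed) are hypothetical, and newest-first ordering is an assumption; the real data lives in a graph database, not in memory.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Activity:
    id: str
    updated: int  # microseconds since the epoch

    def sort_time(self) -> int:
        return self.updated


@dataclass
class GroupedActivity:
    id: str
    children: List[Activity] = field(default_factory=list)

    def sort_time(self) -> int:
        # A group sorts by the most recent update among its children.
        return max(child.updated for child in self.children)


def ordered_feed(items):
    # Newest first (an assumption). Items with equal timestamps keep their
    # input order, which is the ambiguity a cursor has to deal with.
    return sorted(items, key=lambda item: item.sort_time(), reverse=True)
```

Note that nothing here breaks ties between equal timestamps, which is where the cursor problem comes from.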
I would like to introduce cursor-based paging so that clients can paginate through the feed, similar to Twitter's. There doesn't seem to be much information on how such cursors are built; I have only found this blog post about implementing them. However, it seems to have a problem if the cursor's identifier happens to point to an item that was removed.
Given the above, how can I produce an identifier that can be used as a cursor? Initially, I considered combining the timestamp with the unique id: 1371813798111111.myuniqueid. However, if the item at 1371813798111111.myuniqueid is deleted, I can get the items with the 1371813798111111 timestamp, but would not be able to determine which item with that timestamp I should start with.
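For illustration, here is a rough sketch of the timestamp-plus-id cursor. It relies on an assumption not stated above: that items with equal timestamps are additionally ordered by comparing their ids, so that (timestamp, id) forms a total order. Under that assumption the cursor can be treated as a boundary value instead of a pointer to an existing item. The function names are hypothetical, and the sketch does not address grouped activities whose timestamp changes.

```python
from typing import List, Optional, Tuple

Key = Tuple[int, str]  # (updated timestamp in microseconds, unique id)


def encode_cursor(key: Key) -> str:
    return f"{key[0]}.{key[1]}"


def decode_cursor(cursor: str) -> Key:
    # Split on the first "." only, so the id itself may contain dots.
    timestamp, _, item_id = cursor.partition(".")
    return int(timestamp), item_id


def page_after(keys: List[Key], cursor: Optional[str], limit: int) -> List[Key]:
    # Newest first, with the id as a tie-breaker for equal timestamps.
    ordered = sorted(keys, reverse=True)
    if cursor is None:
        return ordered[:limit]
    boundary = decode_cursor(cursor)
    # Strict tuple comparison: the cursor's own item need not still exist.
    return [key for key in ordered if key < boundary][:limit]
```

In this sketch, a cursor for a deleted item still selects everything that sorts strictly after the position it used to occupy.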
Another approach I considered was to assign an incrementing number to each feed result. Since the numbers are incrementing and in order, if a number/id is missing, I can just choose the next one. However, the problem with this is that the cursor ids will change if I start removing and adding feed items in the middle of the feed. One fix I considered is to leave a large gap between consecutive numbers, but it is hard to decide, in a deterministic way, where new items should go within each gap. In addition, as new items are added and the gaps fill up, we would end up with the same problem.
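The gap problem can be shown with a small sketch. It assumes integer positions, a hypothetical spacing GAP of 1024, and midpoint insertion as the deterministic placement rule; none of these are fixed by the approach itself.

```python
from typing import Optional

GAP = 1024  # hypothetical spacing between neighbouring positions


def position_between(lower: int, upper: int) -> Optional[int]:
    # Deterministic choice: the integer midpoint, or None if no integer is free.
    if upper - lower < 2:
        return None
    return (lower + upper) // 2


def inserts_until_full(lower: int, upper: int) -> int:
    # Keep inserting directly after `lower` and count how many inserts fit.
    count = 0
    while True:
        middle = position_between(lower, upper)
        if middle is None:
            return count
        upper = middle
        count += 1
```

With a gap of 1024, ten worst-case inserts at the same spot use up the space; after that, positions would have to be renumbered and the cursor ids would change.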
Simply put, if I have a list of items where items can be added and removed from anywhere in the list, what is the best way to generate an id for each list item such that if the item for the id is deleted, I can still determine its position in the list?