Short incremental unique id for neo4j

I use Django with Neo4j as the database, and I need short URLs based on node ids in my REST API. Neo4j has an internal id, but using it in application code is not recommended, and the usual alternative, a UUID, is too long for my short URLs. So I wrote my own uid generator:

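import time
import uuid
from hashlib import sha256

from neomodel import db  # assumption: db.cypher_query matches neomodel's API


def uid_generator():
    # Use the total node count only to decide how long the id should be.
    last_id = db.cypher_query("MATCH (n) RETURN count(*) AS lastId")[0][0][0]
    if last_id is None:
        last_id = 0
    width = max(2, len(str(last_id)))
    digest = sha256(str(time.time()).encode()).hexdigest()
    # A time-hash prefix plus a truncated uuid4, each `width` hex characters long.
    return digest[:width] + uuid.uuid4().hex[:width]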

I have two questions. First, I read an answer on Stack Overflow claiming that MATCH (n) RETURN count(*) AS lastId is O(1), but it gave no reference, so I am still not sure. Is there a reference for that claim? Second, is there a better approach in terms of both id uniqueness and speed?

Tags: django python-3.x neo4j cypher

3 Answers

Luminary・发光体

#2 · 2019-08-07 18:25

First, put a unique constraint on the id property so that parallel CREATE statements cannot produce collisions. This requires a label, but you NEED this fail-safe if you plan to do anything serious with this data. It also means you can keep separate rolling ids for different labels. (All indexed labels have a count table, and a UNIQUE CONSTRAINT also creates an index.)

Second, do the generation and the creation in the same Cypher statement, like this:

MATCH (n:Node) WITH count(*) AS lastId
CREATE (:Node{id:lastId})

This minimizes the time between generation and commit, reducing the chance of collision. (Remember to retry when an attempt fails with a unique-constraint violation.)
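A minimal Python sketch of the retry advice, assuming a unique constraint on :Node(id) already exists. The run_query callable and the is_unique_violation predicate are hypothetical stand-ins for whatever driver call and exception check the application uses; neither is a real driver API. The RETURN clause is added to the statement above so the caller learns which id was assigned.

```python
# The Cypher from the answer, extended with RETURN so the caller gets the id.
CREATE_WITH_COUNT_ID = (
    "MATCH (n:Node) WITH count(*) AS lastId "
    "CREATE (m:Node {id: lastId}) RETURN m.id"
)


def create_node_with_retry(run_query, is_unique_violation, max_attempts=5):
    """Run the create statement, retrying when the unique constraint rejects it.

    run_query: hypothetical callable that executes a Cypher string and
        returns the newly assigned id.
    is_unique_violation: hypothetical predicate that recognises the driver's
        constraint-violation exception.
    """
    for _ in range(max_attempts):
        try:
            return run_query(CREATE_WITH_COUNT_ID)
        except Exception as exc:
            if not is_unique_violation(exc):
                raise  # unrelated failure: do not hide it
            # Another writer took this id first; the next attempt recounts.
    raise RuntimeError(f"no free id after {max_attempts} attempts")
```

Passing the executor in as a parameter keeps the retry logic independent of the driver, so it can be exercised with a fake before it touches a database.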

I'm not sure what you are doing with the hash, only that you are doing it wrong. Either generate a new time-based UUID (it requires no parameters) and use it as is, or use the incrementing id. (By altering a UUID you invalidate the logic that guaranteed its uniqueness, which significantly increases the chance of collision.)
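To put a rough number on the truncation point, here is a small sketch using the standard birthday approximation. It assumes the ids are uniformly random, which is only an approximation for the time-hash half of the generator in the question, so the results are estimates rather than exact figures.

```python
import math


def collision_probability(n_ids, hex_chars):
    """Approximate chance that at least two of n_ids random ids collide.

    Birthday approximation: 1 - exp(-n(n-1) / (2 * N)), where
    N = 16 ** hex_chars is the number of possible ids.
    """
    space = 16 ** hex_chars
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * space))


# While the node count is below 100, the generator in the question keeps
# 2 + 2 hex characters, so only 16 ** 4 = 65536 ids are possible.
print(collision_probability(100, 4))

# For comparison: one million ids that each keep 30 random hex characters.
print(collision_probability(10**6, 30))
```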

You can also store the current index count in a node, as explained here. That is not guaranteed to be thread safe, but it shouldn't be a problem as long as you have unique constraints in place and retry on constraint violations. This approach is also more tolerant of deleted nodes.
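A sketch of how a counter node could feed short URLs. The Counter label and its name and value properties are hypothetical, and the base-62 encoding is an assumption added here to address the short-URL goal from the question; it is not part of this answer. The Cypher string is only defined, not executed, so it would still need the unique constraint and retry described above.

```python
import string

# 62 URL-safe characters: 0-9, a-z, A-Z.
ALPHABET = string.digits + string.ascii_letters

# Assumed Cypher for a single counter node (label and properties are
# hypothetical). It creates the counter on first use, then increments it.
NEXT_COUNTER_VALUE = (
    "MERGE (c:Counter {name: $name}) "
    "ON CREATE SET c.value = 0 "
    "SET c.value = c.value + 1 "
    "RETURN c.value"
)


def to_base62(number):
    """Encode a non-negative integer as a short base-62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


# Even a counter in the millions stays short once encoded.
print(to_base62(5_000_000))
```

Because the counter only ever increases, deleting nodes does not cause an id to be handed out twice, which is the tolerance mentioned above.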


劫难

#3 · 2019-08-07 18:30

Why not create your own identifier? You can get the maximum of your last identifier (let's call it RN for record number).

match (n) return max(n.RN) as lastID

max is one of Cypher's aggregating functions.


啃猪蹄的小仙女

#4 · 2019-08-07 18:33

Your approach is not good because it is based on the number of nodes in the database.

What happens if you create a node (call it A), then delete a random node, and then create a new node (call it B)?

A and B will have the same ID, and I think that's why you added a time-based hash in your code (but I barely understand that line :)).
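The scenario can be reproduced without a database. This sketch uses a plain dict as a stand-in for the graph, with the node count as the next id, which is an assumption that mirrors what MATCH (n) RETURN count(*) would give.

```python
def count_based_id(nodes):
    """Mimic MATCH (n) RETURN count(*): the next id is the current node count."""
    return len(nodes)


nodes = {}  # id -> name; an in-memory stand-in for the database

# Three existing nodes get ids 0, 1 and 2.
for name in ("first", "second", "third"):
    nodes[count_based_id(nodes)] = name

# Create node A.
id_a = count_based_id(nodes)
nodes[id_a] = "A"

# Delete some other node, which lowers the count by one.
del nodes[0]

# Create node B: the count is back to what it was when A was created.
id_b = count_based_id(nodes)

print(id_a, id_b, id_a == id_b)
```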

On the other hand, Neo4j's internal ID is guaranteed to be unique across the database at any given moment, but not over time. By default, Neo4j recycles unused IDs (an ID is released when its node is deleted).

You can change this behaviour through the configuration (see the doc HERE): dbms.ids.reuse.types.override=RELATIONSHIP

Be careful with this configuration: the size of your database on disk can then only increase, even if you delete nodes.
