gremlin clone a node and its edges

Published 2020-02-11 12:24

Question:

Does Gremlin provide a way to clone a vertex? For instance, given v1->v2, v1->v3, v1->v4, how can I simply and efficiently create a new vertex v5 whose edges also point to v2, v3, v4 (the same places that v1's edges point to), without having to set them explicitly, by saying something like g.createV(v1).clone(v2) instead?

Note that I am using the AWS Neptune version of Gremlin, so the solution must be compatible with it.

Answer 1:

A clone step doesn't exist (yet), but it can be solved with a single query.

Let's start with some sample data:

gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(4).valueMap(true)                                   // the vertex to be cloned
==>[label:person,name:[josh],age:[32],id:4]
gremlin> g.V(4).outE().map(union(identity(), valueMap()).fold()) // all out-edges
==>[e[10][4-created->5],[weight:1.0]]
==>[e[11][4-created->3],[weight:0.4]]
gremlin> g.V(4).inE().map(union(identity(), valueMap()).fold())  // all in-edges
==>[e[8][1-knows->4],[weight:1.0]]

Now the query to clone the vertex might look a bit scary at first glance, but it's really just the same pattern over and over again - jumping between the original and the clone to copy the properties and the edges:

g.V(4).as('source').
  addV().
    property(label, select('source').label()).as('clone').
  sideEffect(                                                // copy vertex properties
    select('source').properties().as('p').
    select('clone').
      property(select('p').key(), select('p').value())).
  sideEffect(                                                // copy out-edges
    select('source').outE().as('e').
    select('clone').
    addE(select('e').label()).as('eclone').
      to(select('e').inV()).
    select('e').properties().as('p').                        // copy out-edge properties
    select('eclone').
      property(select('p').key(), select('p').value())).
  sideEffect(                                                // copy in-edges
    select('source').inE().as('e').
    select('clone').
    addE(select('e').label()).as('eclone').
      from(select('e').outV()).
    select('e').properties().as('p').                        // copy in-edge properties
    select('eclone').
      property(select('p').key(), select('p').value()))

And in action it looks like this:

gremlin> g.V(4).as('source').
......1>   addV().
......2>     property(label, select('source').label()).as('clone').
......3>   sideEffect(
......4>     select('source').properties().as('p').
......5>     select('clone').
......6>       property(select('p').key(), select('p').value())).
......7>   sideEffect(
......8>     select('source').outE().as('e').
......9>     select('clone').
.....10>     addE(select('e').label()).as('eclone').
.....11>       to(select('e').inV()).
.....12>     select('e').properties().as('p').
.....13>     select('eclone').
.....14>       property(select('p').key(), select('p').value())).
.....15>   sideEffect(
.....16>     select('source').inE().as('e').
.....17>     select('clone').
.....18>     addE(select('e').label()).as('eclone').
.....19>       from(select('e').outV()).
.....20>     select('e').properties().as('p').
.....21>     select('eclone').
.....22>       property(select('p').key(), select('p').value()))
==>v[13]
gremlin> g.V(13).valueMap(true)                                   // the cloned vertex
==>[label:person,name:[josh],age:[32],id:13]
gremlin> g.V(13).outE().map(union(identity(), valueMap()).fold()) // all cloned out-edges
==>[e[16][13-created->5],[weight:1.0]]
==>[e[17][13-created->3],[weight:0.4]]
gremlin> g.V(13).inE().map(union(identity(), valueMap()).fold())  // all cloned in-edges
==>[e[18][1-knows->13],[weight:1.0]]
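
The question only asks for the outgoing edges. If vertex and edge properties are not needed, the query above can be trimmed to the following sketch. It is derived from the full query by dropping the property-copying and in-edge parts, and it has not been verified on Neptune:

```groovy
// Minimal variant: new vertex with the same label and the same out-edge targets.
// Vertex properties, edge properties and in-edges are NOT copied here.
g.V(4).as('source').
  addV().
    property(label, select('source').label()).as('clone').
  sideEffect(                                  // one new out-edge per original out-edge
    select('source').outE().as('e').
    select('clone').
    addE(select('e').label()).
      to(select('e').inV()))
```

As with the full query, the traversal returns the newly created vertex, so its id can be used for follow-up queries.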

UPDATE

Paging support is a little tricky. Let me split this whole thing into a 3-step process. I will use edge ids as the sort criterion and to identify the last processed edge (this might not work in Neptune, but you can use a unique sortable property instead).

// clone the vertex with its properties
clone = g.V(4).as('source').
  addV().
    property(label, select('source').label()).as('clone').
  sideEffect(
    select('source').properties().as('p').
    select('clone').
      property(select('p').key(), select('p').value())).next()

// clone out-edges
pageSize = 1
lastId = -1
while (true) {
  t = g.V(4).as('source').
    outE().hasId(gt(lastId)).
    order().by(id).limit(pageSize).as('e').
    group('x').
      by(constant('lastId')).
      by(id()).
    V(clone).
    addE(select('e').label()).as('eclone').
      to(select('e').inV()).
    sideEffect(
      select('e').properties().as('p').
      select('eclone').
        property(select('p').key(), select('p').value())).
    count()
  if (t.next() != pageSize)
    break
  lastId = t.getSideEffects().get('x').get('lastId')
}

// clone in-edges
lastId = -1
while (true) {
  t = g.V(4).as('source').
    inE().hasId(gt(lastId)).
    order().by(id).limit(pageSize).as('e').
    group('x').
      by(constant('lastId')).
      by(id()).
    V(clone).
    addE(select('e').label()).as('eclone').
      from(select('e').outV()).
    sideEffect(
      select('e').properties().as('p').
      select('eclone').
        property(select('p').key(), select('p').value())).
    count()
  if (t.next() != pageSize)
    break
  lastId = t.getSideEffects().get('x').get('lastId')
}

I don't know if Neptune allows you to execute full scripts - if not, you'll need to execute the outer while loops in your application's code.
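
If edge ids cannot be compared with gt() on your system, the out-edge loop could be paged on a property instead, as suggested above. The following is only a sketch under stated assumptions: 'seq' is a hypothetical edge property that is unique and sortable among the source vertex's out-edges (the sample graph does not have it), and clone is the vertex created in the first step. Instead of reading a side effect back, the traversal returns the 'seq' values of the edges it processed, which may be easier to handle over a remote connection:

```groovy
// ASSUMPTION: every out-edge of the source vertex carries a numeric 'seq' property
// that is unique per edge. The property name is made up for this example.
pageSize = 1
lastSeq = -1
while (true) {
  seqs = g.V(4).
    outE().has('seq', gt(lastSeq)).
    order().by('seq').limit(pageSize).as('e').
    V(clone).
    addE(select('e').label()).as('eclone').
      to(select('e').inV()).
    sideEffect(                                // copy edge properties
      select('e').properties().as('p').
      select('eclone').
        property(select('p').key(), select('p').value())).
    select('e').values('seq').toList()         // 'seq' of each original edge processed
  if (seqs.isEmpty())
    break
  lastSeq = seqs.max()                         // continue after the highest value seen
  if (seqs.size() < pageSize)
    break
}
```

The in-edge loop would follow the same pattern with inE() and from(select('e').outV()).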