Does Gremlin provide a way to clone a vertex? For instance, given
v1->v2, v1->v3, v1->v4
how can I simply and efficiently create a new vertex v5 that also has edges pointing to v2, v3 and v4 (the same places that v1's edges point to) without having to set them explicitly, instead saying something like g.createV(v1).clone(v2)?
Note that I am using the AWS Neptune version of Gremlin, so the solution must be compatible with that.
A clone step doesn't exist (yet), but it can be solved with a single query.
Let's start with some sample data:
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(4).valueMap(true) // the vertex to be cloned
==>[label:person,name:[josh],age:[32],id:4]
gremlin> g.V(4).outE().map(union(identity(), valueMap()).fold()) // all out-edges
==>[e[10][4-created->5],[weight:1.0]]
==>[e[11][4-created->3],[weight:0.4]]
gremlin> g.V(4).inE().map(union(identity(), valueMap()).fold()) // all in-edges
==>[e[8][1-knows->4],[weight:1.0]]
Now the query to clone the vertex might look a bit scary at first glance, but it's really just the same pattern repeated, jumping between the original and the clone to copy properties and edges:
g.V(4).as('source').
  addV().
    property(label, select('source').label()).as('clone').
  sideEffect(   // copy vertex properties
    select('source').properties().as('p').
    select('clone').
      property(select('p').key(), select('p').value())).
  sideEffect(   // copy out-edges
    select('source').outE().as('e').
    select('clone').
    addE(select('e').label()).as('eclone').
      to(select('e').inV()).
    select('e').properties().as('p').   // copy out-edge properties
    select('eclone').
      property(select('p').key(), select('p').value())).
  sideEffect(   // copy in-edges
    select('source').inE().as('e').
    select('clone').
    addE(select('e').label()).as('eclone').
      from(select('e').outV()).
    select('e').properties().as('p').   // copy in-edge properties
    select('eclone').
      property(select('p').key(), select('p').value()))
And in action it looks like this:
gremlin> g.V(4).as('source').
......1> addV().
......2> property(label, select('source').label()).as('clone').
......3> sideEffect(
......4> select('source').properties().as('p').
......5> select('clone').
......6> property(select('p').key(), select('p').value())).
......7> sideEffect(
......8> select('source').outE().as('e').
......9> select('clone').
.....10> addE(select('e').label()).as('eclone').
.....11> to(select('e').inV()).
.....12> select('e').properties().as('p').
.....13> select('eclone').
.....14> property(select('p').key(), select('p').value())).
.....15> sideEffect(
.....16> select('source').inE().as('e').
.....17> select('clone').
.....18> addE(select('e').label()).as('eclone').
.....19> from(select('e').outV()).
.....20> select('e').properties().as('p').
.....21> select('eclone').
.....22> property(select('p').key(), select('p').value()))
==>v[13]
gremlin> g.V(13).valueMap(true) // the cloned vertex
==>[label:person,name:[josh],age:[32],id:13]
gremlin> g.V(13).outE().map(union(identity(), valueMap()).fold()) // all cloned out-edges
==>[e[16][13-created->5],[weight:1.0]]
==>[e[17][13-created->3],[weight:0.4]]
gremlin> g.V(13).inE().map(union(identity(), valueMap()).fold()) // all cloned in-edges
==>[e[18][1-knows->13],[weight:1.0]]
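If the clone is needed in more than one place, the query can be wrapped in a small helper. The following is only a sketch for the Gremlin Console (which pre-imports the TinkerPop static methods used here); cloneVertex and copy are hypothetical names, not part of TinkerPop, and the traversal inside is the one shown above. The two comparisons at the end check that the clone ends up with the same neighbours as the source.

```groovy
// Hypothetical helper: clones the vertex with the given id and returns the new vertex.
cloneVertex = { g, sourceId ->
  g.V(sourceId).as('source').
    addV().
      property(label, select('source').label()).as('clone').
    sideEffect(   // copy vertex properties
      select('source').properties().as('p').
      select('clone').
        property(select('p').key(), select('p').value())).
    sideEffect(   // copy out-edges and their properties
      select('source').outE().as('e').
      select('clone').
      addE(select('e').label()).as('eclone').
        to(select('e').inV()).
      select('e').properties().as('p').
      select('eclone').
        property(select('p').key(), select('p').value())).
    sideEffect(   // copy in-edges and their properties
      select('source').inE().as('e').
      select('clone').
      addE(select('e').label()).as('eclone').
        from(select('e').outV()).
      select('e').properties().as('p').
      select('eclone').
        property(select('p').key(), select('p').value())).
    next()
}

g = TinkerFactory.createModern().traversal()
copy = cloneVertex(g, 4)

// both comparisons should evaluate to true if the clone has the same neighbours
g.V(copy).out().id().order().toList() == g.V(4).out().id().order().toList()
g.V(copy).in().id().order().toList() == g.V(4).in().id().order().toList()
```

The helper only builds and runs a traversal, so it carries the same caveats as the single query: it copies every edge in one go and may be slow for vertices with very many edges.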
UPDATE
Paging support is a little tricky. Let me split this whole thing into a 3-step process. I will use edge ids as the sort criterion and to identify the last processed edge (this might not work in Neptune, but you can use a unique sortable property instead).
// clone the vertex with its properties
clone = g.V(4).as('source').
  addV().
    property(label, select('source').label()).as('clone').
  sideEffect(
    select('source').properties().as('p').
    select('clone').
      property(select('p').key(), select('p').value())).next()
// clone out-edges
pageSize = 1
lastId = -1
while (true) {
  t = g.V(4).as('source').
        outE().hasId(gt(lastId)).
        order().by(id).limit(pageSize).as('e').
        group('x').
          by(constant('lastId')).
          by(id().max()).   // remember the highest edge id on this page
        V(clone).
        addE(select('e').label()).as('eclone').
          to(select('e').inV()).
        sideEffect(
          select('e').properties().as('p').
          select('eclone').
            property(select('p').key(), select('p').value())).
        count()
  if (t.next() != pageSize)
    break
  lastId = t.getSideEffects().get('x').get('lastId')
}
// clone in-edges
lastId = -1
while (true) {
  t = g.V(4).as('source').
        inE().hasId(gt(lastId)).
        order().by(id).limit(pageSize).as('e').
        group('x').
          by(constant('lastId')).
          by(id().max()).   // remember the highest edge id on this page
        V(clone).
        addE(select('e').label()).as('eclone').
          from(select('e').outV()).   // an in-edge starts at the other vertex
        sideEffect(
          select('e').properties().as('p').
          select('eclone').
            property(select('p').key(), select('p').value())).
        count()
  if (t.next() != pageSize)
    break
  lastId = t.getSideEffects().get('x').get('lastId')
}
I don't know if Neptune allows you to execute full scripts - if not, you'll need to execute the outer while loops in your application's code.
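For stores where edge ids are not sortable (the caveat mentioned above for Neptune), the same loop could page on a property instead. The sketch below assumes every edge carries a unique, sortable property; the name 'seq' is made up for illustration and is assigned by hand to the sample edges. It also differs from the loops above in one respect: rather than reading a side-effect, it returns the sort keys of the processed page from the traversal itself, which keeps the loop free of getSideEffects(). Property copying is left out to keep the paging logic visible.

```groovy
// Gremlin Console sketch against the TinkerGraph sample data
g = TinkerFactory.createModern().traversal()

// ASSUMPTION: each edge has a unique, sortable property. 'seq' is a made-up
// name, assigned by hand here to the two out-edges of vertex 4.
g.E(10).property('seq', 1).iterate()
g.E(11).property('seq', 2).iterate()

// stand-in for the clone created in step 1 (vertex property copying omitted)
clone = g.addV('person').next()

pageSize = 1
lastSeq = 0   // must be smaller than every 'seq' value
while (true) {
  seqs = g.V(4).outE().has('seq', gt(lastSeq)).
           order().by('seq').limit(pageSize).as('e').
           V(clone).
           addE(select('e').label()).
             to(select('e').inV()).
           select('e').values('seq').   // hand the sort key back to the caller
           toList()
  if (seqs.isEmpty())
    break
  lastSeq = seqs.max()       // continue after the last edge of this page
  if (seqs.size() < pageSize)
    break
}
```

The in-edge loop would follow the same shape, with inE() in place of outE() and from(select('e').outV()) in place of to(select('e').inV()).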