Tinkerpop/gremlin merge vertices (and edges)

Posted 2019-02-26 22:51

Question:

Is there an easy way to replace or merge vertices and keep/merge the existing edges? Or do I have to manually copy all properties from the vertex, recreate the existing edges with all their (meta-)properties, and then drop the superfluous vertex?

Answer 1:

Alright, as mentioned in the comments above, you're going to do the matching in OLTP. That means you'll likely have a concrete entry point. Let's make up a simple sample graph:

g = TinkerGraph.open().traversal()

// Stackoverflow data
g.addV("user").property("login", "user3508638").as("a").
  addV("user").property("login", "dkuppitz").property("age", 35).as("b").
  addV("question").property("title", "Tinkerpop/gremlin merge vertices (and edges)").as("c").
  addE("posted").from("a").to("c").
  addE("commented").from("b").to("c").property("time", 123).iterate()

// Github data
g.addV("user").property("login", "dkuppitz").property("name", "Daniel Kuppitz").as("a").
  addV("project").property("title", "TinkerPop").as("b").
  addE("contributed").from("a").to("b").iterate()
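
Before merging, it can help to confirm which logins actually occur on more than one vertex. The following is only a minimal sketch, assuming the Gremlin Console (which pre-imports TinkerGraph and the anonymous traversal steps). The scratch traversal source h and the variable duplicates are names made up for this illustration, so the sample graph in g stays untouched:

```groovy
// Scratch graph (hypothetical name h) holding only the three user vertices.
h = TinkerGraph.open().traversal()
h.addV("user").property("login", "user3508638").
  addV("user").property("login", "dkuppitz").
  addV("user").property("login", "dkuppitz").iterate()

// Count user vertices per login, then keep only logins seen more than once.
duplicates = h.V().hasLabel("user").
  group().by("login").by(count()).
  unfold().filter(select(values).is(gt(1))).
  select(keys).toList()
```

Each login in duplicates is then a candidate entry point for a merge.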

To match vertices based on login dkuppitz and merge them into a single user vertex:

g.V().has("login", "dkuppitz").
  fold().filter(count(local).is(gt(1))).unfold().
  sideEffect(properties().group("p").by(key).by(value())).
  sideEffect(outE().group("o").by(label).by(project("p","iv").by(valueMap()).by(inV()).fold())).
  sideEffect(inE().group("i").by(label).by(project("p","ov").by(valueMap()).by(outV()).fold())).
  sideEffect(drop()).
  cap("p","o","i").as("poi").
  addV("user").as("u").
  sideEffect(
    select("poi").select("p").unfold().as("kv").
    select("u").property(select("kv").select(keys), select("kv").select(values))).
  sideEffect(
    select("poi").select("o").unfold().as("x").
    select("u").sideEffect { u ->
      u.path("x").getValue().each { x ->
        def e = u.get().addEdge(u.path("x").getKey(), x.get("iv"))
        x.get("p").each { p ->
          e.property(p.getKey(), p.getValue())
        }
      }
    }).
  sideEffect(
    select("poi").select("i").unfold().as("x").
    select("u").sideEffect { u ->
      u.path("x").getValue().each { x ->
        def e = x.get("ov").addEdge(u.path("x").getKey(), u.get())
        x.get("p").each { p ->
          e.property(p.getKey(), p.getValue())
        }
      }
    }).iterate()

I know, the query is crazy complicated, especially with the deeply nested lambdas. But unfortunately there's no way around the lambdas, since we don't have an addE(<traversal>) overload (I created a ticket though). Anyway, after executing the query above, the graph looks like this:

gremlin> g.V().valueMap()
==>[login:[user3508638]]
==>[title:[Tinkerpop/gremlin merge vertices (and edges)]]
==>[title:[TinkerPop]]
==>[name:[Daniel Kuppitz],login:[dkuppitz],age:[35]]
gremlin> g.V().has("login", "dkuppitz").bothE()
==>e[19][15-commented->5]
==>e[20][15-contributed->12]
gremlin> g.V().has("login", "dkuppitz").bothE().valueMap(true)
==>[label:commented,time:123,id:19]
==>[label:contributed,id:20]

Both dkuppitz vertices were merged into one (name and age properties are present) and the 2 edges were recreated accordingly.
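
To see what the first half of the merge query collects before anything is dropped, the collection steps can be run on their own. This is just a sketch under the same Gremlin Console assumption. It uses a small scratch graph (h and collected are made-up names) and leaves out drop() and addV(), so nothing is modified:

```groovy
// Scratch graph: two duplicate users, each with one outgoing edge.
h = TinkerGraph.open().traversal()
h.addV("user").property("login", "dkuppitz").property("age", 35).as("a").
  addV("user").property("login", "dkuppitz").property("name", "Daniel Kuppitz").as("b").
  addV("question").property("title", "merge vertices").as("c").
  addV("project").property("title", "TinkerPop").as("d").
  addE("commented").from("a").to("c").property("time", 123).
  addE("contributed").from("b").to("d").iterate()

// Same collection steps as the merge query, minus drop() and addV().
collected = h.V().has("login", "dkuppitz").
  fold().filter(count(local).is(gt(1))).unfold().
  sideEffect(properties().group("p").by(key).by(value())).
  sideEffect(outE().group("o").by(label).by(project("p","iv").by(valueMap()).by(inV()).fold())).
  sideEffect(inE().group("i").by(label).by(project("p","ov").by(valueMap()).by(outV()).fold())).
  cap("p","o","i").next()

collected.get("p")   // vertex properties keyed by property name
collected.get("o")   // outgoing edges grouped by label, each with its properties and target vertex
collected.get("i")   // incoming edges grouped by label (the users here have none)
```

These three side-effects are what the second half of the query replays onto the newly created user vertex.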

UPDATE:

With TINKERPOP-1793 we can get rid of all lambdas:

g.V().has("login", "dkuppitz").
  fold().filter(count(local).is(gt(1))).unfold().
  sideEffect(properties().group("p").by(key).by(value())).
  sideEffect(outE().group("o").by(label).by(project("p","iv").by(valueMap()).by(inV()).fold())).
  sideEffect(inE().group("i").by(label).by(project("p","ov").by(valueMap()).by(outV()).fold())).
  sideEffect(drop()).
  cap("p","o","i").as("poi").
  addV("user").as("u").
  sideEffect(
    select("poi").select("p").unfold().as("kv").
    select("u").property(select("kv").select(keys), select("kv").select(values))).
  sideEffect(
    select("poi").select("o").unfold().as("x").select(values).
    unfold().addE(select("x").select(keys)).from(select("u")).to(select("iv"))).
  sideEffect(
    select("poi").select("i").unfold().as("x").select(values).
    unfold().addE(select("x").select(keys)).from(select("ov")).to(select("u"))).iterate()
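
If the merge is needed for more than one login, the lambda-free query can be wrapped in a closure. The following is a sketch only: mergeVertices, its parameter names and the scratch traversal source h are hypothetical, and it assumes the Gremlin Console with a TinkerPop release that includes TINKERPOP-1793. Reading the lambda-free query, the collected edge properties (the "p" entry of each projected map) do not seem to be applied to the recreated edges, so the checks at the end only count vertices and edges:

```groovy
// Scratch graph so the sample graph in g is left alone.
h = TinkerGraph.open().traversal()
h.addV("user").property("login", "dkuppitz").property("age", 35).as("a").
  addV("user").property("login", "dkuppitz").property("name", "Daniel Kuppitz").as("b").
  addV("question").property("title", "merge vertices").as("c").
  addV("project").property("title", "TinkerPop").as("d").
  addE("commented").from("a").to("c").
  addE("contributed").from("b").to("d").iterate()

// Hypothetical helper (not part of TinkerPop): merges all vertices matching
// propKey == propValue into one new vertex labelled vertexLabel.
mergeVertices = { ts, vertexLabel, propKey, propValue ->
  ts.V().has(propKey, propValue).
    fold().filter(count(local).is(gt(1))).unfold().
    sideEffect(properties().group("p").by(key).by(value())).
    sideEffect(outE().group("o").by(label).by(project("p","iv").by(valueMap()).by(inV()).fold())).
    sideEffect(inE().group("i").by(label).by(project("p","ov").by(valueMap()).by(outV()).fold())).
    sideEffect(drop()).
    cap("p","o","i").as("poi").
    addV(vertexLabel).as("u").
    sideEffect(
      select("poi").select("p").unfold().as("kv").
      select("u").property(select("kv").select(keys), select("kv").select(values))).
    sideEffect(
      select("poi").select("o").unfold().as("x").select(values).
      unfold().addE(select("x").select(keys)).from(select("u")).to(select("iv"))).
    sideEffect(
      select("poi").select("i").unfold().as("x").select(values).
      unfold().addE(select("x").select(keys)).from(select("ov")).to(select("u"))).iterate()
}

mergeVertices(h, "user", "login", "dkuppitz")

// One merged user should remain, carrying both original edges.
assert h.V().has("login", "dkuppitz").count().next() == 1L
assert h.V().has("login", "dkuppitz").bothE().count().next() == 2L
```

The parameter names deliberately avoid key and label, which the query itself uses as the T.key and T.label tokens.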