OrientDB ETL Edge transformer 2 joinFieldName(s)

Posted 2019-05-24 14:33

Question:

With one joinFieldName and lookup, the Edge transformer works perfectly. However, two keys are now required, i.e. a compound index in the lookup. How can two joinFieldNames be specified?

This is the scripted (post-processing) version: CREATE EDGE Expands FROM (SELECT FROM MC WHERE sample=1 AND mkey=6) TO (SELECT FROM Event WHERE sample=1 AND mcl=6).

This works, but is not suitable for production.
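
For reference, the post-processing statement generalizes to any (sample, key) pair. The following is a minimal Python sketch, assuming both values are integers as in the example; the helper name `create_edge_sql` is hypothetical and is not part of OrientDB.

```python
def create_edge_sql(sample: int, key: int) -> str:
    """Build the CREATE EDGE statement from the question for one (sample, key) pair.

    Class and field names (Expands, MC, Event, sample, mkey, mcl) are taken from
    the question. int() guards against non-numeric input ending up in the SQL text.
    """
    sample, key = int(sample), int(key)
    return (
        "CREATE EDGE Expands "
        f"FROM (SELECT FROM MC WHERE sample = {sample} AND mkey = {key}) "
        f"TO (SELECT FROM Event WHERE sample = {sample} AND mcl = {key})"
    )


if __name__ == "__main__":
    # Reproduces the statement quoted in the question.
    print(create_edge_sql(1, 6))
```

This only builds the statement text; running one such statement per key pair is the post-processing approach the question is trying to avoid.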

Can anyone help?

Answer 1:

You can simply add two joinFieldName(s), like this:

{ "edge": { "class": "Conn",
                "joinFieldName": "b1",
                "lookup": "A.a1",
                "joinFieldName": "b2",
                "lookup": "A.a2",
                "direction": "out"
            }}
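
One caveat worth verifying: the fragment repeats the keys "joinFieldName" and "lookup" inside a single JSON object, and many JSON parsers keep only the last value for a repeated key. The sketch below shows that behavior with Python's standard `json` module. Whether OrientDB's ETL configuration parser treats repeated keys the same way is an assumption that has not been checked here.

```python
import json

# Edge transformer fragment from the answer, with the repeated keys kept as written.
config_text = """
{ "edge": { "class": "Conn",
            "joinFieldName": "b1",
            "lookup": "A.a1",
            "joinFieldName": "b2",
            "lookup": "A.a2",
            "direction": "out" } }
"""

# Default parsing builds a dict, which cannot hold duplicate keys:
# Python's json module keeps the last value for each repeated name.
parsed = json.loads(config_text)

# object_pairs_hook=list preserves every key/value pair in order, repeats included.
all_pairs = json.loads(config_text, object_pairs_hook=list)
edge_pairs = all_pairs[0][1]  # pairs of the inner "edge" object
join_fields = [value for key, value in edge_pairs if key == "joinFieldName"]

if __name__ == "__main__":
    print(parsed["edge"]["joinFieldName"])  # b2
    print(parsed["edge"]["lookup"])         # A.a2
    print(join_fields)                      # ['b1', 'b2']
```

If the ETL parser behaves like the default case, only the b2 to A.a2 join would take effect, so it is worth testing with data in which neither key is unique on its own.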

See my test data below:

json1.json

{
  "source": { "file": { "path": "/home/ivan/Scrivania/cose/etl/stak39517796/data1.csv" } },
  "extractor": { "csv": {} },
  "transformers": [
    { "vertex": { "class": "A" } }
  ],
  "loader": {
    "orientdb": {
       "dbURL": "plocal:/home/ivan/OrientDB/db_installati/enterprise/orientdb-enterprise-2.2.10/databases/stack39517796",
       "dbType": "graph",
       "dbAutoCreate": true,
       "classes": [
         {"name": "A", "extends": "V"},
         {"name": "B", "extends": "V"},
         {"name": "Conn", "extends": "E"}
       ]
    }
  }
}

json2.json

{
  "source": { "file": { "path": "/home/ivan/Scrivania/cose/etl/stak39517796/data2.csv" } },
  "extractor": { "csv": {} },
  "transformers": [
    { "vertex": { "class": "B" } },
    { "edge": { "class": "Conn",
                "joinFieldName": "b1",
                "lookup": "A.a1",
                "joinFieldName": "b2",
                "lookup": "A.a2",
                "direction": "out"
            }}
  ],
  "loader": {
    "orientdb": {
       "dbURL": "plocal:/home/ivan/OrientDB/db_installati/enterprise/orientdb-enterprise-2.2.10/databases/stack39517796",
       "dbType": "graph",
       "dbAutoCreate": true,
       "classes": [
         {"name": "A", "extends": "V"},
         {"name": "B", "extends": "V"},
         {"name": "Conn", "extends": "E"}
       ]
    }
  }
}

data1.csv

a1,a2
1,1
1,2
2,3

data2.csv

b1,b2
1,1
2,3
1,2

Execution order:

  1. json1
  2. json2

And here is the final result:

orientdb {db=stack39517796}> select from v                                        

+----+-----+------+----+----+-------+----+----+--------+
|#   |@RID |@CLASS|a1  |a2  |in_Conn|b2  |b1  |out_Conn|
+----+-----+------+----+----+-------+----+----+--------+
|0   |#17:0|A     |1   |1   |[#25:0]|    |    |        |
|1   |#18:0|A     |1   |2   |[#27:0]|    |    |        |
|2   |#19:0|A     |2   |3   |[#26:0]|    |    |        |
|3   |#21:0|B     |    |    |       |1   |1   |[#25:0] |
|4   |#22:0|B     |    |    |       |3   |2   |[#26:0] |
|5   |#23:0|B     |    |    |       |2   |1   |[#27:0] |
+----+-----+------+----+----+-------+----+----+--------+
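
The edges in this result can be cross-checked against the two CSV files outside OrientDB. The following is a minimal Python sketch (the helper names `rows` and `join` are hypothetical) that matches B rows to A rows on both keys, and then on each key alone for comparison.

```python
import csv
import io

# Contents of data1.csv (class A) and data2.csv (class B) from the answer.
data1 = "a1,a2\n1,1\n1,2\n2,3\n"
data2 = "b1,b2\n1,1\n2,3\n1,2\n"


def rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


a_rows = rows(data1)
b_rows = rows(data2)


def join(b_keys, a_keys):
    """Return (b_index, a_index) pairs whose key tuples are equal."""
    pairs = []
    for i, b in enumerate(b_rows):
        for j, a in enumerate(a_rows):
            if tuple(b[k] for k in b_keys) == tuple(a[k] for k in a_keys):
                pairs.append((i, j))
    return pairs


compound = join(["b1", "b2"], ["a1", "a2"])  # match on both keys
only_second = join(["b2"], ["a2"])           # match on the second key only
only_first = join(["b1"], ["a1"])            # match on the first key only

if __name__ == "__main__":
    print(compound)     # [(0, 0), (1, 2), (2, 1)]
    print(only_second)  # [(0, 0), (1, 2), (2, 1)]
    print(only_first)   # [(0, 0), (0, 1), (1, 2), (2, 0), (2, 1)]
```

The compound join gives the same three pairs as the Conn edges in the table. Because a2 is unique in this data set, joining on b2 = a2 alone gives the same pairs, so this test data cannot by itself tell a true compound join apart from a join on the second key only.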


Tags: orientdb etl