This is a follow-up to another question I posted about using ETL to import a simple database into OrientDB. Both the vertices and the edges have properties, including a date on each.
Here's my data:
vertices.csv:
label,data,date
v01,0.1234,2015-01-01
v02,0.5678,2015-01-02
v03,0.9012,2015-01-03
edges.csv:
u,v,weight,date
v01,v02,12.4,2015-06-17
v02,v03,17.9,2015-09-14
For brevity, I'll add just the updated commonEdges.json file using the edits from the other question. The other JSON files are unchanged.
commonEdges.json:
{
  "begin": [ { "let": { "name": "$filePath", "expression": "$fileDirectory.append($fileName )" } } ],
  "config": { "log": "info" },
  "source": { "file": { "path": "$filePath" } },
  "extractor": {
    "csv": {
      "ignoreEmptyLines": true,
      "nullValue": "N/A",
      "dateFormat": "yyyy-mm-dd"
    }
  },
  "transformers": [
    { "merge": { "joinFieldName": "u", "lookup": "myVertex.label" } },
    { "edge": {
        "class": "myEdge",
        "joinFieldName": "v",
        "lookup": "myVertex.label",
        "edgeFields": { "weight": "${input.weight}", "date": "${input.date}" },
        "direction": "out",
        "unresolvedLinkAction": "NOTHING"
      }
    },
    { "field": { "fieldNames": ["u", "v"], "operation": "remove" } }
  ],
  "loader": {
    "orientdb": {
      "dbURL": "plocal:my_orientdb",
      "dbType": "graph",
      "batchCommit": 1000,
      "useLightweightEdges": false,
      "classes": [ { "name": "myEdge", "extends": "E" } ],
      "indexes": []
    }
  }
}
The date fields are still getting clobbered after I load the graph.
Here's the vertex table if I don't load the edges:
orientdb {db=my_orientdb}> SELECT FROM myVertex
+----+-----+--------+------+-------------------+-----+
|# |@RID |@CLASS |data |date |label|
+----+-----+--------+------+-------------------+-----+
|0 |#25:0|myVertex|0.1234|2015-01-01 00:01:00|v01 |
|1 |#26:0|myVertex|0.5678|2015-01-02 00:01:00|v02 |
|2 |#27:0|myVertex|0.9012|2015-01-03 00:01:00|v03 |
+----+-----+--------+------+-------------------+-----+
Everything looks right to me: the dates run from 2015-01-01 to 2015-01-03.
After I load the edges, though, the date fields are wrong:
orientdb {db=my_orientdb}> SELECT FROM myVertex
+----+-----+--------+------+-------------------+-----+------+----------+---------+
|# |@RID |@CLASS |data |date |label|weight|out_myEdge|in_myEdge|
+----+-----+--------+------+-------------------+-----+------+----------+---------+
|0 |#25:0|myVertex|0.1234|2015-01-17 00:06:00|v01 |12.4 |[#33:0] | |
|1 |#26:0|myVertex|0.5678|2015-01-14 00:09:00|v02 |17.9 |[#34:0] |[#33:0] |
|2 |#27:0|myVertex|0.9012|2015-01-03 00:01:00|v03 | | |[#34:0] |
+----+-----+--------+------+-------------------+-----+------+----------+---------+
The dates on the edges are also incorrect:
orientdb {db=my_orientdb}> SELECT FROM myEdge
+----+-----+------+-----+-----+------+-------------------+
|# |@RID |@CLASS|out |in |weight|date |
+----+-----+------+-----+-----+------+-------------------+
|0 |#33:0|myEdge|#25:0|#26:0|12.4 |2015-01-17 00:06:00|
|1 |#34:0|myEdge|#26:0|#27:0|17.9 |2015-01-14 00:09:00|
+----+-----+------+-----+-----+------+-------------------+
It looks like OrientDB is overwriting the day-of-month of the vertex dates that were already loaded with the day from the edge dates, while the month from the edge dates somehow ends up in the minute field. The month also reverts to January. The same wrong values show up on both the vertices and the edges.
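To narrow this down outside of OrientDB, here is a small standalone check I could run. It assumes the extractor hands the "dateFormat" string to Java's java.text.SimpleDateFormat, which I have not confirmed in the OrientDB source. The class name DateFormatCheck and the helper reparse are made up for this test. It parses one of my edge dates with the pattern from my config and with an uppercase-MM variant, then prints every field back out.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper class, not part of OrientDB.
public class DateFormatCheck {

    // Parse `text` with `pattern`, then format it with a full
    // date-and-time pattern so that every parsed field is visible.
    static String reparse(String pattern, String text) {
        SimpleDateFormat in = new SimpleDateFormat(pattern, Locale.ROOT);
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        // Use the same fixed time zone on both sides so the round trip
        // does not depend on the machine's default zone.
        TimeZone utc = TimeZone.getTimeZone("UTC");
        in.setTimeZone(utc);
        out.setTimeZone(utc);
        try {
            return out.format(in.parse(text));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        // Lowercase "mm" is minute-of-hour in SimpleDateFormat patterns.
        System.out.println(reparse("yyyy-mm-dd", "2015-06-17")); // 2015-01-17 00:06:00
        // Uppercase "MM" is month-of-year.
        System.out.println(reparse("yyyy-MM-dd", "2015-06-17")); // 2015-06-17 00:00:00
    }
}
```

The first line of output matches what I see in the myEdge table for the 2015-06-17 row, so if the assumption about SimpleDateFormat holds, the lowercase "mm" in my "dateFormat" would account for the month landing in the minute field.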
Is this just a bug with OrientDB or am I missing something in my ETL files?
Thanks in advance for any help or advice.