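This is a follow-up to another question I posted about using ETL to import a simple database into OrientDB. Here both the edges and the vertices have properties, and both carry a date.
Here is my data:
vertices.csv: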
label,data,date
v01,0.1234,2015-01-01
v02,0.5678,2015-01-02
v03,0.9012,2015-01-03
edges.csv:
u,v,weight,date
v01,v02,12.4,2015-06-17
v02,v03,17.9,2015-09-14
For brevity, I am reusing the edited contents of the other question and adding only the updated commonEdges.json file. The other JSON files are unchanged.
commonEdges.json:
{
  "begin": [ { "let": { "name": "$filePath", "expression": "$fileDirectory.append($fileName)" } } ],
  "config": { "log": "info" },
  "source": { "file": { "path": "$filePath" } },
  "extractor": {
    "csv": {
      "ignoreEmptyLines": true,
      "nullValue": "N/A",
      "dateFormat": "yyyy-mm-dd"
    }
  },
  "transformers": [
    { "merge": { "joinFieldName": "u", "lookup": "myVertex.label" } },
    { "edge": {
        "class": "myEdge",
        "joinFieldName": "v",
        "lookup": "myVertex.label",
        "edgeFields": { "weight": "${input.weight}", "date": "${input.date}" },
        "direction": "out",
        "unresolvedLinkAction": "NOTHING"
      }
    },
    { "field": { "fieldNames": ["u", "v"], "operation": "remove" } }
  ],
  "loader": {
    "orientdb": {
      "dbURL": "plocal:my_orientdb",
      "dbType": "graph",
      "batchCommit": 1000,
      "useLightweightEdges": false,
      "classes": [ { "name": "myEdge", "extends": "E" } ],
      "indexes": []
    }
  }
}
After loading the graph, the date fields are still corrupted.
If I do not load the edges, this is the vertex table:
orientdb {db=my_orientdb}> SELECT FROM myVertex
+----+-----+--------+------+-------------------+-----+
|# |@RID |@CLASS |data |date |label|
+----+-----+--------+------+-------------------+-----+
|0 |#25:0|myVertex|0.1234|2015-01-01 00:01:00|v01 |
|1 |#26:0|myVertex|0.5678|2015-01-02 00:01:00|v02 |
|2 |#27:0|myVertex|0.9012|2015-01-03 00:01:00|v03 |
+----+-----+--------+------+-------------------+-----+
Everything looks fine here; the dates run from 1/1/15 to 1/3/15.
After I load the edges, the date fields are wrong:
orientdb {db=my_orientdb}> SELECT FROM myVertex
+----+-----+--------+------+-------------------+-----+------+----------+---------+
|# |@RID |@CLASS |data |date |label|weight|out_myEdge|in_myEdge|
+----+-----+--------+------+-------------------+-----+------+----------+---------+
|0 |#25:0|myVertex|0.1234|2015-01-17 00:06:00|v01 |12.4 |[#33:0] | |
|1 |#26:0|myVertex|0.5678|2015-01-14 00:09:00|v02 |17.9 |[#34:0] |[#33:0] |
|2 |#27:0|myVertex|0.9012|2015-01-03 00:01:00|v03 | | |[#34:0] |
+----+-----+--------+------+-------------------+-----+------+----------+---------+
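The new weight column and the changed dates on v01 and v02 would be consistent with the merge transformer copying every field of the edge row into the vertex it looks up by label. The sketch below only models that idea with plain Java maps; the `merge` helper is hypothetical, and the assumption that same-named fields get overwritten is my guess, not something I have confirmed against OrientDB.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MergeSketch {
    // Hypothetical stand-in for the ETL merge step (an assumption, not OrientDB code):
    // every field of the input row is copied onto the looked-up record,
    // so a same-named field such as "date" is overwritten.
    static Map<String, Object> merge(Map<String, Object> vertex, Map<String, Object> edgeRow) {
        Map<String, Object> merged = new LinkedHashMap<>(vertex);
        merged.putAll(edgeRow);
        return merged;
    }

    // Vertex v01 as loaded from vertices.csv.
    static Map<String, Object> sampleVertex() {
        Map<String, Object> v = new LinkedHashMap<>();
        v.put("label", "v01");
        v.put("data", 0.1234);
        v.put("date", "2015-01-01");
        return v;
    }

    // First row of edges.csv, whose "u" field matches the label of v01.
    static Map<String, Object> sampleEdgeRow() {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("u", "v01");
        row.put("v", "v02");
        row.put("weight", 12.4);
        row.put("date", "2015-06-17");
        return row;
    }

    public static void main(String[] args) {
        // The merged record carries the edge's weight and the edge's date.
        System.out.println(merge(sampleVertex(), sampleEdgeRow()));
    }
}
```

If that is what happens, the vertex's own date is lost as soon as an edge row with the same field name is merged into it, which would match v03 (no outgoing edge) being the only vertex whose date is unchanged.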
The dates on the edges are also incorrect:
orientdb {db=my_orientdb}> SELECT FROM myEdge
+----+-----+------+-----+-----+------+-------------------+
|# |@RID |@CLASS|out |in |weight|date |
+----+-----+------+-----+-----+------+-------------------+
|0 |#33:0|myEdge|#25:0|#26:0|12.4 |2015-01-17 00:06:00|
|1 |#34:0|myEdge|#26:0|#27:0|17.9 |2015-01-14 00:09:00|
+----+-----+------+-----+-----+------+-------------------+
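It looks like OrientDB is overwriting the day of the month in the dates that were already loaded, and the month of each edge date is somehow ending up in the minutes field. It shows up this way for both the vertices and the edges.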
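One thing I have not ruled out is the date pattern itself. Assuming the ETL `dateFormat` string is interpreted with Java `SimpleDateFormat` pattern letters (an assumption on my part), lowercase `mm` means minutes and uppercase `MM` means month. The sketch below is standalone Java, not OrientDB code; `DatePatternDemo` and `reformat` are names I made up for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

public class DatePatternDemo {
    // Parse `text` with `pattern`, then print it with an unambiguous pattern.
    // UTC is fixed on both formatters so the result does not depend on the local zone.
    static String reformat(String pattern, String text) {
        SimpleDateFormat in = new SimpleDateFormat(pattern, Locale.ROOT);
        in.setTimeZone(TimeZone.getTimeZone("UTC"));
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        out.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return out.format(in.parse(text));
        } catch (ParseException e) {
            throw new IllegalArgumentException("cannot parse " + text, e);
        }
    }

    public static void main(String[] args) {
        // Lowercase mm: "06" is read as minutes, the month falls back to January.
        System.out.println(reformat("yyyy-mm-dd", "2015-06-17")); // 2015-01-17 00:06:00
        // Uppercase MM: "06" is read as the month.
        System.out.println(reformat("yyyy-MM-dd", "2015-06-17")); // 2015-06-17 00:00:00
    }
}
```

The first line of output matches what the myEdge table shows for the first edge, so the `"dateFormat": "yyyy-mm-dd"` setting in commonEdges.json may be worth testing with `"yyyy-MM-dd"` instead.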
Is this just a bug in OrientDB, or am I missing something in my ETL file?
Thanks in advance for any help or advice.