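I am broadening my horizons and am currently playing around with Neo4j. After following a few courses on Udemy, I figured I would know enough to load a custom dataset :)
I want to load a PlayerUnknown's Battlegrounds dataset. Source: https://www.kaggle.com/skihikingkevin/pubg-match-deaths Dataset: kill_match_stats_final_0.csv
Making the data readable: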
LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_final_0.csv' AS row
WITH row.killed_by AS MurderWeapon, row.killer_name AS Murderer, toInteger(row.killer_placement) AS RankMurderer,
row.killer_position_x AS MurderPositionX, row.killer_position_y AS MurderPositionY,
row.map AS Map, row.match_id AS MatchID, toInteger(row.time) AS TimeOfDeathSec,
row.victim_name AS Victim, toInteger(row.victim_placement) AS RankVictim,
row.victim_position_x AS VictimPositionX, row.victim_position_y AS VictimPositionY
RETURN MurderWeapon, Murderer, RankMurderer, MurderPositionX, MurderPositionY, Map, MatchID, TimeOfDeathSec, Victim, RankVictim, VictimPositionX, VictimPositionY
LIMIT 5;
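My idea is to create 2 nodes, Murderer and Victim, both with the Player label. The edge between them will be Killed. (Image: Node-edge-schematic)
When I try to load the dataset I get the error: Cannot merge the following node because of null property value for 'name': (:Player {name: null})
At first I thought the in-place integer conversion was the problem, so I removed those conversions, but that did not solve it. This is the statement I am trying to run: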
LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_final_0.csv' AS row
WITH row
MERGE (Murderer:Player{name:row.killer_name, rank:row.killer_placement})
MERGE (Victim:Player{name:row.victim_name, rank:row.victim_placement})
MERGE (Murderer)-[killed:Killed{
`Killed With`:row.killed_by,
`KillerX`:row.killer_position_x,
`KillerY`:row.killer_position_y,
`Map`:row.map,
`MatchID`:row.match_id,
`Time of Death`:row.time,
`VictimX`:row.victim_position_x,
`VictimY`:row.victim_position_y
}]->(Victim)
;
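I have the feeling it is staring me in the face, but I can't see it :P
Question: What is wrong with my statement for loading the csv file?
Instead of downloading the large file, you can download a short version here: https://storage.stijvehark.nl/s/OmdSL2oljVIyG2hx
Update 1
After the comment from @Graphileon I gained new insight into the data. I had assumed (yes, I know....) that all columns contain data. I used his script and it ran fine. So I tried this: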
LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_smalll_batch.csv' AS row
WITH row
RETURN row
That looked fine too. Inspecting the results from the dataset, I found:
{
"killer_name": null,
"victim_position_y": "0.0",
"victim_position_x": "0.0",
"killer_position_x": null,
"victim_placement": "26.0",
"killer_position_y": null,
"match_id": "2U4GBNA0YmnLSqvEycnTjo-KT000vfUnhSA2vfVhVPe1QBwCTNTBJ5B_1Ocel6nY",
"victim_name": "xuezhiqian717",
"killed_by": "Bluezone",
"killer_placement": null,
"time": "879",
"map": "MIRAMAR"
}
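To see how common such rows are, a quick count per cause of death might help. This is only a sketch: it assumes the same small batch file and that an empty killer_name always arrives as null. The aliases cause and deaths are made up for illustration.

```cypher
// Count the rows that have no killer, grouped by what killed the victim
LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_smalll_batch.csv' AS row
WITH row
WHERE row.killer_name IS NULL
RETURN row.killed_by AS cause, count(*) AS deaths
ORDER BY deaths DESC;
```

If this returns any rows at all, a MERGE on name: row.killer_name would hit the null-property error for each of them.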
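I was already curious how the data would be presented when, for example, you kill yourself, fall to your death, or blow yourself up with a grenade. I will look into that later. Regarding your suggestion: I like your suggestion for the Player nodes. I will try to work with it.
Update 2
It took some head-scratching, but I managed to import all players with: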
// Add constraint
CREATE CONSTRAINT ON (p:Player) ASSERT p.name IS UNIQUE;
// Create nodes:
LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_smalll_batch.csv' AS row
WITH row
MERGE (:Player{name:
CASE WHEN row.killer_name IS NOT NULL
THEN row.killer_name
ELSE 'System-' + row.killed_by END
})
MERGE (:Player{name:
CASE WHEN row.victim_name IS NOT NULL
THEN row.victim_name
ELSE 'System-' + row.killed_by END
})
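This imported all players. For rows without a player name, such as players who were killed by the bluezone or who fell to their death, I added a user named 'System-' followed by the cause of death (for example 'System-Bluezone').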
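A more compact way to write the same fallback may be coalesce(), which returns its first non-null argument. This is a sketch that I have not run against the full dataset. The aliases killerName and victimName are my own, and it assumes killed_by is always filled in.

```cypher
LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_smalll_batch.csv' AS row
// Fall back to 'System-<cause>' when a name is missing.
// Assumption: killed_by is never null; if it were, the fallback would be
// null as well and MERGE would fail again.
WITH coalesce(row.killer_name, 'System-' + row.killed_by) AS killerName,
     coalesce(row.victim_name, 'System-' + row.killed_by) AS victimName
MERGE (:Player {name: killerName})
MERGE (:Player {name: victimName});
```

This should produce the same nodes as the CASE version, just with less repetition.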
Now for creating the edges:
// Create edges:
LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_smalll_batch.csv' AS row
WITH row
MERGE (Player)-[killed:Killed{
`Killed With`:row.killed_by,
`KillerX`:
CASE WHEN row.killer_position_x IS NOT NULL
THEN row.killer_position_x
ELSE '0' END,
`KillerY`:
CASE WHEN row.killer_position_y IS NOT NULL
THEN row.killer_position_y
ELSE '0' END,
`Map`:row.map,
`MatchID`:row.match_id,
`Time of Death`:row.time,
`VictimX`:
CASE WHEN row.victim_position_x IS NOT NULL
THEN row.victim_position_x
ELSE '0' END,
`VictimY`:
CASE WHEN row.victim_position_y IS NOT NULL
THEN row.victim_position_y
ELSE '0' END
}]->(Player)
This did not go as planned :P
Next up is figuring this out. Any guidance on how to solve this?
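One thing I notice is that in (Player) in the statement above, Player is a variable name rather than a label, so MERGE has nothing that ties it to the existing :Player nodes. A possible direction, as a sketch only: look up both players by name first, then create the relationship between them. The names killerName, victimName, killer and victim are hypothetical, and the number conversions and the use of CREATE instead of MERGE are my own assumptions.

```cypher
LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_smalll_batch.csv' AS row
WITH row,
     coalesce(row.killer_name, 'System-' + row.killed_by) AS killerName,
     coalesce(row.victim_name, 'System-' + row.killed_by) AS victimName
// Find the Player nodes that were created during the node import
MATCH (killer:Player {name: killerName})
MATCH (victim:Player {name: victimName})
// CREATE adds one relationship per CSV row; rerunning the import duplicates them
CREATE (killer)-[:Killed {
  `Killed With`: row.killed_by,
  KillerX: coalesce(toFloat(row.killer_position_x), 0.0),
  KillerY: coalesce(toFloat(row.killer_position_y), 0.0),
  Map: row.map,
  MatchID: row.match_id,
  `Time of Death`: toInteger(row.time),
  VictimX: coalesce(toFloat(row.victim_position_x), 0.0),
  VictimY: coalesce(toFloat(row.victim_position_y), 0.0)
}]->(victim);
```

The name fallback has to match the one used when creating the nodes, otherwise the MATCH finds nothing and the row is silently skipped.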