
I'm broadening my horizons and am currently playing around with Neo4j. After taking a few courses on Udemy, I figured I knew enough to load a custom dataset :)

I want to load a PlayerUnknown's Battlegrounds dataset. Source: https://www.kaggle.com/skihikingkevin/pubg-match-deaths Dataset: kill_match_stats_final_0.csv

Making the data readable:

LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_final_0.csv' AS row
WITH row.killed_by AS MurderWeapon, row.killer_name AS Murderer, toInteger(row.killer_placement) AS RankMurderer, 
     row.killer_position_x AS MurderPositionX, row.killer_position_y AS MurderPositionY, 
     row.map AS Map, row.match_id AS MatchID, toInteger(row.time) AS TimeOfDeathSec, 
     row.victim_name AS Victim, toInteger(row.victim_placement) AS RankVictim,
     row.victim_position_x AS VictimPositionX, row.victim_position_y AS VictimPositionY
RETURN MurderWeapon, Murderer, RankMurderer, MurderPositionX, MurderPositionY, Map, MatchID, TimeOfDeathSec, Victim, RankVictim, VictimPositionX, VictimPositionY
LIMIT 5;

My idea is to create 2 nodes, Murderer and Victim, both with the Player label. The edge between them will be Killed. (Image: node-edge schematic)

When I try to load the dataset I get this error: Cannot merge the following node because of null property value for 'name': (:Player {name: null})

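At first I thought the in-place integer conversion was the problem, so I removed those, but that did not solve it. This is the statement I am trying to run: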

LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_final_0.csv' AS row
WITH row
MERGE (Murderer:Player{name:row.killer_name, rank:row.killer_placement})
MERGE (Victim:Player{name:row.victim_name, rank:row.victim_placement})
MERGE (Murderer)-[killed:Killed{
                                        `Killed With`:row.killed_by,    
                                        `KillerX`:row.killer_position_x, 
                                        `KillerY`:row.killer_position_y, 
                                        `Map`:row.map, 
                                        `MatchID`:row.match_id, 
                                        `Time of Death`:row.time, 
                                        `VictimX`:row.victim_position_x, 
                                        `VictimY`:row.victim_position_y
}]->(Victim)
;

I have the feeling it's staring me in the face, but I can't see it :P

Question: What is wrong with my statement for loading the CSV file?

Instead of downloading the large file, you can download a short version here: https://storage.stijvehark.nl/s/OmdSL2oljVIyG2hx

Update 1

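After @Graphileon's comment I have a new insight into the data. I had assumed (yes, I know....) that all columns contained data. I used his script and it ran fine. So I tried this: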

LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_smalll_batch.csv' AS row
WITH row
RETURN row

That looked fine as well. Checking the results for the dataset, I found:

{
  "killer_name": null,
  "victim_position_y": "0.0",
  "victim_position_x": "0.0",
  "killer_position_x": null,
  "victim_placement": "26.0",
  "killer_position_y": null,
  "match_id": "2U4GBNA0YmnLSqvEycnTjo-KT000vfUnhSA2vfVhVPe1QBwCTNTBJ5B_1Ocel6nY",
  "victim_name": "xuezhiqian717",
  "killed_by": "Bluezone",
  "killer_placement": null,
  "time": "879",
  "map": "MIRAMAR"
}

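I had already been wondering how the data would look when, for example, you kill yourself, fall to your death, or blow yourself up with a grenade. I will look into that later. As for your advice: I like your suggestion about the players, and I will try to use it.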

Update 2

It gave me some headaches, but I managed to import all players with:

// Add constraint
CREATE CONSTRAINT ON (p:Player) ASSERT p.name IS UNIQUE

// Create nodes:
LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_smalll_batch.csv' AS row
WITH row
MERGE (:Player{name:
                    CASE WHEN row.killer_name IS NOT NULL 
                        THEN row.killer_name
                        ELSE 'System-' + row.killed_by END
            })
MERGE (:Player{name:
                CASE WHEN row.victim_name IS NOT NULL 
                    THEN row.victim_name
                    ELSE 'System-' + row.killed_by END
                })

This imported all players. For players who were killed by the blue zone or who fell to their death, I added a 'System-' user.

Now for creating the edges:

 // Create edges:
LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_smalll_batch.csv' AS row
WITH row
MERGE (Player)-[killed:Killed{
                                        `Killed With`:row.killed_by,    
                                        `KillerX`:
                                            CASE WHEN row.killer_position_x IS NOT NULL 
                                            THEN row.killer_position_x
                                            ELSE '0' END, 
                                        `KillerY`:
                                            CASE WHEN row.killer_position_y IS NOT NULL 
                                            THEN row.killer_position_y
                                            ELSE '0' END, 
                                        `Map`:row.map, 
                                        `MatchID`:row.match_id, 
                                        `Time of Death`:row.time, 
                                        `VictimX`:
                                            CASE WHEN row.victim_position_x IS NOT NULL 
                                            THEN row.victim_position_x
                                            ELSE '0' END,
                                        `VictimY`:
                                            CASE WHEN row.victim_position_y IS NOT NULL 
                                            THEN row.victim_position_y
                                            ELSE '0' END
}]->(Player)

This did not go as planned :P


Next up is figuring this out. Any guidance on how to fix it?


1 Answer


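MERGE fails when it hits a null property value, so the error means you have rows with null values in the killer_name and/or victim_name fields. One way to find those rows: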

LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_final_0.csv' AS row
WITH row
WHERE TRIM(COALESCE(row.killer_name,'')) = '' 
      OR 
      TRIM(COALESCE(row.victim_name,'')) = ''
RETURN row
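
If the rows without a killer are not needed, one option is to filter them out before MERGE ever sees them. A minimal sketch, assuming that dropping those rows is acceptable for your analysis, that match_id is always present, and that rank is left out of the MERGE key (my assumption, since a player's placement differs per match):

```cypher
// Sketch: skip rows where either name is null or blank, then merge players and kills.
LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_final_0.csv' AS row
WITH row
WHERE trim(coalesce(row.killer_name, '')) <> ''
  AND trim(coalesce(row.victim_name, '')) <> ''
MERGE (murderer:Player {name: row.killer_name})
MERGE (victim:Player {name: row.victim_name})
// Assumes match_id is never null; a null here would raise the same MERGE error.
MERGE (murderer)-[:Killed {MatchID: row.match_id}]->(victim);
```

This loses the environment deaths (Bluezone and similar), so it only fits if you care about player-versus-player kills.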

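Also, regarding the model you chose: if a player who kills someone can later become a victim, I would consider using only :Player nodes and setting a unique CONSTRAINT on Player.name. Who the murderer is and who the victim is can be derived from the direction of the relationship. If you label the nodes :Murderer and :Victim, you are forced to create two nodes whenever a murderer becomes a victim at some point, and you end up with two players with the same name.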
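
A rough sketch of that single-label model, not a definitive import script. It assumes the 'System-' placeholder idea from Update 2 of the question for rows without a killer, assumes match_id is always present, and uses the Neo4j 4.x constraint syntax from the question (newer versions use `CREATE CONSTRAINT FOR (p:Player) REQUIRE p.name IS UNIQUE`). The `killerName` alias is a name introduced here for illustration:

```cypher
// Uniqueness constraint on Player.name (4.x syntax, as used in the question).
CREATE CONSTRAINT ON (p:Player) ASSERT p.name IS UNIQUE;

// One :Player label for everyone; the relationship direction says who killed whom.
LOAD CSV WITH HEADERS FROM 'file:///kill_match_stats_final_0.csv' AS row
WITH row,
     // Assumption: rows without a killer (e.g. Bluezone) get a 'System-' placeholder.
     coalesce(row.killer_name, 'System-' + row.killed_by) AS killerName
WHERE row.victim_name IS NOT NULL
  AND killerName IS NOT NULL
MERGE (killer:Player {name: killerName})
MERGE (victim:Player {name: row.victim_name})
// Merge on the match only; remaining properties are SET, where null is harmless.
MERGE (killer)-[k:Killed {MatchID: row.match_id}]->(victim)
SET k.`Killed With`   = row.killed_by,
    k.`Time of Death` = toInteger(row.time),
    k.Map             = row.map;
```

Note that both ends of the relationship are bound to named variables (`killer`, `victim`) that were merged with the `:Player` label. In the edge statement from Update 2, `(Player)` without a colon is a variable name rather than a label, so that MERGE does not look up the existing players and instead creates a new unlabeled node with a relationship to itself.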

Answered 2021-12-18T19:43:03.563