I have parsed my data and produced the following RDD:
x [RDD] = (458817,(CompactBuffer(20),CompactBuffer((837063182,0,1433142639864), (676690466,0,1433175090184), (4642913327036075112,1,1433177284025), (464291332,1,1433182403135), (4642913327036075112,0,1433185531150),
(464291332,0,1433186067803), (4642913327036075112,1,1433186266561), (851805971,0,1433190829047),
(6376558263039679112,1,1433203286945), (837063182,0,1433226615856), (8403476884799939112,0,1433287740066),
(764990231,0,1433289484047), (4642913327036075112,0,1433351165901), (464291332,1,1433351892238),
(4642913327036075112,0,1433374808826), (584492430,1,1433436093253))))
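Here I am showing only a single record of the RDD. My goal is to obtain the following RDD, in which the first element (458817) is prepended to every tuple: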
(458817,837063182,0,1433142639864)
(458817,676690466,0,1433175090184)
(458817,464291332,1,1433177284025)
(458817,464291332,1,1433182403135)
(458817,464291332,0,1433185531150)
(458817,464291332,0,1433186067803)
(458817,464291332,1,1433186266561)
(458817,851805971,0,1433190829047)
(458817,637655826,1,1433203286945)
(458817,837063182,0,1433226615856)
When I do a flatMap, I lose the first element and can no longer access it:
val r = x.map(l => l._2).flatMap(x => x._2).map(x => (x._1, x._2, x._3, x._4))
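One possible way around this, as a minimal sketch: the key is lost because `x.map(l => l._2)` discards it before the flattening happens. If the flatMap is applied to the whole record instead, the key is still in scope and can be prepended to each inner tuple. The snippet below uses plain Scala `Seq`s as a stand-in for the RDD and for `CompactBuffer` so that it runs without Spark. The element types (`Int` key, `(Long, Int, Long)` tuples) and the names `key`, `events`, `id`, `flag` and `ts` are assumptions made for illustration, not taken from the original data definition.

```scala
// Hypothetical stand-in for one record of x: (key, (firstBuffer, events)).
// Plain Seq replaces Spark's RDD and CompactBuffer so this runs locally.
val x: Seq[(Int, (Seq[Int], Seq[(Long, Int, Long)]))] = Seq(
  (458817, (Seq(20), Seq(
    (837063182L, 0, 1433142639864L),
    (676690466L, 0, 1433175090184L)
  )))
)

// flatMap over the whole record, so the key is still available,
// then prepend it to every inner tuple.
val r = x.flatMap { case (key, (_, events)) =>
  events.map { case (id, flag, ts) => (key, id, flag, ts) }
}

r.foreach(println)
```

The same pattern-matching function should work unchanged with `RDD.flatMap`, since it only needs to return a collection per record. On a pair RDD, `x.flatMapValues(_._2)` followed by a `map` that flattens `(key, (id, flag, ts))` into a 4-tuple would be another option.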