Find centralized, trusted content and collaborate around the technologies you use most.
Teams
Q&A for work
Connect and share knowledge within a single location that is structured and easy to search.
假设我有一个数据集
0,11,2,3,4,5,56,7 0,1,21,13,45,5,61,75 01,1,2,3,54,55,6,75
我要做的是将值平面映射到作为列索引的键和作为值的值。谁能给我指导?我发现很难获得列索引。
假设您的 RDD 是序列类型,您可以这样做:
rdd.flatMap(line => line.zipWithIndex.map(tuple => tuple.swap))
创建键值对,其中键是列表索引,值是该索引处的值,可能如下所示:
rdd.flatMap(lambda x: enumerate(x))
这当然是假设您的数据已经是 RDD。