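I am trying to do matrix multiplication on a large dataset in PySpark. After the multiplication, the result I get is a RowMatrix of dense vectors, as shown below: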
[
DenseVector([-0.0075, -0.0021, 0.0021, -0.0082, -0.004]),
DenseVector([-0.0035, 2.4358, -0.0005, -0.0032, -0.005]),
DenseVector([-0.0019, -0.0623, -0.0093, -0.0101, -0.002]),
DenseVector([-0.0075, -0.0021, 0.0021, -0.0082, -0.004]),
DenseVector([-0.0035, 2.4358, -0.0005, -0.0032, -0.005]),
DenseVector([-0.0019, -0.0623, -0.0093, -0.0101, -0.002])
]
I also have a set of row and column labels.
For the column labels I have ["c1", "c2", "c3", "c4", "c5"], and for the row index I have ["r1", "r2", "r3", "r4", "r5", "r6"].
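Now I want to convert this RowMatrix into a PySpark DataFrame that carries these labels, with r1 to r6 identifying the rows and c1 to c5 as the column names.
So far I have not found a way to do this, and I would appreciate any suggestions.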
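To make the goal concrete, here is a minimal sketch of one direction that might work. It assumes that the order of `RowMatrix.rows` still matches the order of the row labels (a RowMatrix has no meaningful row indices, so this needs checking for the actual pipeline), and that the label list is small enough to ship to the executors. The helper names `label_row` and `row_matrix_to_dataframe`, and the label column name `"row"`, are made up for illustration.

```python
def label_row(vector_values, index, row_labels):
    """Turn one (vector values, position) pair into a flat tuple led by its row label."""
    # Convert to plain Python floats so Spark can infer a DoubleType column.
    return (row_labels[index], *[float(x) for x in vector_values])


def row_matrix_to_dataframe(spark, row_matrix, row_labels, col_labels):
    """Sketch: build a labelled DataFrame from a RowMatrix without collecting it.

    Assumes the positional order of row_matrix.rows matches row_labels.
    """
    labels = list(row_labels)
    labelled = row_matrix.rows.zipWithIndex().map(
        # pair is (vector, position); toArray() yields the vector's values
        lambda pair: label_row(pair[0].toArray(), pair[1], labels)
    )
    # "row" is a hypothetical name for the label column
    return spark.createDataFrame(labelled, ["row"] + list(col_labels))
```

With an active `SparkSession` named `spark` and the multiplication result in a variable such as `result`, this would be called as `row_matrix_to_dataframe(spark, result, ["r1", "r2", "r3", "r4", "r5", "r6"], ["c1", "c2", "c3", "c4", "c5"])`. If the row order cannot be trusted, carrying the labels through the multiplication in an `IndexedRowMatrix` may be the safer route.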