Suppose I have an RDD of Array[Double] with n columns. I want to apply a filter on the last column (for example, value > some constant).
Something like this:
val rdd: RDD[Array[Double]] = ...
val filtered: RDD[Array[Double]] = rdd.filter(arr => arr.last > some_value) // `last` takes no parentheses in Scala
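I don't think the choice between arrays and vectors matters much here. Spark's overall overhead is far larger than any performance or memory advantage arrays have over vectors.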
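For reference, here is a minimal sketch of the last-column filter as a complete program. It assumes Spark is on the classpath and runs in local mode; the object name `FilterLastColumn`, the helper `lastAbove`, the threshold, and the sample rows are all made up for illustration and are not from the question. The `nonEmpty` guard is an addition, since `last` throws on an empty array.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object FilterLastColumn {
  // Hypothetical helper: true when the row's last value exceeds the threshold.
  // Empty rows are rejected, because `last` throws on an empty array.
  def lastAbove(threshold: Double)(row: Array[Double]): Boolean =
    row.nonEmpty && row.last > threshold

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; master and app name are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("filter-last-column")
      .getOrCreate()
    val sc = spark.sparkContext

    val threshold = 2.0 // stands in for `some_value` from the question

    // Made-up sample data: three rows with three columns each.
    val rdd: RDD[Array[Double]] = sc.parallelize(Seq(
      Array(1.0, 2.0, 3.0),
      Array(4.0, 5.0, 1.5),
      Array(7.0, 8.0, 2.5)
    ))

    // Keep only the rows whose last column is greater than the threshold.
    val filtered: RDD[Array[Double]] = rdd.filter(row => lastAbove(threshold)(row))

    filtered.collect().foreach(row => println(row.mkString(", ")))
    spark.stop()
  }
}
```

If the rows were stored as an indexed vector type instead, the same predicate would read the element at index `size - 1`; the filter itself would not change.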