I have a PySpark function that I need to convert to Scala.
PySpark:
from pyspark.sql import functions as F

# Replace every struct column whose name starts with "_" by its "id" field.
for i in [c for c in r.columns if c.startswith("_")]:
    r = r.withColumn(i, F.col(i)["id"])
Since vals in Scala are immutable, is there a better way in Scala to create multiple new columns without writing val df1 = df.withColumn(...), val df2 = df1.withColumn(...), and so on, the way I keep reassigning r in the PySpark loop above?
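For concreteness, this is the repetitive pattern I would like to avoid (a minimal sketch assuming r is a DataFrame with the struct columns shown below; df1, df2, ... are just illustrative names):

import org.apache.spark.sql.functions.col

// One new val binding per rewritten column -- this is what I want to avoid.
val df1 = r.withColumn("_0", col("_0").getField("id"))
val df2 = df1.withColumn("_1", col("_1").getField("id"))
val df3 = df2.withColumn("_2", col("_2").getField("id"))
// ... and so on for every column that starts with "_".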
The DataFrame r looks like this:
+-----------+-------------+-------------+-------------+-------------+
| _0| _1| _2| _3| _4|
+-----------+-------------+-------------+-------------+-------------+
|[1, Carter]| [5, Banks]|[11, Derrick]| [4, Hood]| [12, Jef]|
|[1, Carter]| [12, Jef]| [4, Hood]| [5, Banks]|[11, Derrick]|
|[1, Carter]| [4, Hood]| [12, Jef]|[11, Derrick]| [5, Banks]|
|[1, Carter]| [12, Jef]| [5, Banks]|[11, Derrick]| [4, Hood]|
|[1, Carter]| [4, Hood]| [12, Jef]| [5, Banks]|[11, Derrick]|
|[1, Carter]|[11, Derrick]| [12, Jef]| [4, Hood]| [5, Banks]|
|[1, Carter]| [12, Jef]|[11, Derrick]| [5, Banks]| [4, Hood]|
|[1, Carter]| [5, Banks]| [4, Hood]|[11, Derrick]| [12, Jef]|
|[1, Carter]|[11, Derrick]| [5, Banks]| [4, Hood]| [12, Jef]|
|[1, Carter]| [5, Banks]|[11, Derrick]| [12, Jef]| [4, Hood]|
|[1, Carter]| [5, Banks]| [12, Jef]|[11, Derrick]| [4, Hood]|
|[1, Carter]| [5, Banks]| [12, Jef]| [4, Hood]|[11, Derrick]|
|[1, Carter]|[11, Derrick]| [5, Banks]| [12, Jef]| [4, Hood]|
|[1, Carter]| [4, Hood]|[11, Derrick]| [5, Banks]| [12, Jef]|
|[1, Carter]|[11, Derrick]| [4, Hood]| [5, Banks]| [12, Jef]|
|[1, Carter]| [12, Jef]| [5, Banks]| [4, Hood]|[11, Derrick]|
|[1, Carter]| [12, Jef]|[11, Derrick]| [4, Hood]| [5, Banks]|
|[1, Carter]| [4, Hood]|[11, Derrick]| [12, Jef]| [5, Banks]|
|[1, Carter]|[11, Derrick]| [4, Hood]| [12, Jef]| [5, Banks]|
|[1, Carter]| [12, Jef]| [4, Hood]|[11, Derrick]| [5, Banks]|
+-----------+-------------+-------------+-------------+-------------+