I'm trying to use Databricks Connect to run some code from my Databricks notebook inside my IDE. I can't seem to figure out how to create a simple dataframe.
Using:
import spark.implicits._
var Table_Count = Seq((cdpos_df.count(),I_count,D_count,U_count)).toDF("Table_Count","I_Count","D_Count","U_Count")
gives the error message value toDF is not a member of Seq[(Long, Long, Long, Long)].
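For completeness, here is a minimal, self-contained version of what I'm attempting, in case the problem is in how I obtain the session or import the implicits. The literal counts are placeholders standing in for cdpos_df.count(), I_count, D_count and U_count, and I'm assuming SparkSession.builder().getOrCreate() is the right way to get a session bound to the remote cluster through Databricks Connect:

import org.apache.spark.sql.SparkSession

// Databricks Connect should hand back a session pointing at the remote cluster
val spark: SparkSession = SparkSession.builder().getOrCreate()

// toDF comes from the implicits of a stable SparkSession value (a val, not a var)
import spark.implicits._

// placeholder counts instead of cdpos_df.count(), I_count, D_count, U_count
val tableCount = 100L
val iCount = 10L
val dCount = 5L
val uCount = 1L

val Table_Count = Seq((tableCount, iCount, dCount, uCount))
  .toDF("Table_Count", "I_Count", "D_Count", "U_Count")

Table_Count.show()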
Trying to create the dataframe from scratch:
var dataRow = Seq((cdpos_df.count(), I_count, D_count, U_count))

var schemaRow = List(
  StructField("Table_Count", LongType, true),
  StructField("I_Count", LongType, true),
  StructField("D_Count", LongType, true),
  StructField("U_Count", LongType, true)
)

var TableCount = spark.createDataFrame(
  sc.parallelize(dataRow),
  StructType(schemaRow)
)
gives the error message
overloaded method value createDataFrame with alternatives:
(data: java.util.List[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
(rdd: org.apache.spark.api.java.JavaRDD[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
(rdd: org.apache.spark.rdd.RDD[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
(rows: java.util.List[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame <and>
(rowRDD: org.apache.spark.api.java.JavaRDD[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame <and>
(rowRDD: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame
cannot be applied to (org.apache.spark.rdd.RDD[(Long, Long, Long, Long)], org.apache.spark.sql.types.StructType)
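The overload that looks closest to what I want is createDataFrame(rowRDD: RDD[Row], schema: StructType), so I assume the tuples have to be converted to Row first. Here is a self-contained sketch of that variant, again with placeholder counts, and assuming sc.parallelize is actually supported by my Databricks Connect setup:

import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{LongType, StructField, StructType}

val spark = SparkSession.builder().getOrCreate()
val sc = spark.sparkContext

// placeholder values standing in for the real counts
val dataRow = Seq((100L, 10L, 5L, 1L))

val schemaRow = StructType(List(
  StructField("Table_Count", LongType, true),
  StructField("I_Count", LongType, true),
  StructField("D_Count", LongType, true),
  StructField("U_Count", LongType, true)
))

// createDataFrame(rdd, schema) expects RDD[Row], so map each tuple to a Row
val rowRdd = sc.parallelize(dataRow).map { case (t, i, d, u) => Row(t, i, d, u) }

val TableCount = spark.createDataFrame(rowRdd, schemaRow)
TableCount.show()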