I am using Spark 2.3.2 and GraphFrames 0.7.0.
I have two DataFrames, node2attrDf and edge2attrDf. The code that generates them is here: https://gist.github.com/superPershing/56928c4f5420ea6334d7a9f6e389bda5

Their schemas and a few sample rows look like this:

scala> node2attrDf.printSchema
root
 |-- id: integer (nullable = true)
 |-- combined: array (nullable = true)
 |    |-- element: array (containsNull = true)
 |    |    |-- element: integer (containsNull = false)

scala> edge2attrDf.printSchema
root
 |-- src: integer (nullable = true)
 |-- dst: integer (nullable = true)
 |-- info: struct (nullable = false)
 |    |-- dstNeighbors: array (nullable = true)
 |    |    |-- element: long (containsNull = false)
 |    |-- J: array (nullable = true)
 |    |    |-- element: integer (containsNull = false)
 |    |-- q: array (nullable = true)
 |    |    |-- element: double (containsNull = false)

scala> node2attrDf.show(5)
+---+--------------------+
| id|            combined|
+---+--------------------+
|148|[[405, 3], [121, ...|
|463|[[131, 2], [213, ...|
|471|[[117, 7], [7, 6]...|
|496|[[134, 7], [127, ...|
|833|[[597, 4], [566, ...|
+---+--------------------+
only showing top 5 rows

scala> edge2attrDf.show(5)
+---+---+------------+
|src|dst|        info|
+---+---+------------+
|780|725|[[], [], []]|
|266|351|[[], [], []]|
|285|132|[[], [], []]|
|328|748|[[], [], []]|
|275|487|[[], [], []]|
+---+---+------------+
only showing top 5 rows

When I create a new GraphFrame from the two DataFrames:

val gDF = GraphFrame(node2attrDf, edge2attrDf)

I get the following error:

scala> val gDF = GraphFrame(node2attrDf, edge2attrDf)
<console>:31: error: type mismatch;
 found   : org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.DataFrame
    (which expands to)  org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]
 required: org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.DataFrame
    (which expands to)  org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]
       val gDF = GraphFrame(node2attrDf, edge2attrDf)
                            ^
<console>:31: error: type mismatch;
 found   : org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.DataFrame
    (which expands to)  org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]
 required: org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.DataFrame
    (which expands to)  org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]
       val gDF = GraphFrame(node2attrDf, edge2attrDf)
                                         ^

The found type and the required type appear to be identical. So why does this error occur, and how can I fix it?
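
One way to narrow this down, assuming the mismatch is tied to the spark-shell session rather than to the data, would be to build a GraphFrame from two tiny DataFrames in a compiled application. The sketch below is only illustrative: the object name GraphFrameSmokeTest, the sample rows, and the local master setting are placeholders that do not come from the gist. If it compiles and runs, the DataFrame type itself is probably fine and the shell session (for example, its accumulated imports) is the more likely culprit.

```scala
import org.apache.spark.sql.SparkSession
import org.graphframes.GraphFrame

// Hypothetical standalone check; names and data are placeholders.
object GraphFrameSmokeTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("graphframe-smoke-test")
      .master("local[*]") // assumption: run locally for the check
      .getOrCreate()
    import spark.implicits._

    // GraphFrames expects an "id" column on the vertices...
    val vertices = Seq((1, "a"), (2, "b")).toDF("id", "label")
    // ...and "src" and "dst" columns on the edges.
    val edges = Seq((1, 2, "x")).toDF("src", "dst", "info")

    // Same constructor call as in the failing shell session.
    val g = GraphFrame(vertices, edges)
    g.vertices.show()
    g.edges.show()

    spark.stop()
  }
}
```

If this works, retrying the original call in a freshly started spark-shell with only the imports that are strictly needed would be the next thing to check.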
