I'm new to Scala/Spark, so please go easy on me :)
I'm trying to run a jar on an AWS EMR cluster; I built the jar with sbt package. When I run the code locally it works perfectly fine, but when I run it on the AWS EMR cluster I get this error:
ERROR Client: Application diagnostics message: User class threw exception: java.lang.NoClassDefFoundError: upickle/core/Types$Writer
As far as I understand, this error comes from a dependency mismatch between Scala/Spark versions.
I'm using Scala 2.12 and Spark 3.0.1, and on AWS I'm using emr-6.2.0.
Here is my build.sbt:
scalaVersion := "2.12.14"
libraryDependencies += "com.amazonaws" % "aws-java-sdk" % "1.11.792"
libraryDependencies += "com.amazonaws" % "aws-java-sdk-core" % "1.11.792"
libraryDependencies += "org.apache.hadoop" % "hadoop-aws" % "3.3.0"
libraryDependencies += "org.apache.hadoop" % "hadoop-common" % "3.3.0"
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "3.3.0"
libraryDependencies += "org.apache.spark" %% "spark-core" % "3.0.1"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.0.1"
libraryDependencies += "com.lihaoyi" %% "upickle" % "1.4.1"
libraryDependencies += "com.lihaoyi" %% "ujson" % "1.4.1"
What am I missing?
Thanks!
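
One thing worth checking for an error like this: sbt package only puts the project's own classes into the jar, not its library dependencies, so upickle would be missing from the cluster's classpath even though it resolves fine locally. Below is a minimal sketch of a build that bundles upickle into a fat jar instead. It assumes the sbt-assembly plugin has been added in project/plugins.sbt (the plugin and its version are assumptions, not part of the build above), and it shows only the lines that would change; the remaining dependencies would stay as they are.

```scala
// project/plugins.sbt (assumed, not in the original build):
// addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "1.2.0")

scalaVersion := "2.12.14"

libraryDependencies ++= Seq(
  // Spark is already installed on the EMR cluster, so keep it out of the fat jar
  "org.apache.spark" %% "spark-core" % "3.0.1" % "provided",
  "org.apache.spark" %% "spark-sql"  % "3.0.1" % "provided",
  // Not present on the cluster, so it has to be bundled
  "com.lihaoyi" %% "upickle" % "1.4.1"
)

// Simplistic conflict handling for duplicate files across dependency jars;
// a real build may need finer-grained rules (for example for service files)
assembly / assemblyMergeStrategy := {
  case PathList("META-INF", _*) => MergeStrategy.discard
  case _                        => MergeStrategy.first
}
```

With something like this in place, sbt assembly (rather than sbt package) would produce the jar to submit to EMR. An alternative that leaves the build untouched might be passing --packages com.lihaoyi:upickle_2.12:1.4.1 to spark-submit so the library is fetched at launch.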