I'm using sbt-assembly.

I'm also using Spark, which appears to be the source of the problem.

val Spark =  Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "org.apache.spark" %% "spark-streaming" % sparkVersion
)

The error:

[error] 12 errors were encountered during merge
[trace] Stack trace suppressed: run last coreBackend/*:assembly for the full output.
[trace] Stack trace suppressed: run last core/*:assembly for the full output.
[trace] Stack trace suppressed: run last commons/*:assembly for the full output.
[error] (coreBackend/*:assembly) deduplicate: different file contents found in the following:
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.osgi/org.osgi.core/org.osgi.core-4.3.1.jar:OSGI-OPT/bnd.bnd
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.osgi/org.osgi.compendium/org.osgi.compendium-4.3.1.jar:OSGI-OPT/bnd.bnd
[error] deduplicate: different file contents found in the following:
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/bundles/com.google.guava/guava/guava-18.0.jar:com/google/common/base/Absent.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.spark/spark-network-common_2.11/spark-network-common_2.11-1.5.1.jar:com/google/common/base/Absent.class
[error] deduplicate: different file contents found in the following:
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/bundles/com.google.guava/guava/guava-18.0.jar:com/google/common/base/Function.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.spark/spark-network-common_2.11/spark-network-common_2.11-1.5.1.jar:com/google/common/base/Function.class
[error] deduplicate: different file contents found in the following:
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/bundles/com.google.guava/guava/guava-18.0.jar:com/google/common/base/Optional$1$1.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.spark/spark-network-common_2.11/spark-network-common_2.11-1.5.1.jar:com/google/common/base/Optional$1$1.class
[error] deduplicate: different file contents found in the following:
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/bundles/com.google.guava/guava/guava-18.0.jar:com/google/common/base/Optional$1.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.spark/spark-network-common_2.11/spark-network-common_2.11-1.5.1.jar:com/google/common/base/Optional$1.class
[error] deduplicate: different file contents found in the following:
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/bundles/com.google.guava/guava/guava-18.0.jar:com/google/common/base/Optional.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.spark/spark-network-common_2.11/spark-network-common_2.11-1.5.1.jar:com/google/common/base/Optional.class
[error] deduplicate: different file contents found in the following:
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/bundles/com.google.guava/guava/guava-18.0.jar:com/google/common/base/Present.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.spark/spark-network-common_2.11/spark-network-common_2.11-1.5.1.jar:com/google/common/base/Present.class
[error] deduplicate: different file contents found in the following:
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/bundles/com.google.guava/guava/guava-18.0.jar:com/google/common/base/Supplier.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.spark/spark-network-common_2.11/spark-network-common_2.11-1.5.1.jar:com/google/common/base/Supplier.class
[error] deduplicate: different file contents found in the following:
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.hadoop/hadoop-yarn-common/hadoop-yarn-common-2.2.0.jar:org/apache/hadoop/yarn/factories/package-info.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.hadoop/hadoop-yarn-api/hadoop-yarn-api-2.2.0.jar:org/apache/hadoop/yarn/factories/package-info.class
[error] deduplicate: different file contents found in the following:
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.hadoop/hadoop-yarn-common/hadoop-yarn-common-2.2.0.jar:org/apache/hadoop/yarn/factory/providers/package-info.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.hadoop/hadoop-yarn-api/hadoop-yarn-api-2.2.0.jar:org/apache/hadoop/yarn/factory/providers/package-info.class
[error] deduplicate: different file contents found in the following:
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.hadoop/hadoop-yarn-common/hadoop-yarn-common-2.2.0.jar:org/apache/hadoop/yarn/util/package-info.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.hadoop/hadoop-yarn-api/hadoop-yarn-api-2.2.0.jar:org/apache/hadoop/yarn/util/package-info.class
[error] deduplicate: different file contents found in the following:
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.spark/spark-core_2.11/spark-core_2.11-1.5.1.jar:org/apache/spark/unused/UnusedStubClass.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.spark/spark-launcher_2.11/spark-launcher_2.11-1.5.1.jar:org/apache/spark/unused/UnusedStubClass.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.spark-project.spark/unused/unused-1.0.0.jar:org/apache/spark/unused/UnusedStubClass.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.spark/spark-network-common_2.11/spark-network-common_2.11-1.5.1.jar:org/apache/spark/unused/UnusedStubClass.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.spark/spark-network-shuffle_2.11/spark-network-shuffle_2.11-1.5.1.jar:org/apache/spark/unused/UnusedStubClass.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.spark/spark-unsafe_2.11/spark-unsafe_2.11-1.5.1.jar:org/apache/spark/unused/UnusedStubClass.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.spark/spark-sql_2.11/spark-sql_2.11-1.5.1.jar:org/apache/spark/unused/UnusedStubClass.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.spark/spark-catalyst_2.11/spark-catalyst_2.11-1.5.1.jar:org/apache/spark/unused/UnusedStubClass.class
[error] /Volumes/COYOTE/Developer/tibra/lib_managed/jars/org.apache.spark/spark-streaming_2.11/spark-streaming_2.11-1.5.1.jar:org/apache/spark/unused/UnusedStubClass.class
[error] (core/*:assembly) deduplicate: different file contents found in the following:
[error] /Users/bryan/.ivy2/cache/org.osgi/org.osgi.core/jars/org.osgi.core-4.3.1.jar:OSGI-OPT/bnd.bnd
[error] /Users/bryan/.ivy2/cache/org.osgi/org.osgi.compendium/jars/org.osgi.compendium-4.3.1.jar:OSGI-OPT/bnd.bnd
[error] deduplicate: different file contents found in the following:
[error] /Users/bryan/.ivy2/cache/com.google.guava/guava/bundles/guava-18.0.jar:com/google/common/base/Absent.class
[error] /Users/bryan/.ivy2/cache/org.apache.spark/spark-network-common_2.11/jars/spark-network-common_2.11-1.5.1.jar:com/google/common/base/Absent.class
[error] deduplicate: different file contents found in the following:
[error] /Users/bryan/.ivy2/cache/com.google.guava/guava/bundles/guava-18.0.jar:com/google/common/base/Function.class
[error] /Users/bryan/.ivy2/cache/org.apache.spark/spark-network-common_2.11/jars/spark-network-common_2.11-1.5.1.jar:com/google/common/base/Function.class
[error] deduplicate: different file contents found in the following:
[error] /Users/bryan/.ivy2/cache/com.google.guava/guava/bundles/guava-18.0.jar:com/google/common/base/Optional$1$1.class
[error] /Users/bryan/.ivy2/cache/org.apache.spark/spark-network-common_2.11/jars/spark-network-common_2.11-1.5.1.jar:com/google/common/base/Optional$1$1.class
[error] deduplicate: different file contents found in the following:
[error] /Users/bryan/.ivy2/cache/com.google.guava/guava/bundles/guava-18.0.jar:com/google/common/base/Optional$1.class
[error] /Users/bryan/.ivy2/cache/org.apache.spark/spark-network-common_2.11/jars/spark-network-common_2.11-1.5.1.jar:com/google/common/base/Optional$1.class
[error] deduplicate: different file contents found in the following:
[error] /Users/bryan/.ivy2/cache/com.google.guava/guava/bundles/guava-18.0.jar:com/google/common/base/Optional.class
[error] /Users/bryan/.ivy2/cache/org.apache.spark/spark-network-common_2.11/jars/spark-network-common_2.11-1.5.1.jar:com/google/common/base/Optional.class
[error] deduplicate: different file contents found in the following:
[error] /Users/bryan/.ivy2/cache/com.google.guava/guava/bundles/guava-18.0.jar:com/google/common/base/Present.class
[error] /Users/bryan/.ivy2/cache/org.apache.spark/spark-network-common_2.11/jars/spark-network-common_2.11-1.5.1.jar:com/google/common/base/Present.class
[error] deduplicate: different file contents found in the following:
[error] /Users/bryan/.ivy2/cache/com.google.guava/guava/bundles/guava-18.0.jar:com/google/common/base/Supplier.class
[error] /Users/bryan/.ivy2/cache/org.apache.spark/spark-network-common_2.11/jars/spark-network-common_2.11-1.5.1.jar:com/google/common/base/Supplier.class
[error] deduplicate: different file contents found in the following:
[error] /Users/bryan/.ivy2/cache/org.apache.hadoop/hadoop-yarn-common/jars/hadoop-yarn-common-2.2.0.jar:org/apache/hadoop/yarn/factories/package-info.class
[error] /Users/bryan/.ivy2/cache/org.apache.hadoop/hadoop-yarn-api/jars/hadoop-yarn-api-2.2.0.jar:org/apache/hadoop/yarn/factories/package-info.class
[error] deduplicate: different file contents found in the following:
[error] /Users/bryan/.ivy2/cache/org.apache.hadoop/hadoop-yarn-common/jars/hadoop-yarn-common-2.2.0.jar:org/apache/hadoop/yarn/factory/providers/package-info.class
[error] /Users/bryan/.ivy2/cache/org.apache.hadoop/hadoop-yarn-api/jars/hadoop-yarn-api-2.2.0.jar:org/apache/hadoop/yarn/factory/providers/package-info.class
[error] deduplicate: different file contents found in the following:
[error] /Users/bryan/.ivy2/cache/org.apache.hadoop/hadoop-yarn-common/jars/hadoop-yarn-common-2.2.0.jar:org/apache/hadoop/yarn/util/package-info.class
[error] /Users/bryan/.ivy2/cache/org.apache.hadoop/hadoop-yarn-api/jars/hadoop-yarn-api-2.2.0.jar:org/apache/hadoop/yarn/util/package-info.class
[error] deduplicate: different file contents found in the following:
[error] /Users/bryan/.ivy2/cache/org.apache.spark/spark-core_2.11/jars/spark-core_2.11-1.5.1.jar:org/apache/spark/unused/UnusedStubClass.class
[error] /Users/bryan/.ivy2/cache/org.apache.spark/spark-launcher_2.11/jars/spark-launcher_2.11-1.5.1.jar:org/apache/spark/unused/UnusedStubClass.class
[error] /Users/bryan/.ivy2/cache/org.spark-project.spark/unused/jars/unused-1.0.0.jar:org/apache/spark/unused/UnusedStubClass.class
[error] /Users/bryan/.ivy2/cache/org.apache.spark/spark-network-common_2.11/jars/spark-network-common_2.11-1.5.1.jar:org/apache/spark/unused/UnusedStubClass.class
[error] /Users/bryan/.ivy2/cache/org.apache.spark/spark-network-shuffle_2.11/jars/spark-network-shuffle_2.11-1.5.1.jar:org/apache/spark/unused/UnusedStubClass.class
[error] /Users/bryan/.ivy2/cache/org.apache.spark/spark-unsafe_2.11/jars/spark-unsafe_2.11-1.5.1.jar:org/apache/spark/unused/UnusedStubClass.class
[error] /Users/bryan/.ivy2/cache/org.apache.spark/spark-sql_2.11/jars/spark-sql_2.11-1.5.1.jar:org/apache/spark/unused/UnusedStubClass.class
[error] /Users/bryan/.ivy2/cache/org.apache.spark/spark-catalyst_2.11/jars/spark-catalyst_2.11-1.5.1.jar:org/apache/spark/unused/UnusedStubClass.class
[error] /Users/bryan/.ivy2/cache/org.apache.spark/spark-streaming_2.11/jars/spark-streaming_2.11-1.5.1.jar:org/apache/spark/unused/UnusedStubClass.class
[error] (commons/*:assembly) deduplicate: different file contents found in the following:
[error] /Users/bryan/.ivy2/cache/org.osgi/org.osgi.core/jars/org.osgi.core-4.3.1.jar:OSGI-OPT/bnd.bnd
[error] /Users/bryan/.ivy2/cache/org.osgi/org.osgi.compendium/jars/org.osgi.compendium-4.3.1.jar:OSGI-OPT/bnd.bnd

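I have tried all the solutions recommended in the following questions, without success:

sbt-assembly: deduplication found error

deduplicate commons-validator - sbt assembly

spark + sbt-assembly: "deduplicate: different file contents found in the following"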

4 Answers

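This is not exactly an answer to the question, but it is a workaround.

I hope it saves someone a few hundred hours of work.

Use sbt-native-packager instead of sbt-assembly.

Add to plugins.sbt: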

addSbtPlugin("com.typesafe.sbt" % "sbt-native-packager" % "1.0.0")

In your build.sbt:

enablePlugins(JavaAppPackaging)
enablePlugins(UniversalPlugin)

To build for multiple Scala versions, prefix the task with +:

+ universal:packageBin

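The output tells you where the file was created.

Unfortunately, the generated JARs are packed into a zip archive; the result is not a fat JAR. (Producing a fat JAR requires sbt-assembly, which has the same problem.)

To get around this, I wrote a simple task (in sbt) that unzips the generated archive and writes the JAR paths to a file, so that I can easily build a spark-submit script.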

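// Assumes a custom configuration is defined elsewhere in the build, e.g.
// lazy val TxtFormat = config("txtFormat")
packageBin in TxtFormat := {

    // Name of the zip produced by universal:packageBin (hard-coded for this project)
    val zippedJar = "core-backend-1.0.zip"
    val basePath = target.value / "universal"

    // Unzip to folder of JARs
    IO.unzip(basePath / zippedJar, basePath)

    val fileMappings = (mappings in Universal).value
    val sparkScriptOut = basePath / s"${packageName.value}.txt"

    // Append the path of every mapped file to the output file, one per line
    fileMappings foreach {
        case (file, name) => IO.append(sparkScriptOut, s"core-backend-1.0/$name${IO.Newline}")
    }
    sparkScriptOut
}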

After the zip has been built, run the task with:

+ txtFormat:packageBin
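
spark-submit accepts extra JARs as a comma-separated list through its --jars option, so the generated text file can be turned into that argument with a few lines of Scala. The following is only a sketch: the object name JarsArg and the helper fromLines are hypothetical, and it assumes the file holds one path per line, as the task above writes it. Non-JAR entries (such as launcher scripts in the mappings) are filtered out.

```scala
import scala.io.Source

// Hypothetical helper: turns the path list written by txtFormat:packageBin
// into a value suitable for spark-submit's --jars option.
object JarsArg {
  // Keep only lines that name a JAR and join them with commas.
  def fromLines(lines: Iterator[String]): String =
    lines.map(_.trim).filter(_.endsWith(".jar")).mkString(",")

  def main(args: Array[String]): Unit = {
    val source = Source.fromFile(args(0)) // path to the generated .txt file
    try println(fromLines(source.getLines()))
    finally source.close()
  }
}
```

The paths in the file are relative to target/universal, so the command that consumes the list would need to run from that directory or prefix each entry accordingly.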
answered 2015-10-02T22:28:21.827

The Spark dependencies should be provided by the cluster, so mark them as Provided:

val Spark =  Seq(
    "org.apache.spark" %% "spark-core" % sparkVersion % Provided,
    "org.apache.spark" %% "spark-sql" % sparkVersion % Provided,
    "org.apache.spark" %% "spark-streaming" % sparkVersion % Provided
)
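
A side effect of Provided is that Spark is no longer on the classpath when the application is started with `sbt run`. Below is a minimal sketch of a build.sbt that wires the sequence above into the project and restores the run classpath. It assumes sbt 0.13 syntax and that sparkVersion is 1.5.1, which is inferred from the JAR names in the question's error log; the `run in Compile` override follows the pattern suggested in the sbt-assembly README.

```scala
// build.sbt (sbt 0.13 syntax)
// Assumption: version taken from the 1.5.1 JAR names in the error log.
val sparkVersion = "1.5.1"

val Spark = Seq(
  "org.apache.spark" %% "spark-core"      % sparkVersion % Provided,
  "org.apache.spark" %% "spark-sql"       % sparkVersion % Provided,
  "org.apache.spark" %% "spark-streaming" % sparkVersion % Provided
)

libraryDependencies ++= Spark

// Provided dependencies are left out of the assembly and of the default
// run classpath; use the compile classpath so `sbt run` still sees Spark.
run in Compile := Defaults.runTask(
  fullClasspath in Compile,
  mainClass in (Compile, run),
  runner in (Compile, run)
).evaluated
```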
answered 2017-02-02T09:32:09.017

assemblyMergeStrategy in assembly := {
  case PathList("org", "apache", xs @ _*) => MergeStrategy.last
  case PathList("com", "google", xs @ _*) => MergeStrategy.last
  case x =>
    // Fall back to the plugin's default strategy for every other path
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}

This worked for me.
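
The two patterns above apply MergeStrategy.last to everything under org/apache and com/google. A narrower variant that targets only the paths reported in the question's log might look like the sketch below. The strategy chosen for each path is an assumption rather than something the log dictates, and keeping a single copy of the Guava classes only works if that copy is compatible with every library that calls it.

```scala
// build.sbt (sbt 0.13 syntax, sbt-assembly enabled)
assemblyMergeStrategy in assembly := {
  // OSGi build metadata, not needed at runtime (assumption)
  case PathList("OSGI-OPT", "bnd.bnd") => MergeStrategy.discard
  // Stub class that is shipped inside every Spark module JAR
  case PathList("org", "apache", "spark", "unused", "UnusedStubClass.class") =>
    MergeStrategy.first
  // package-info.class duplicated between hadoop-yarn-api and hadoop-yarn-common
  case PathList(ps @ _*) if ps.last == "package-info.class" => MergeStrategy.first
  // Guava classes also bundled in spark-network-common
  case PathList("com", "google", "common", xs @ _*) => MergeStrategy.last
  case x =>
    // Everything else keeps the plugin's default behaviour
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
```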

answered 2016-07-10T11:02:21.617

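I know this is an old question, but none of the solutions given so far handled the situation I ran into.

It seems that in some cases a JAR that is not marked "provided" depends on JARs that are. If a transitive dependency pulls in JARs with conflicting contents, you get the same error. In some cases, adding those JARs as dependencies marked "provided" did not resolve the conflicts between them when building the assembly. The only solution I found was to exclude them from the assembly explicitly, as follows.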

assemblyExcludedJars in assembly := {
    val cp = (fullClasspath in assembly).value
    cp filter { el =>
        (el.data.getName == "unused-1.0.0.jar") ||
        (el.data.getName == "spark-tags_2.11-2.1.0.jar")  
    }
}
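
Matching exact file names stops working as soon as one of the versions changes. A sketch of the same exclusion keyed on name prefixes instead is shown below; the prefix list is an assumption derived from the two file names above and would need adjusting for other builds.

```scala
// build.sbt (sbt 0.13 syntax, sbt-assembly enabled)
assemblyExcludedJars in assembly := {
  val cp = (fullClasspath in assembly).value
  // Hypothetical prefix list, derived from the two JAR names above
  val excludedPrefixes = Seq("unused-", "spark-tags_")
  cp filter { el =>
    val jarName = el.data.getName
    excludedPrefixes.exists(prefix => jarName.startsWith(prefix))
  }
}
```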
answered 2017-02-04T13:22:08.230