I installed Toree with pip and extracted the Spark binaries to

/home/ebe/.bin/spark-2.3.0-bin-hadoop2.7

The path above is stored in an environment variable named $SPARK_HOME. I installed the Jupyter kernel with the following command:

jupyter toree install --spark_home=$SPARK_HOME/ --user

When I start Jupyter Notebook (or Jupyter Lab) and open a new Apache Spark Scala notebook, the kernel never seems to activate. The following error message appears in the console:

[I 10:56:44.388 LabApp] Creating new notebook in /
[I 10:56:44.873 LabApp] Kernel started: f65565b1-3570-48a2-be7e-2756a058e156
Starting Spark Kernel with SPARK_HOME=/home/ebe/.bin/spark-2.3.0-bin-hadoop2.7/
2018-06-01 10:56:45 WARN  Utils:66 - Your hostname, Jackdaw resolves to a loopback address: 127.0.1.1; using 192.168.1.247 instead (on interface eno1)
2018-06-01 10:56:45 WARN  Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2018-06-01 10:56:46 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-06-01 10:56:46 INFO  Main$$anon$1:161 - Kernel version: 0.1.0-incubating
2018-06-01 10:56:46 INFO  Main$$anon$1:162 - Scala version: Some(2.10.4)
2018-06-01 10:56:46 INFO  Main$$anon$1:163 - ZeroMQ (JeroMQ) version: 3.2.5
2018-06-01 10:56:46 INFO  Main$$anon$1:70 - Initializing internal actor system
Exception in thread "main" java.lang.NoSuchMethodError: scala.collection.immutable.HashSet$.empty()Lscala/collection/immutable/HashSet;
    at akka.actor.ActorCell$.<init>(ActorCell.scala:336)
    at akka.actor.ActorCell$.<clinit>(ActorCell.scala)
    at akka.actor.RootActorPath.$div(ActorPath.scala:185)
    at akka.actor.LocalActorRefProvider.<init>(ActorRefProvider.scala:465)
    at akka.actor.LocalActorRefProvider.<init>(ActorRefProvider.scala:453)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$2.apply(DynamicAccess.scala:78)
    at scala.util.Try$.apply(Try.scala:192)
    at akka.actor.ReflectiveDynamicAccess.createInstanceFor(DynamicAccess.scala:73)
    at akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$3.apply(DynamicAccess.scala:84)
    at akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$3.apply(DynamicAccess.scala:84)
    at scala.util.Success.flatMap(Try.scala:231)
    at akka.actor.ReflectiveDynamicAccess.createInstanceFor(DynamicAccess.scala:84)
    at akka.actor.ActorSystemImpl.liftedTree1$1(ActorSystem.scala:585)
    at akka.actor.ActorSystemImpl.<init>(ActorSystem.scala:578)
    at akka.actor.ActorSystem$.apply(ActorSystem.scala:142)
    at akka.actor.ActorSystem$.apply(ActorSystem.scala:109)
    at org.apache.toree.boot.layer.StandardBareInitialization$class.createActorSystem(BareInitialization.scala:71)
    at org.apache.toree.Main$$anon$1.createActorSystem(Main.scala:34)
    at org.apache.toree.boot.layer.StandardBareInitialization$class.initializeBare(BareInitialization.scala:60)
    at org.apache.toree.Main$$anon$1.initializeBare(Main.scala:34)
    at org.apache.toree.boot.KernelBootstrap.initialize(KernelBootstrap.scala:70)
    at org.apache.toree.Main$delayedInit$body.apply(Main.scala:39)
    at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
    at scala.App$$anonfun$main$1.apply(App.scala:76)
    at scala.App$$anonfun$main$1.apply(App.scala:76)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
    at scala.App$class.main(App.scala:76)
    at org.apache.toree.Main$.main(Main.scala:23)
    at org.apache.toree.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
[W 10:56:54.895 LabApp] Timeout waiting for kernel_info reply from f65565b1-3570-48a2-be7e-2756a058e156

Why does the kernel report Scala version: Some(2.10.4) when it tries to start, while the Scala version in the Spark binaries is 2.11?

Using Scala version 2.11.8, OpenJDK 64-Bit Server VM, 1.8.0_172

Even the Scala version on my console is newer:

$ scala -version
Scala code runner version 2.12.5-20180321-173609-unknown -- Copyright 2002-2018, LAMP/EPFL and Lightbend, Inc.

I have tried installing different versions of Toree and run into the same problem.

How can I solve this?

OS: Manjaro Linux.


1 Answer


This was a huge pain for me too. The problem seems to be that the latest released version of Toree does not yet support Spark 2.x.
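The log above hints at this: the kernel reports Scala 2.10.4 while Spark 2.3 is built against Scala 2.11, and Scala binary versions are not compatible across 2.x minors, hence the `NoSuchMethodError` from Akka. Scala artifacts conventionally encode the binary version as a `_2.xx` suffix in the jar name, so you can spot a mismatch from the jar alone. A minimal sketch (the jar filename below is hypothetical, for illustration only):

```shell
# Hypothetical Toree assembly jar name; Scala libraries conventionally
# carry a _2.xx binary-version suffix in the artifact name (assumption:
# the jar you are inspecting follows that convention).
toree_jar="toree-kernel_2.10-0.1.0-incubating.jar"
spark_scala="2.11"   # Spark 2.3.0 prebuilt binaries ship Scala 2.11

# Pull the Scala binary version out of the artifact name.
jar_scala="${toree_jar#*_}"     # strip up to and including the first '_'
jar_scala="${jar_scala%%-*}"    # keep only the version part, e.g. 2.10

if [ "$jar_scala" != "$spark_scala" ]; then
    echo "Scala binary mismatch: kernel=$jar_scala, spark=$spark_scala"
fi
```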

The solution is to install it from source. This gist walks you through installing it on Ubuntu: https://gist.github.com/mikecroucher/b57a9e5a4c1a1a2045f30a901b186bdf

The short version is:

Install sbt: https://www.scala-sbt.org/1.0/docs/Setup.html

git clone https://github.com/apache/incubator-toree
cd incubator-toree/
make dist
make release

If you get the following error, you can ignore it:

/bin/sh: 1: docker: not found
Makefile:212: recipe for target 'dist/toree-pip/toree-0.2.0.dev1.tar.gz' failed
make: *** [dist/toree-pip/toree-0.2.0.dev1.tar.gz] Error 127

Then:

cd dist/toree-pip/
python setup.py install

Finally, you can install the Toree kernel:

jupyter toree install --kernel_name=bespoke_spark --spark_home=/path/to/spark  --user

As a bonus, remember to add:

spark.sql.catalogImplementation hive

to your Spark default configuration so that you can connect to Hive (if you need it).
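For instance, the setting goes in `$SPARK_HOME/conf/spark-defaults.conf`. A sketch of appending it (assumption: `SPARK_HOME` points at your Spark install; a scratch directory is used as a stand-in when it is unset, purely so the snippet is self-contained):

```shell
# Append the Hive catalog setting to Spark's defaults file.
# Fall back to a temp dir if SPARK_HOME is unset (illustration only).
SPARK_HOME="${SPARK_HOME:-$(mktemp -d)}"
mkdir -p "$SPARK_HOME/conf"
echo "spark.sql.catalogImplementation hive" >> "$SPARK_HOME/conf/spark-defaults.conf"
```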

answered 2018-06-03T20:12:42.577