Apache Toree and Spark Scala Not Working in Jupyter

Posted 2020-02-28 23:00

Question:

I'm having problems running Scala Spark on Jupyter. Below is the error I get when I open an Apache Toree - Scala notebook in Jupyter.

root@ubuntu-2gb-sgp1-01:~# jupyter notebook --ip 0.0.0.0 --port 8888
[I 03:14:54.281 NotebookApp] Serving notebooks from local directory: /root
[I 03:14:54.281 NotebookApp] 0 active kernels
[I 03:14:54.281 NotebookApp] The Jupyter Notebook is running at: http://0.0.0.0:8888/
[I 03:14:54.281 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 03:14:54.282 NotebookApp] No web browser found: could not locate runnable browser.
[I 03:15:09.976 NotebookApp] 302 GET / (61.6.68.44) 1.21ms
[I 03:15:15.924 NotebookApp] Creating new notebook in
[W 03:15:16.592 NotebookApp] 404 GET /nbextensions/widgets/notebook/js/extension.js?v=20161120031454 (61.6.68.44) 15.49ms referer=http://188.166.235.21:8888/notebooks/Untitled2.ipynb?kernel_name=apache_toree_scala
[I 03:15:16.677 NotebookApp] Kernel started: 94a63354-d294-4de7-a12c-2e05905e0c45
Starting Spark Kernel with SPARK_HOME=/usr/local/spark
16/11/20 03:15:18 [INFO] o.a.t.Main$$anon$1 - Kernel version: 0.1.0.dev8-incubating-SNAPSHOT
16/11/20 03:15:18 [INFO] o.a.t.Main$$anon$1 - Scala version: Some(2.10.4)
16/11/20 03:15:18 [INFO] o.a.t.Main$$anon$1 - ZeroMQ (JeroMQ) version: 3.2.2
16/11/20 03:15:18 [INFO] o.a.t.Main$$anon$1 - Initializing internal actor system
Exception in thread "main" java.lang.NoSuchMethodError: scala.collection.immutable.HashSet$.empty()Lscala/collection/immutable/HashSet;
        at akka.actor.ActorCell$.<init>(ActorCell.scala:336)
        at akka.actor.ActorCell$.<clinit>(ActorCell.scala)
        at akka.actor.RootActorPath.$div(ActorPath.scala:185)
        at akka.actor.LocalActorRefProvider.<init>(ActorRefProvider.scala:465)
        at akka.actor.LocalActorRefProvider.<init>(ActorRefProvider.scala:453)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$2.apply(DynamicAccess.scala:78)
        at scala.util.Try$.apply(Try.scala:192)
        at akka.actor.ReflectiveDynamicAccess.createInstanceFor(DynamicAccess.scala:73)
        at akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$3.apply(DynamicAccess.scala:84)
        at akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$3.apply(DynamicAccess.scala:84)
        at scala.util.Success.flatMap(Try.scala:231)
        at akka.actor.ReflectiveDynamicAccess.createInstanceFor(DynamicAccess.scala:84)
        at akka.actor.ActorSystemImpl.liftedTree1$1(ActorSystem.scala:585)
        at akka.actor.ActorSystemImpl.<init>(ActorSystem.scala:578)
        at akka.actor.ActorSystem$.apply(ActorSystem.scala:142)
        at akka.actor.ActorSystem$.apply(ActorSystem.scala:109)
        at org.apache.toree.boot.layer.StandardBareInitialization$class.createActorSystem(BareInitialization.scala:71)
        at org.apache.toree.Main$$anon$1.createActorSystem(Main.scala:35)
        at org.apache.toree.boot.layer.StandardBareInitialization$class.initializeBare(BareInitialization.scala:60)
        at org.apache.toree.Main$$anon$1.initializeBare(Main.scala:35)
        at org.apache.toree.boot.KernelBootstrap.initialize(KernelBootstrap.scala:72)
        at org.apache.toree.Main$delayedInit$body.apply(Main.scala:40)
        at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
        at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
        at scala.App$$anonfun$main$1.apply(App.scala:76)
        at scala.App$$anonfun$main$1.apply(App.scala:76)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
        at scala.App$class.main(App.scala:76)
        at org.apache.toree.Main$.main(Main.scala:24)
        at org.apache.toree.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
[W 03:15:26.738 NotebookApp] Timeout waiting for kernel_info reply from 94a63354-d294-4de7-a12c-2e05905e0c45

When I run spark-shell, this is the log output:

root@ubuntu-2gb-sgp1-01:~# spark-shell
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/11/20 03:17:11 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/11/20 03:17:12 WARN Utils: Your hostname, ubuntu-2gb-sgp1-01 resolves to a loopback address: 127.0.1.1; using 10.15.0.5 instead (on interface eth0)
16/11/20 03:17:12 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
16/11/20 03:17:13 WARN SparkContext: Use an existing SparkContext, some configuration may not take effect.
Spark context Web UI available at http://10.15.0.5:4040
Spark context available as 'sc' (master = local[*], app id = local-1479611833426).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.0.2
      /_/

Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_111)
Type in expressions to have them evaluated.
Type :help for more information.

scala>

This problem has been reported before in JIRA (https://issues.apache.org/jira/browse/TOREE-336). However, I'm still unable to get it working.

I followed the quick-start instructions on the official site: https://toree.apache.org/documentation/user/quick-start

This is my PATH:

scala> root@ubuntu-2gb-sgp1-01:~# echo $PATH
/root/bin:/root/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/local/spark:/usr/local/spark/bin

Please note that I didn't install Scala separately, as it comes with Spark.

Thanks

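Answer 1:

We haven't used Spark 2.0 with Scala 2.11 and notebooks in production yet. The root cause of your error is a version incompatibility: the Toree kernel log reports Scala 2.10.4, while your spark-shell runs on Scala 2.11.8, and a NoSuchMethodError on a scala.collection class is typical of mixing the two. According to the Toree description on GitHub, the latest supported Scala version is 2.10.4. Try downgrading to Scala 2.10, unless you specifically need 2.11 in production.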
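The version check described above can be sketched in shell. This is only an illustration: the helper name scala_binary_version is hypothetical, and the two version strings are copied by hand from the kernel log and the spark-shell banner in the question rather than detected automatically.

```shell
#!/bin/sh
# Hypothetical helper: reduce a full Scala version (e.g. 2.11.8)
# to its binary version (e.g. 2.11), which is what must match.
scala_binary_version() {
  printf '%s\n' "$1" | cut -d. -f1,2
}

# Assumption: values copied by hand from the logs in the question.
kernel_scala="2.10.4"   # "Scala version: Some(2.10.4)" in the Toree kernel log
spark_scala="2.11.8"    # "Using Scala version 2.11.8" in the spark-shell banner

if [ "$(scala_binary_version "$kernel_scala")" = "$(scala_binary_version "$spark_scala")" ]; then
  result="compatible"
else
  result="mismatch"
fi

echo "$result"
```

With the values from the question this prints "mismatch", so one side has to change: either a Toree build that targets the same Scala line as Spark, or a Spark build on Scala 2.10.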