Would like to use the custom scalac (Scala compiler) that ships with the Spark install sparklyr uses; it can be found in the RStudio SparkUI tab (or from spark_web(sc)) under Environment as /jars/scala-compiler-2.11.8.jar in the "System Environment" section, rather than downloading and installing scalac separately into a base directory, as suggested by the "hello world" example linked from the RStudio creating-extensions page http://spark.rstudio.com/extensions.html.
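For reference, a minimal sketch (assuming sparklyr's spark_home_dir() resolves to the same install shown in the SparkUI, and that the bundled compiler jar follows the usual scala-compiler-<version>.jar naming) that confirms where that jar sits on disk:

library(sparklyr)

# Look for the Scala compiler jar bundled under SPARK_HOME/jars
spark_jars <- file.path(spark_home_dir(), "jars")
list.files(spark_jars, pattern = "^scala-compiler-.*\\.jar$", full.names = TRUE)
# e.g. ".../spark-2.0.0-bin-hadoop2.7/jars/scala-compiler-2.11.8.jar"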
This is what I have so far on Ubuntu, but I am stuck on the error below. I set up a directory exactly like the GitHub repo used in the "hello world" example above. Any idea how to get past this error without installing scalac in one of the suggested base-path folders (i.e. /opt/scala, /opt/local/scala, /usr/local/scala, or ~/scala (Windows only))? I want sparklyr to use the native Spark install and relative paths for a given user.
library(titanic)
library(sparklyr)
# spark_web(sc) # Opens Web Console to find Scala Version and scalac
# Set working directory to the folder containing this script
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
sparkVers <- '2.0.0'; scalaVers <- '2.11.8'; packageName <- "sparkhello"
packageJarExtR <- spark_compilation_spec(
  spark_version = sparkVers,
  spark_home = spark_home_dir(),
  scalac_path = paste0(spark_home_dir(), "/jars/scala-compiler-", scalaVers, ".jar"),
  scala_filter = NULL,
  jar_name = sprintf(paste0(getwd(), "/inst/java/", packageName, "-%s-%s.jar"),
                     sparkVers, scalaVers)
)
sparklyr::compile_package_jars(spec = packageJarExtR)
# Error: No root directory found. Test criterion:
# Contains a file 'DESCRIPTION' with contents matching '^Package: '
# In addition: Warning message:
# running command ''/mnt/home/eyeOfTheStorm/.cache/spark/
# spark-2.0.0-bin-hadoop2.7/jars/scala-compiler-2.11.8.jar'
# -version 2>&1' had status 126
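# (Note: shell exit status 126 generally means "found but not executable";
#  the spec above appears to hand the scala-compiler jar itself to the shell
#  as if it were the scalac binary, so the `-version` check cannot run it.)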
###
library(sparkhello)
# Connect to local spark cluster and load data
sc <- spark_connect(master = "local", version = "2.0.0")
titanic_tbl <- copy_to(sc, titanic_train, "titanic", overwrite = TRUE)