My question is about the feasibility of running a SparkR program on Spark without an R dependency.
In other words, can I run the following program on Spark when no R interpreter is installed on the machine?
# Set the SPARK_HOME environment variable
Sys.setenv(SPARK_HOME="/home/fazlann/Downloads/spark-1.5.0-bin-hadoop2.6")
# Tell R where to find the SparkR package
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"),"R","lib"), .libPaths()))
# Load SparkR into this environment
library(SparkR)
# Create the SparkContext
sc <- sparkR.init(master = "local")
# To work with DataFrames we need a SQLContext, which can be created from the SparkContext
sqlContext <- sparkRSQL.init(sc)
name <- c("Nimal","Kamal","Ashen","lan","Harin","Vishwa","Malin")
age <- c(23,24,12,25,31,22,43)
child <- c(TRUE,TRUE,FALSE,FALSE,TRUE,FALSE,TRUE)
localdf <- data.frame(name,age,child)
# Convert the local R data.frame into a Spark DataFrame
sparkdf <- createDataFrame(sqlContext, localdf)
# Since a Spark DataFrame is passed to head(), the operation is executed by Spark
head(sparkdf)
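For completeness, a script like this would normally shut down the context at the end; assuming the Spark 1.5.x SparkR API shown above, that would look like:

# Stop the SparkContext when the script is finished (Spark 1.5.x SparkR API)
sparkR.stop()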