
I'm trying to use some of H2O's machine learning functionality (via the rsparkling library) during a sparklyr session. I'm running on a Hadoop cluster.

Consider the following example:

library(dplyr)
library(sparklyr)
library(rsparkling)
library(h2o)

#configure the spark session and connect
sc = spark_connect(master = 'yarn-client',
                   spark_home = '/usr/hdp/current/spark-client',
                   app_name = 'sparklyr',
                   config = list(
                     "sparklyr.shell.executor-memory" = "1G",
                     "sparklyr.shell.driver-memory"   = "4G",
                     "spark.driver.maxResultSize"     = "2G" # may need to transfer a lot of data into R
                   )
)

mtcars_tbl <- copy_to(sc, mtcars, "mtcars")

mtcars_hf <- as_h2o_frame(sc=sc,x=mtcars_tbl,name='h_cars')

I get the following error:

Error: java.lang.IllegalArgumentException: Unsupported argument: (spark.dynamicAllocation.enabled,true)
        at org.apache.spark.h2o.backends.internal.InternalBackendUtils$$anonfun$checkUnsupportedSparkOptions$1.apply(InternalBackendUtils.scala:48)
        at org.apache.spark.h2o.backends.internal.InternalBackendUtils$$anonfun$checkUnsupportedSparkOptions$1.apply(InternalBackendUtils.scala:40)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at org.apache.spark.h2o.backends.internal.InternalBackendUtils$class.checkUnsupportedSparkOptions(InternalBackendUtils.scala:40)
        at org.apache.spark.h2o.backends.internal.InternalH2OBackend.checkUnsupportedSparkOptions(InternalH2OBackend.scala:31)
        at org.apache.spark.h2o.backends.internal.InternalH2OBackend.checkAndUpdateConf(InternalH2OBackend.scala:61)
        at org.apache.spark.h2o.H2OContext.<init>(H2OContext.scala:96)
        at org.apache.spark.h2o.H2OContext$.getOrCreate(H2OContext.scala:294)
        at org.apache.spark.h2o.H2OContext$.getOrCreate(H2OContext.scala:316)
        at org.apache.spark.h2o.H2OContext.getOrCreate(H2OContext.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sparklyr.Invoke$.invoke(invoke.scala:94)
        at sparklyr.StreamHandler$.handleMethodCall(stream.scala:89)
        at sparklyr.StreamHandler$.read(stream.scala:55)
        at sparklyr.BackendHandler.channelRead0(handler.scala:49)
        at sparklyr.BackendHandler.channelRead0(handler.scala:14)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
        at java.lang.Thread.run(Thread.java:745)

Any ideas?


2 Answers


Sparkling Water / RSparkling does not currently support dynamic allocation on Spark clusters, so you need to disable it:

config = list("spark.dynamicAllocation.enabled" = "false")
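
For example, merged into the spark_connect() call from the question, a minimal sketch (the spark_home path and memory settings are simply the ones from the question; adjust for your own cluster):

#connect with dynamic allocation disabled so Sparkling Water's internal backend accepts the config
sc <- spark_connect(master = 'yarn-client',
                    spark_home = '/usr/hdp/current/spark-client',
                    app_name = 'sparklyr',
                    config = list(
                      "sparklyr.shell.executor-memory"  = "1G",
                      "sparklyr.shell.driver-memory"    = "4G",
                      "spark.driver.maxResultSize"      = "2G",
                      "spark.dynamicAllocation.enabled" = "false"  # disable dynamic allocation
                    )
)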

answered 2017-04-27T19:52:08.067

For Python users:

from pysparkling import H2OConf, H2OContext

# The default of True causes: IllegalArgumentException: 'Unsupported argument: (spark.dynamicAllocation.enabled,true)'
conf = H2OConf(spark).set('spark.dynamicAllocation.enabled', False)
hc = H2OContext.getOrCreate(spark, conf)
answered 2020-02-05T18:08:51.843