I could not find a sparklyr built-in for listing the contents of a directory via Spark, so I am attempting to use invoke:
sc <- spark_connect(master = "yarn", config=config)
path <- 'gs:// ***path to bucket on google cloud*** '
spath <- sparklyr::invoke_new(sc, 'org.apache.hadoop.fs.Path', path)
fs <- sparklyr::invoke(spath, 'getFileSystem')
list <- sparklyr::invoke(fs, 'listLocatedStatus')
The getFileSystem call fails with:
Error: java.lang.Exception: No matched method found for class org.apache.hadoop.fs.Path.getFileSystem
at sparklyr.Invoke.invoke(invoke.scala:134)
at sparklyr.StreamHandler.handleMethodCall(stream.scala:123)
at sparklyr.StreamHandler.read(stream.scala:66) ...
Note: Are there guidelines for reproducible examples with distributed code? I don't know how to make an example others could follow, given that I am running against a particular Spark environment.