I am trying to create a connection to IBM COS (Cloud Object Storage) using Spark. Spark version = 2.4.4, Scala version = 2.11.12.
I am running it locally with the correct credentials, but I am seeing the following error: "No FileSystem for scheme: cos".
I am sharing the code snippet along with the error log below. Could someone help me resolve this?
Thanks in advance!
Code snippet:
import com.ibm.ibmos2spark.CloudObjectStorage
import org.apache.spark.sql.SparkSession

object CosConnection extends App {

  // COS credentials (placeholders for the real values)
  var credentials = scala.collection.mutable.HashMap[String, String](
    "endPoint"  -> "ENDPOINT",
    "accessKey" -> "ACCESSKEY",
    "secretKey" -> "SECRETKEY"
  )

  var bucketName = "FOO"
  var objectname = "xyz.csv"
  var configurationName = "softlayer_cos"

  val spark = SparkSession
    .builder()
    .appName("Connect IBM COS")
    .master("local")
    .getOrCreate()

  // Register the Stocator COS client for the "cos" scheme
  spark.sparkContext.hadoopConfiguration.set("fs.stocator.scheme.list", "cos")
  spark.sparkContext.hadoopConfiguration.set("fs.stocator.cos.impl", "com.ibm.stocator.fs.cos.COSAPIClient")
  spark.sparkContext.hadoopConfiguration.set("fs.stocator.cos.scheme", "cos")

  var cos = new CloudObjectStorage(spark.sparkContext, credentials, configurationName = configurationName)

  // Read the CSV object directly from COS
  var dfData1 = spark
    .read.format("org.apache.spark.sql.execution.datasources.csv.CSVFileFormat")
    .option("header", "true")
    .option("inferSchema", "true")
    .load(cos.url(bucketName, objectname))

  dfData1.printSchema()
  dfData1.show(5, 0)
}
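For context, the snippet assumes the ibmos2spark and Stocator libraries are on the classpath. A minimal build.sbt sketch along those lines (the artifact versions here are illustrative, not necessarily the exact ones in my project):

build.sbt:
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql" % "2.4.4",
  // ibmos2spark wraps the COS credential handling used in the snippet above
  "com.ibm.ibmos2spark" %% "ibmos2spark" % "1.1.1",   // version illustrative
  // Stocator provides the Hadoop FileSystem implementation behind the cos scheme
  "com.ibm.stocator" % "stocator" % "1.0.36"          // version illustrative
)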
Error:
Exception in thread "main" java.io.IOException: No FileSystem for scheme: cos
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2586)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2593)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2632)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2614)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
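Possibly relevant: the Stocator configuration examples I have seen also map the bare cos scheme to Stocator's FileSystem wrapper via fs.cos.impl, which the snippet above does not set. A sketch of that extra property (I have not confirmed whether this is the missing piece):

// Assumption: the cos scheme may also need a FileSystem implementation registered directly
spark.sparkContext.hadoopConfiguration.set("fs.cos.impl", "com.ibm.stocator.fs.ObjectStoreFileSystem")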