I am using Spark 1.3.1 and want to store data in Hive in ORC format.
The line below fails with the following error. It looks like ORC is not supported as a data source in Spark 1.3.1:
dataframe.save("/apps/hive/warehouse/person_orc_table_5", "orc");
java.lang.RuntimeException: Failed to load class for data source: orc
at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.sources.ResolvedDataSource$.lookupDataSource(ddl.scala:194)
at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:237)
at org.apache.spark.sql.DataFrame.save(DataFrame.scala:1196)
at org.apache.spark.sql.DataFrame.save(DataFrame.scala:1156)
at SparkOrcHive.main(SparkOrcHive.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:577)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:174)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:197)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Spark 1.4 has
write.format("orc").partitionBy("age").save("peoplePartitioned")
for saving data in ORC format.
Is there a way to store files in ORC format in Spark 1.3.1?
Thanks,