RJDBC connects to Hive and reads data from it without problems, but writing data to Hive with dbWriteTable fails.
See below:
options(java.parameters = "-Xmx8g")
library(DBI)
library(rJava)
library(RJDBC)
# Collect every JAR the Hive JDBC driver needs for the class path
cp <- c(
  list.files("/tmp/R_hive_libs/cloudera_hive_jars", pattern = "[.]jar", full.names = TRUE, recursive = TRUE),
  list.files("/tmp/R_hive_libs/R_hadoop_libs", pattern = "[.]jar", full.names = TRUE, recursive = TRUE),
  list.files("/tmp/R_hive_libs/R_hadoop_libs/lib", pattern = "[.]jar", full.names = TRUE, recursive = TRUE),
  recursive = TRUE
)
drv <- JDBC(driverClass = "com.cloudera.hive.jdbc4.HS2Driver", classPath=cp)
conn <- dbConnect(drv, "jdbc:hive2://XXXXXX:10000/default", "user", "password")
show_databases <- dbGetQuery(conn, "show databases")
List_of_Tables <- dbListTables(conn)
data1 <- dbGetQuery(conn, "select * from XXX.xxx limit 10000")
data_to_write_back_to_hive <- data.frame(aggregate(data1$xxx.xxx, by=list(Month=data1$xxx.cmp_created_timestamp_month), FUN=sum))
data_to_write_back_to_hive[[2]] <- c(10, 20)
colnames(data_to_write_back_to_hive) <- c("Month", "Energy")
dbWriteTable(conn, "xxxx.checking", data_to_write_back_to_hive)
How can I write the data back to Hive? It gives the following error:
Error in .local(conn, statement, ...) : execute JDBC update query failed in dbSendUpdate ([Simba]HiveJDBCDriver ERROR processing query/statement. Error Code: 40000, SQL state: TStatus(statusCode:ERROR_STATUS, infoMessages:[*org.apache.hive.service.cli.HiveSQLException:Error while compiling statement: FAILED: ParseException line 1:36 mismatched input 'PRECISION' expecting ) near 'DOUBLE' in create table statement:28:27, org.apache.hive.service.cli.operation.Operation:toSQLException:Operation.java:326, org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:102, org.apache.hive.service.cli.operation.SQLOperation:runInternal:SQLOperation.java:171, org.apache.hive.service.cli.operation.Operation:run:Operation.java:268, org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementInternal:HiveSessionImpl.java:410, org.apache.hive.service.cli.session.HiveSessionImpl:executeStatement:HiveSessionImpl.java:391, sun.reflect.GeneratedMethodAccessor56:invoke::-1, sun.reflect.DelegatingMeth