
I am trying to write a large dataset (10 columns, 100M records) from R to SAP HANA using RJDBC's dbWriteTable, as follows:

library("RJDBC")
drv <- JDBC("com.sap.db.jdbc.Driver", "/data/hdbclient/ngdbc.jar", "'")
database <- dbConnect( drv,"jdbc:sap://servername", "USER", "PASS")

dbWriteTable(database, "largeSet", largeSet)

This works, but it is extremely slow (75k records per hour). I have also tested RODBC (sqlSave), which shows the same issue.

Looking at the code behind dbWriteTable, it seems that rows are written one at a time (i.e. the same as individual INSERT INTO statements). Indeed, inserting line by line with dbSendUpdate shows the same performance. I have verified that the connection speed itself is not the problem.

ROracle has a bulk_write option which seems to solve this issue, but since I am trying to write to HANA I need RJDBC or RODBC.

Can anyone tell me how I can speed up the write to HANA by running a bulk write or some other method?


1 Answer


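If your main goal is a speed-up without changing too much else, you could switch to the sjdbc package, which performs much better in this respect than RJDBC (which, sadly, has not received much attention in recent years).

As I was writing this and checking CRAN again, it looks like Simon has recently come back to it and published a new release only a week ago. That release does in fact include an improvement to dbSendUpdate: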

https://cran.r-project.org/web/packages/RJDBC/NEWS
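
To illustrate how that dbSendUpdate improvement might be used, here is a minimal sketch of a batched insert. It assumes that the installed RJDBC version sends vector parameters of a prepared statement as a batch (check the NEWS file above for your version), and that the target table already exists with matching columns. The helpers insert_sql and write_batched are hypothetical names, not part of RJDBC, and the batch size of 10000 is an arbitrary starting point.

```r
# Sketch only: insert a data frame in chunks through one prepared statement.
# Assumes dbSendUpdate() treats vector parameters as a batch insert.

# Build "INSERT INTO <table> (<cols>) VALUES (?, ?, ...)" for a data frame.
insert_sql <- function(table, df) {
  cols  <- paste(names(df), collapse = ", ")
  marks <- paste(rep("?", ncol(df)), collapse = ", ")
  sprintf("INSERT INTO %s (%s) VALUES (%s)", table, cols, marks)
}

# Hypothetical helper, not an RJDBC function.
# `send` defaults to RJDBC::dbSendUpdate; the default is only evaluated
# when it is actually used, so another function can be swapped in.
write_batched <- function(conn, table, df, batch_size = 10000,
                          send = RJDBC::dbSendUpdate) {
  sql <- insert_sql(table, df)
  n <- nrow(df)
  if (n > 0) {
    for (start in seq(1, n, by = batch_size)) {
      end <- min(start + batch_size - 1, n)
      chunk <- df[start:end, , drop = FALSE]
      # Factors are converted to character so plain vectors are passed on.
      chunk[] <- lapply(chunk, function(x) {
        if (is.factor(x)) as.character(x) else x
      })
      # One vector per column, in column order, as statement parameters.
      params <- unname(as.list(chunk))
      do.call(send, c(list(conn, sql), params))
    }
  }
  invisible(n)
}
```

With the connection from the question, the call would be write_batched(database, "largeSet", largeSet). Whether this is actually faster against HANA depends on the driver and on the RJDBC version, so it is worth timing on a small sample first.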

Answered 2018-02-01T13:34:35.787