I am trying to write a large dataset (10 columns, 100M records) from R to SAP HANA using RJDBC's dbWriteTable, as follows:
library("RJDBC")
drv <- JDBC("com.sap.db.jdbc.Driver", "/data/hdbclient/ngdbc.jar", "'")
database <- dbConnect(drv, "jdbc:sap://servername", "USER", "PASS")
dbWriteTable(database, "largeSet", largeSet)
This works, but it is extremely slow (75k records per hour). I have tested RODBC (sqlSave) as well, and it shows the same issue.
Looking at the code behind dbWriteTable, it seems that writing is done record by record (i.e. the same as issuing one INSERT INTO per row), and indeed a row-by-row INSERT INTO using dbSendUpdate shows the same performance. I have verified that the problem is not in the connection speed itself.
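For reference, here is a minimal sketch of the row-by-row pattern described above. The helper names build_insert_sql and insert_row_by_row are hypothetical, and the sketch assumes the target table already exists with column names matching the data frame and that all columns are plain numeric or character vectors. It relies only on dbSendUpdate accepting prepared-statement parameters after the SQL string.

```r
# Hypothetical helper: build a parameterised INSERT with one "?" per column.
build_insert_sql <- function(table, df) {
  cols  <- paste(names(df), collapse = ", ")
  marks <- paste(rep("?", ncol(df)), collapse = ", ")
  sprintf("INSERT INTO %s (%s) VALUES (%s)", table, cols, marks)
}

# Hypothetical helper: send one INSERT per record, i.e. one round trip per row.
# Assumes `conn` is an open RJDBC connection and the table already exists.
insert_row_by_row <- function(conn, table, df) {
  sql <- build_insert_sql(table, df)
  for (i in seq_len(nrow(df))) {
    # The i-th value of each column becomes one prepared-statement parameter.
    params <- unname(lapply(df, function(col) col[[i]]))
    do.call(RJDBC::dbSendUpdate, c(list(conn, sql), params))
  }
  invisible(nrow(df))
}
```

Calling insert_row_by_row(database, "largeSet", largeSet) would issue one statement per record, which is the pattern that shows the same throughput as dbWriteTable here.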
ROracle has a bulk_write option which seems to solve this issue, but since I am trying to write to HANA I need RJDBC or RODBC.
Can anyone tell me how I can speed up the write to HANA by running a bulk write or some other method?