A data frame that is small enough to load into R can still occasionally hit the memory ceiling during a dbWriteTable call if it is close to the maximum available RAM. Is there a better solution than reading the table into the database in chunks, like the code below?
I am trying to write code that will run on older computers, so I am using the 32-bit Windows build of R to reproduce these memory errors.
# this example will only work on a computer with at least 3GB of RAM
# because it intentionally maxes out the 32-bit limit
# create a data frame that barely fits inside 32-bit R's memory capacity
x <- mtcars[ rep( seq( nrow( mtcars ) ) , 400000 ) , ]
# check how many records this table contains...
nrow( x )
# create a connection to a SQLite database
# not stored in memory
library( RSQLite )
tf <- tempfile()
db <- dbConnect( SQLite() , tf )
# storing `x` in the database with dbWriteTable breaks.
# this line causes a memory error
# dbWriteTable( db , 'x' , x )
# but storing it in chunks works!
chunks <- 100
starts.stops <- floor( seq( 1 , nrow( x ) , length.out = chunks ) )
for ( i in 2:( length( starts.stops ) ) ){

	if ( i == 2 ){
		rows.to.add <- ( starts.stops[ i - 1 ] ):( starts.stops[ i ] )
	} else {
		rows.to.add <- ( starts.stops[ i - 1 ] + 1 ):( starts.stops[ i ] )
	}

	# storing `x` in the database with dbWriteTable in chunks works.
	dbWriteTable( db , 'x' , x[ rows.to.add , ] , append = TRUE )
}
# and it's the correct number of lines.
dbGetQuery( db , "select count(*) from x" )
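The chunk-boundary arithmetic above is easy to get wrong (an off-by-one at the seams would duplicate or drop rows), so here is a small self-contained check of the same scheme on a toy row count — `n` and `chunks` here are made-up values for illustration, not the ones from the question:

```r
# toy check of the chunking scheme used above,
# with hypothetical values for n and chunks
n <- 1000
chunks <- 7

starts.stops <- floor( seq( 1 , n , length.out = chunks ) )

covered <- integer( 0 )
for ( i in 2:( length( starts.stops ) ) ){
	if ( i == 2 ){
		rows.to.add <- ( starts.stops[ i - 1 ] ):( starts.stops[ i ] )
	} else {
		rows.to.add <- ( starts.stops[ i - 1 ] + 1 ):( starts.stops[ i ] )
	}
	covered <- c( covered , rows.to.add )
}

# every row index 1..n appears exactly once, in order
stopifnot( identical( covered , 1:n ) )
```

The `i == 2` special case exists because the first chunk must start at row 1, while every later chunk starts one past the previous chunk's end.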