
A data frame small enough to load into R can still occasionally hit the memory ceiling during a dbWriteTable call if it's close to the maximum available RAM. I'm wondering whether there's a better solution than reading the table into RAM in chunks, as in the code below?

I'm trying to write code that will run on older computers, so I'm using the 32-bit Windows build of R to reproduce these memory errors.

# this example will only work on a computer with at least 3GB of RAM
# because it intentionally maxes out the 32-bit limit

# create a data frame that barely fits inside 32-bit R's memory capacity
x <- mtcars[ rep( seq( nrow( mtcars ) ) , 400000 ) , ]

# check how many records this table contains
nrow( x )

# create a connection to a SQLite database
# not stored in memory
library( RSQLite )
tf <- tempfile()
db <- dbConnect( SQLite() , tf )


# storing `x` in the database with dbWriteTable breaks.
# this line causes a memory error
# dbWriteTable( db , 'x' , x )

# but storing it in chunks works!
chunks <- 100

starts.stops <- floor( seq( 1 , nrow( x ) , length.out = chunks ) )


for ( i in 2:( length( starts.stops ) )  ){

    if ( i == 2 ){
        rows.to.add <- ( starts.stops[ i - 1 ] ):( starts.stops[ i ] )
    } else {
        rows.to.add <- ( starts.stops[ i - 1 ] + 1 ):( starts.stops[ i ] )
    }

    # storing `x` in the database with dbWriteTable in chunks works.
    dbWriteTable( db , 'x' , x[ rows.to.add , ] , append = TRUE )
}


# and it's the correct number of lines.
dbGetQuery( db , "select count(*) from x" )
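For reference, the chunk boundaries can also be computed without the first-iteration special case by splitting the row indices into groups up front. This is a minimal sketch of the same chunked write; the `write_in_chunks` helper is my own naming, not part of RSQLite:

```r
library( RSQLite )

# hypothetical helper: append a data frame to a SQLite table
# in `chunks` roughly equal pieces, so only one piece's worth of
# copied rows is in memory at a time
write_in_chunks <- function( db , name , df , chunks = 100 ){

    # assign each row index to one of `chunks` groups
    groups <- cut( seq_len( nrow( df ) ) , chunks , labels = FALSE )
    idx <- split( seq_len( nrow( df ) ) , groups )

    # write each group of rows, appending to the same table
    for ( rows in idx ) dbWriteTable( db , name , df[ rows , ] , append = TRUE )
}

# usage sketch against an in-memory database
db2 <- dbConnect( SQLite() , ":memory:" )
write_in_chunks( db2 , 'mt' , mtcars , chunks = 4 )
dbGetQuery( db2 , "select count(*) from mt" )
```

This avoids the off-by-one bookkeeping in the explicit loop above, but it is the same chunking idea, not a fundamentally different fix for the memory limit.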
