
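I use some variables, but once they have been used I no longer need them, so I want to delete them and free the memory. However, the function rm() does not seem to help: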

memory.size()
30.69
tmp=matrix(rnorm(6e5*20),6e5,20)
memory.size()
207.64
rm(tmp)
memory.size()
207.64

Does this mean that tmp was deleted but its memory was not released?


1 Answer


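I use gc() to free RAM between operations. Below is an example of how I use it in a loop, but see here for a more detailed discussion, and the gc() documentation for more about memory management during an R session.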

# load library
library(topicmodels)

# get data
data("AssociatedPress")

# set number of topics to start with
k <- 20

# set model options
control_LDA_VEM <-
list(estimate.alpha = TRUE, alpha = 50/k, estimate.beta = TRUE,
verbose = 0, prefix = tempfile(), save = 0, keep = 0,
seed = as.integer(100), nstart = 1, best = TRUE,
var = list(iter.max = 10, tol = 10^-6),
em = list(iter.max = 10, tol = 10^-4),
initialize = "random")


# create the sequence that stores the number of topics to 
# iterate over
sequ <- seq(20, 300, by = 20)

# basic loop to iterate over different topic numbers with gc
# after each run to empty out RAM
lda <- vector(mode = 'list', length = length(sequ))
for (i in seq_along(sequ)) {
  # index by position so the list keeps exactly length(sequ) elements
  lda[[i]] <- LDA(AssociatedPress[1:20,], k = sequ[i], method = "VEM", control = control_LDA_VEM)
  gc() # here's where I put the garbage collection to free up memory before the next round of the loop
}

# convert list output to dataframe (suggestions for a simpler method are welcome!)
best.model.logLik <- data.frame(logLik = as.matrix(lapply(lda, logLik)), ntopic = sequ)

# plot
with(best.model.logLik, plot(ntopic, logLik, type = 'l', xlab="Number of topics", ylab="Log likelihood"))


# print ordered dataframe to see which number of topics has the highest log likelihood
(best.model.logLik.sort <- best.model.logLik[order(-as.numeric(best.model.logLik$logLik)), ]) 
    logLik       ntopic
2  -17904.12     40
3  -18105.48     60
1  -18181.84     20
4   -18569.7     80
5  -19736.94    100
6   -21919.6    120
7  -23785.08    140
8  -24914.23    160
9  -25493.76    180
10 -25837.64    200
11 -25964.23    220
12 -26061.01    240
13 -26117.92    260
14 -26149.44    280
15 -26168.91    300
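
Applied to the matrix from the question, the same idea can be sketched as follows. rm() only removes the name tmp; the memory behind it is reclaimed when the garbage collector runs, which gc() forces. Because memory.size() is Windows-only, this sketch reads the "used" Vcells figure (in Mb) from the matrix that gc() returns instead. The helper name vcells_mb is hypothetical and not part of R.

```r
# Hypothetical helper (not part of R): run a garbage collection and
# return the vector-heap memory currently in use, in Mb.
# Column 2 of gc()'s return matrix is the "used" figure in Mb.
vcells_mb <- function() gc()["Vcells", 2]

before <- vcells_mb()

# Same allocation as in the question: 6e5 x 20 doubles, roughly 90 Mb
tmp <- matrix(rnorm(6e5 * 20), 6e5, 20)
during <- vcells_mb()

# Remove the binding, then let the helper's gc() call reclaim the memory
rm(tmp)
after <- vcells_mb()

print(c(before = before, during = during, after = after))
```

The "after" figure should fall back close to "before". How much of that memory the R process hands back to the operating system depends on the platform and allocator, so an external process monitor may still show a higher number.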
Answered 2013-03-22T04:53:49.247