
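I use the data.table package in my own package, and I import the data.table namespace in both the NAMESPACE and DESCRIPTION files. In one of my functions, I use the data.table function to convert a data.frame to a data.table: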

dt <- data.table(df)

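But when I call my function, memory usage jumps immediately at the call to data.table() and R stops responding. The code in the function runs fine, with low memory consumption, when I run it line by line. Also, if I put library(data.table) inside my function, everything works. I am trying to avoid putting library(data.table) in my function and to declare the dependency instead, but something seems to go wrong that way. I am running R-2.14.0 on Mac OS X 10.6.8.

Can anyone explain what the cause might be and how I can fix it (without using library(data.table) in my function)?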


1 Answer


Some random guesses, in no particular order:

Try using the Imports or Depends field in DESCRIPTION only. I don't think you need to import in NAMESPACE as well, but I might be wrong. I don't know why that would explain the memory use, though.
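For reference, here is a minimal sketch of the declaration the question describes. As far as I understand Writing R Extensions, the Imports field in DESCRIPTION only declares the dependency, and it is the import() directive in NAMESPACE (or an explicit data.table:: prefix at each call) that makes the functions visible to the package's code, so treat this as one plausible layout rather than a confirmed fix for the memory problem.

DESCRIPTION (excerpt):

```
Imports:
    data.table
```

NAMESPACE (excerpt):

```
import(data.table)
```

With Depends instead of Imports, data.table is also attached to the search path whenever the package is loaded with library().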

What is df? Is it big or somehow recursive or strange in some way? Please provide str(df) to tell us something about it, if possible.

Try as.data.table(df), which is faster than data.table(df). But it sounds like your problem is something different.

Is your function being called repeatedly? I can see why repeatedly converting df to dt would use up memory, but not why just calling library(data.table) would make that fast.
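The two suggestions above (inspecting df and converting with as.data.table) can be sketched as follows. The data frame here is a small hypothetical stand-in, since the real df was not posted, and the data.table:: prefix is used only so the snippet runs without attaching the package.

```r
# Hypothetical stand-in for the asker's df; the real one was not posted
df <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.7))

# Compact summary of the structure: column names, types, dimensions
str(df)

# Direct conversion, without calling library(data.table)
dt <- data.table::as.data.table(df)

# Confirm the result really is a data.table
data.table::is.data.table(dt)
```

If str(df) shows list columns, nested data frames, or a very large number of rows, that would be worth including in the question.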

Try starting R with R --vanilla to ensure that no .RData file (which may include functions masking data.table's) is being loaded on startup, amongst other things. If you have developed your own package, then some kind of function name conflict, or the order of packages on the search() path, sounds plausible.

Otherwise we'll need more information please. I don't recall anything similar to this happening to me, or being reported before.

Also, which version of data.table are you using? There is this bug fix in v1.8.1 on R-Forge (not yet on CRAN):

  • Moved data.table setup code from .onAttach to .onLoad so that it is also run when data.table is simply imported from within a package, fixing #1916 related to missing data.table options.

But if you are using 1.8.0 from CRAN, and are Importing (only) rather than Depending, then I'd expect you to get an error about missing options rather than a jump in memory consumption.
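A few quick checks, run in the session where the problem occurs, would help answer the questions above. This is only a diagnostic sketch using base R functions; the variable names are illustrative.

```r
# Installed data.table version (relevant to the 1.8.0 vs 1.8.1 point)
dt_version <- packageVersion("data.table")
print(dt_version)

# Attached packages and environments, in lookup order
search_path <- search()
print(search_path)

# Objects in the global workspace, e.g. restored from a saved .RData
workspace_objects <- ls(globalenv())
print(workspace_objects)

# TRUE here would mean something in the workspace is named data.table
masked <- exists("data.table", envir = globalenv(), inherits = FALSE)
print(masked)
```

Posting this output along with str(df) would make the problem much easier to narrow down.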

Answered 2012-05-17T09:02:21.997