I have a strange problem in R.
I have a large data.table, dataTs1:
Classes ‘data.table’ and 'data.frame': 419172 obs. of 5 variables:
$ TimeStamp: chr "01MAR13:07:15:00" "01MAR13:07:16:00" "01MAR13:07:18:00" ...
$ col1 : chr "ALL1" "ALL1" "ALL1" "ALL1" ...
$ col2 : int NA NA NA NA NA NA NA NA NA NA ...
$ col3 : int 4 4 4 4 4 4 4 4 4 4 ...
$ col4 : int 621 810 4 4 8 1 3 1 1 1 ...
I loaded this table with the fread function.
Memory allocation seems fine:
> memory.size(max=TRUE)
[1] 82.94
I am trying to convert the first column (TimeStamp) to a POSIX date-time class, so I wrote:
dataTs1$TimeStamp <- strptime(dataTs1$TimeStamp,"%d%b%y:%H:%M:%S")
With this line I hit the 16 GB memory limit. But when I write:
test <- 1:length(dataTs1$TimeStamp)
dataTs1$TimeStamp <- test
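it works perfectly, without any memory overload.
I am quite new to R, so I would be grateful if you could help me figure out what I am doing wrong here.
Thanks
Edit:
Actually, when I do not run out of memory, I sometimes get a strange warning instead: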
>dataTs1[,TimeStamp:=strptime(TimeStamp,"%d%b%y:%H:%M:%S")]
Warning messages:
1: In `[<-.data.table`(x, j = name, value = value) :
Supplied 9 items to be assigned to 419172 items of column 'TimeStamp' (recycled leaving remainder of 6 items).
2: In `[<-.data.table`(x, j = name, value = value) :
Coerced 'list' RHS to 'character' to match the column's type. Either change the target column to 'list' first (by creating a new 'list' vector length 419172 (nrows of entire table) and assign that; i.e. 'replace' column), or coerce RHS to 'character' (e.g. 1L, NA_[real|integer]_, as.*, etc) to make your intent clear and for speed. Or, set the column type correctly up front when you create the table and stick to it, please.
> str(dataTs1)
Classes ‘data.table’ and 'data.frame': 419172 obs. of 5 variables:
$ TimeStamp: chr "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ ...
$ V6FCDSB : chr "ALL1" "ALL1" "ALL1" "ALL1" ...
$ V6FCDTD : int NA NA NA NA NA NA NA NA NA NA ...
$ _TYPE_ : int 4 4 4 4 4 4 4 4 4 4 ...
$ N : int 621 810 4 4 8 1 3 1 1 1 ...
- attr(*, ".internal.selfref")=<externalptr>
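
A possible reading of the warning above, offered as a sketch rather than a confirmed diagnosis: strptime() returns a POSIXlt object, which R stores internally as a list of components (seconds, minutes, hours, and so on) rather than as one value per row. That would be consistent with "Supplied 9 items" and with the 'list' RHS being coerced to character. Wrapping the result in as.POSIXct() gives a single vector with one value per row, which a data.table column can hold. The small table `dt`, its sample values, and the `tz = "UTC"` setting below are assumptions made for illustration, not part of the original data.

```r
library(data.table)

# Hypothetical stand-in for dataTs1, using the same timestamp format
dt <- data.table(
  TimeStamp = c("01MAR13:07:15:00", "01MAR13:07:16:00", "01MAR13:07:18:00"),
  col4      = c(621L, 810L, 4L)
)

# strptime() returns POSIXlt: a list of date-time components, not one value per row
# (%b is locale-dependent, so month abbreviations may parse as NA in a non-English locale)
lt <- strptime(dt$TimeStamp, "%d%b%y:%H:%M:%S", tz = "UTC")
class(lt)            # includes "POSIXlt"
length(unclass(lt))  # number of internal components (depends on the R version)

# as.POSIXct() collapses the components into a single vector, one element per row,
# and := replaces the column by reference without copying the whole table
dt[, TimeStamp := as.POSIXct(strptime(TimeStamp, "%d%b%y:%H:%M:%S", tz = "UTC"))]
str(dt)
```

If this reading is right, the same as.POSIXct() wrapper should apply to the full table. Whether it also removes the 16 GB memory spike would need to be checked on the real data.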