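I ran into something odd, especially since the code can give different output each time it is run. In short, I mistakenly used set to assign a value in a row beyond the last one, and instead of doing nothing, set created a data.table with negative length.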

library(data.table)

dt<-data.table(id=1:5, var=rnorm(5)) # normal example

set(dt, 6L, 1L, 3L) # doesn't set anything as expected.
dt
#
# now my real data, after I found the error in my code (incorrect row number in set)
#
dt1 <- data.table(ID = "29502509", FY = 2012, VAR = 61067.5442975645, 
                      startDate = structure(15062L, class = c("IDate", "Date")), 
                      endDate = structure(15429L, class = c("IDate", "Date")), 
                      start = "1750", end = "2404",
                      date = structure(15461L,class = c("IDate", "Date")),
                      DESCR = "JOB", NOTE = "NEW")

set(dt1, 12L, 3L, 62385.6516144086)
str(dt1)
Classes ‘data.table’ and 'data.frame':  1 obs. of  10 variables:
 $ ID       : chr "29502509"
 $ FY       : num 2012
 $ VAR      : num 61068
 $ startDate: IDate, format: "2011-03-29"
 $ endDate  :
Error in do.call(str, c(list(object = obj), aList, list(...)), quote = TRUE) : 
  negative length vectors are not allowed
> sapply(dt1, length)
        ID         FY        VAR  startDate    endDate      start        end       date 
         1          1          1          1 -637110831          1          1          1 
     DESCR       NOTE 
         1          1 
> dput(dt1)
structure(list(ID = "29502509", FY = 2012, VAR = 61067.5442975645, 
    startDate = structure(15062L, class = c("IDate", "Date")), 
    endDate = structure(, class = c("IDate", "Date")), start = "1750", # HERE
    end = "2404", date = structure(15461L, class = c("IDate", 
    "Date")), DESCR = "JOB", NOTE = "NEW"), .Names = c("ID", 
"FY", "VAR", "startDate", "endDate", "start", "end", "date", 
"DESCR", "NOTE"), row.names = c(NA, -1L), class = c("data.table", 
"data.frame"), .internal.selfref = <pointer: 0x0000000000130788>)

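As I said above, you may need to run the whole code a few times to see it, starting from the creation of the data.table, dt1 <- data.table(...), through set(dt1, ...), because I noticed that if it does not happen the first time, it will not happen unless I re-run dt1 <- data.table(...). Any ideas?

Edit:

To be specific, when I say different results, I mean that sometimes it does nothing (as expected), but most of the time it creates a negative-length column, which is always one of the Date columns, and sometimes it creates a whole data.table with negative rows. Also, in the last two cases (single column or whole data.table), the negative length is always -637110831.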

1 Answer

This looks like memory corruption, caused by writing past the memory allocated for the column.

The call goes to assign in assign.c. As of version 1.8.8, assign.c:434:

434             default :
435                 for (r=0; r<targetlen; r++)
436                     memcpy((char *)DATAPTR(targetcol) + (INTEGER(rows)[r]-1)*size, 
437                            (char *)DATAPTR(RHS) + (r%vlen) * size,
438                            size);

This code is reached (which should not be the case). At this point:

(gdb) p INTEGER(rows)[0]
$21 = 12
(gdb) p size
$23 = 8
answered 2013-06-03T20:44:28.030