I don't know whether this is an integer64 (from bit64) problem or a melt (from reshape2) problem:

library(bit64)
library(reshape2)

DF = data.frame(I =letters, Num1 = as.integer64(1:26), Num2 = as.integer64(1:26))
DFM = melt(DF, id.vars = "I")

sapply(DF, class)
sapply(DFM, class)

This gives:

> sapply(DF, class)
          I        Num1        Num2 
   "factor" "integer64" "integer64" 
> sapply(DFM, class)
        I  variable     value 
 "factor"  "factor" "numeric" 

And because integer64 is a double underneath, the data gets "corrupted":

> DF
   I Num1 Num2
1  a    1    1
2  b    2    2
3  c    3    3
4  d    4    4
5  e    5    5
...
> DFM
   I variable         value
1  a     Num1 4.940656e-324
2  b     Num1 9.881313e-324
3  c     Num1 1.482197e-323
4  d     Num1 1.976263e-323
5  e     Num1 2.470328e-323
6  f     Num1 2.964394e-323

What causes this? Is it an integer64 problem or a melt problem? What can be done when creating a class to avoid this kind of thing?

3 Answers

This appears to be a limitation of the package, and it is also mentioned on page 9 of the documentation. For example:

x <- data.frame(a=as.integer64(1:5), b=as.integer64(1:5))
> x
#   a b
# 1 1 1
# 2 2 2
# 3 3 3
# 4 4 4
# 5 5 5

> unlist(x)

#            a1            a2            a3            a4            a5            b1 
# 4.940656e-324 9.881313e-324 1.482197e-323 1.976263e-323 2.470328e-323 4.940656e-324 
#            b2            b3            b4            b5 
# 9.881313e-324 1.482197e-323 1.976263e-323 2.470328e-323 

> as.matrix(x)
#                  a             b
# [1,] 4.940656e-324 4.940656e-324
# [2,] 9.881313e-324 9.881313e-324
# [3,] 1.482197e-323 1.482197e-323
# [4,] 1.976263e-323 1.976263e-323
# [5,] 2.470328e-323 2.470328e-323

x <- as.integer64(1:5)

> is.vector(x)
# [1] FALSE

> as.vector(x)
# [1] 4.940656e-324 9.881313e-324 1.482197e-323 1.976263e-323 2.470328e-323
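
The tiny values are not random. As a rough illustration, assuming (as the question notes) that integer64 keeps its 64-bit integer payload in the storage of a double, the integer k read back as a plain double is the subnormal number k * 2^-1074. The base R sketch below does not use bit64 at all; it only reconstructs those doubles arithmetically, and the names `k` and `raw_doubles` are made up for the example.

```r
# Assumption for illustration: an integer64 value k is stored as the raw
# 64-bit pattern of k inside a double. Interpreted as a double, that bit
# pattern is the subnormal number k * 2^-1074.
k <- 1:5
raw_doubles <- k * 2^-1074

# These should be the same tiny numbers that appear once the class is lost.
print(raw_doubles)
```

So once the "integer64" class attribute is dropped, R prints the underlying bits as ordinary doubles; the bits themselves are unchanged.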
answered 2013-02-15T11:28:47.247

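Resetting the class seems to "correct" the result, see below. However, as mentioned in the discussion, this does not work if the values also include types other than integer64.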

> class(DFM$value) <- "integer64"
> DFM
   I variable value
1  a     Num1     1
2  b     Num1     2
3  c     Num1     3
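
Another option is to avoid losing the class in the first place by stacking the measure columns with c() rather than flattening them. The sketch below is a hypothetical helper, `melt_keep_class`, and is not part of reshape2. It relies on the class having a c() method. The example uses Date so that it runs in base R; I believe bit64 supplies a c() method for integer64 as well, but treat that as an assumption to verify. It also assumes all measure columns share one class, since c() dispatches on its first argument only.

```r
# Hypothetical helper (not part of reshape2): stack the measure columns with
# c() so that classes with their own c() method keep their class.
melt_keep_class <- function(data, id.vars) {
  measure <- setdiff(names(data), id.vars)
  n <- nrow(data)
  # c() dispatches on the class of the first column
  value <- do.call(c, unname(as.list(data[measure])))
  # repeat the id columns once per measure column
  out <- data[rep(seq_len(n), times = length(measure)), id.vars, drop = FALSE]
  rownames(out) <- NULL
  out$variable <- factor(rep(measure, each = n), levels = measure)
  out$value <- value
  out
}

# Demonstrated with Date, another S3 class stored on top of a double
DF2 <- data.frame(I  = letters[1:3],
                  D1 = as.Date("2013-02-15") + 0:2,
                  D2 = as.Date("2013-03-15") + 0:2)
DFM2 <- melt_keep_class(DF2, id.vars = "I")
class(DFM2$value)
```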
answered 2013-02-15T11:29:28.930
I can reproduce this too.

Not a solution, but the problem seems to occur at the following line of the melt.data.frame function:

value <- unlist(unname(data[var$measure]))

In your example, this results in:

unlist(unname(DF[c("Num1","Num2")]))

and the unlist call changes the class of the data. As the help page says:

 The output type is determined from the highest type of the
 components in the hierarchy NULL < raw < logical < integer < real
 < complex < character < list < expression, after coercion of
 pairlists to lists.
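
The same behaviour can be seen without bit64. The sketch below uses Date, which is also an S3 class layered on a double vector, purely as an analogy. It passes `use.names = FALSE` instead of calling unname(), which should be equivalent for this purpose.

```r
# Date, like integer64, is a class attribute on top of a double vector.
dates <- data.frame(a = as.Date("2013-02-15") + 0:1,
                    b = as.Date("2013-02-15") + 2:3)

# unlist() picks the output type from the storage types and drops the class
flat <- unlist(dates, use.names = FALSE)
class(flat)

# The numbers are intact, so putting the class back recovers the dates
restored <- flat
class(restored) <- "Date"
restored
```

For Date the unclassed values still look like plausible numbers (days since 1970-01-01), which is why the loss is less dramatic there than with integer64.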
answered 2013-02-15T11:03:59.660