r - Identity of rows in a data frame in R

Question

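I want to check whether two rows of a data frame are identical. I thought the identical() function would suit this task, but it does not work as expected. Here is a minimal example: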

x=factor(c("x","x"),levels=c("x","y"))
y=factor(c("y","y"),levels=c("x","y"))
df=data.frame(x,y)
df
  x y
1 x y
2 x y

identical(df[1,],df[2,])
[1] FALSE
df[1,]==df[2,]
     x    y
1 TRUE TRUE

Can anyone explain why identical() returns FALSE?

Thanks, Thomas

score 5 · Accepted Answer

identical(df[1,],df[2,])
#[1] FALSE
all.equal(df[1,],df[2,])
#[1] "Attributes: < Component 2: Mean relative difference: 1 >"

all.equal(df[1,],df[2,],check.attributes = FALSE)
#[1] TRUE

anyDuplicated(df[1:2,])>0
#[1] TRUE

score 2 · Accepted Answer

Try this function:

all.equal(df[1,],df[2,])
[1] "Attributes: < Component 2: Mean relative difference: 1 >"

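(In general, comparing factors can produce "unexpected" results...) In this case identical(), which tries to match everything, finds that the row.names differ, as you can see from the dput output: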

> dput(df[1,])
structure(list(x = structure(1L, .Label = c("x", "y"), class = "factor"), 
    y = structure(2L, .Label = c("x", "y"), class = "factor")), .Names = c("x", 
"y"), row.names = 1L, class = "data.frame")
> dput(df[2,])
structure(list(x = structure(1L, .Label = c("x", "y"), class = "factor"), 
    y = structure(2L, .Label = c("x", "y"), class = "factor")), .Names = c("x", 
"y"), row.names = 2L, class = "data.frame")

In this example, a simple == works:

> df[1,]==df[2,]
     x    y
1 TRUE TRUE
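The == comparison returns one logical value per column, so a single yes/no answer needs the result reduced with all(). The sketch below wraps that in a small helper; rows_equal is a hypothetical name, not a base R function, and treating a comparison involving NA as "not equal" is an assumption of this sketch rather than something stated in the answer.

```r
# Hypothetical helper: TRUE only if every column of rows i and j compares equal.
# isTRUE() turns an NA result (from missing values) into FALSE.
rows_equal <- function(d, i, j) {
  isTRUE(all(d[i, , drop = FALSE] == d[j, , drop = FALSE]))
}

# Data frame from the question
x <- factor(c("x", "x"), levels = c("x", "y"))
y <- factor(c("y", "y"), levels = c("x", "y"))
df <- data.frame(x, y)
rows_equal(df, 1, 2)    # TRUE

# Illustrative data with one differing row
df2 <- data.frame(a = c(1, 1, 2), b = c("u", "u", "v"))
rows_equal(df2, 1, 2)   # TRUE
rows_equal(df2, 1, 3)   # FALSE
```

Unlike identical(), this compares values only and ignores attributes such as row.names, which matches the behavior of all.equal(..., check.attributes = FALSE) for the example in the question.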
