r - Select rows of a data frame based on matching levels of a covariate in two data frames

Question

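I am currently working with two different data frames, one of which is very long (long). I need to select all rows of long whose id_type value appears at least once in the other (smaller) data set.

Suppose the two data frames are: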

long <- read.table(text = "
  id_type   x1   x2

   1       0     0  
   1       0     1
   1       1     0
   1       1     1
   2       0     0
   2       0     1
   2       1     0
   2       1     1
   3       0     0  
   3       0     1
   3       1     0
   3       1     1
   4       0     0  
   4       0     1
   4       1     0
   4       1     1", 
header=TRUE)

and

short <- read.table(text = "
  id_type   y1   y2    

   1       5     6    
   1       5     5    
   2       7     9", 
     header=TRUE)

In practice, what I would like to obtain is:

 id_type   x1   x2    

  1       0     0  
  1       0     1
  1       1     0
  1       1     1
  2       0     0  
  2       0     1
  2       1     0
  2       1     1

I tried out <- long[long[,"id_type"]==short[,"id_type"], ], but it is clearly wrong. How would you proceed? Thanks.

score 2 · Accepted Answer

Just use %in%:

out <- long[long$id_type %in% short$id_type, ]

See ?"%in%".
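
To illustrate why the == attempt from the question goes wrong while %in% works, here is a minimal sketch. It rebuilds the two data frames with data.frame() instead of read.table() purely for brevity; the names eq_mask and in_mask are hypothetical helpers introduced for this illustration.

```r
# Stand-ins for the data frames in the question (same values, built inline)
long <- data.frame(id_type = rep(1:4, each = 4),
                   x1 = rep(c(0, 0, 1, 1), times = 4),
                   x2 = rep(c(0, 1), times = 8))
short <- data.frame(id_type = c(1, 1, 2),
                    y1 = c(5, 5, 7),
                    y2 = c(6, 5, 9))

# == compares element by element and recycles the shorter vector (1, 1, 2),
# so row i of long is only checked against position i of the recycled ids.
# R warns here because 16 is not a multiple of 3; the warning is suppressed
# only to keep the illustration quiet.
eq_mask <- suppressWarnings(long$id_type == short$id_type)

# %in% asks, for each element on the left, whether it occurs anywhere on the right
in_mask <- long$id_type %in% short$id_type

out <- long[in_mask, ]
```

With these data the == mask keeps only some of the rows with id_type 1 or 2, whereas the %in% mask keeps all eight of them.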

Answered 2013-01-21T12:27:32.107

score 2 · Accepted Answer

You were missing %in%:

> long[long$id_type %in% unique(short$id_type),]
  id_type x1 x2
1       1  0  0
2       1  0  1
3       1  1  0
4       1  1  1
5       2  0  0
6       2  0  1
7       2  1  0
8       2  1  1
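
As a side note, the unique() call should not change the result, since %in% only tests membership and duplicates in the lookup vector make no difference. A small sketch under that assumption, using a hypothetical ids vector in place of short$id_type, which also shows how negating the mask selects the remaining rows:

```r
# Stand-in for the long data frame from the question
long <- data.frame(id_type = rep(1:4, each = 4),
                   x1 = rep(c(0, 0, 1, 1), times = 4),
                   x2 = rep(c(0, 1), times = 8))

# Hypothetical stand-in for short$id_type, including the duplicated 1
ids <- c(1, 1, 2)

# Membership test with and without de-duplicating the lookup values
with_unique    <- long[long$id_type %in% unique(ids), ]
without_unique <- long[long$id_type %in% ids, ]

# Both subsets contain the same rows
same_result <- identical(with_unique, without_unique)

# Negating the mask selects the rows whose id_type is absent from ids
rest <- long[!(long$id_type %in% ids), ]
```

For a very large lookup vector, de-duplicating first may still be worth trying, but that is a performance question rather than a correctness one.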
