My data looks like this:
> head(dbGetQuery(mydb, 'SELECT * FROM geneExpDiffData WHERE significant = "yes"'))
      gene_id sample_1 sample_2 status value_1 value_2 log2_fold_change test_stat p_value   q_value significant
1 XLOC_000219       M4       M3     OK 3.85465 0.00000             -Inf        NA   5e-05 0.0075951         yes
2 XLOC_004272       M4       M3     OK 2.06687 0.00000             -Inf        NA   5e-05 0.0075951         yes
3 XLOC_004991       M4       M3     OK 3.29904 0.00000             -Inf        NA   5e-05 0.0075951         yes
4 XLOC_007234       M4       M3     OK 1.28027 0.00000             -Inf        NA   5e-05 0.0075951         yes
5 XLOC_000664       M4       F4     OK 1.46853 0.00000             -Inf        NA   5e-05 0.0075951         yes
6 XLOC_001809       M4       F4     OK 0.00000 1.91743              Inf        NA   5e-05 0.0075951         yes
I used RSQLite to generate two subsets:
M4M3 <- dbGetQuery(mydb, 'SELECT * FROM geneExpDiffData WHERE significant = "yes" AND sample_1 = "M4" AND sample_2 = "M3"')
M4F4 <- dbGetQuery(mydb, 'SELECT * FROM geneExpDiffData WHERE significant = "yes" AND sample_1 = "M4" AND sample_2 = "F4"')
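I want to remove from M4M3 every row whose gene ID also appears in M4F4. It doesn't matter that I used RSQLite to filter the dataset; a pure R solution would be fine. I'm just not sure how to compare the two tables and remove rows from one based on the other.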
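To make the goal concrete, here is a minimal sketch of the kind of result I'm after, using base R's `%in%` operator. The two small data frames are made-up stand-ins for the real query results (only `gene_id` and `value_1` are kept), and `M4M3_only` is just a name chosen for illustration.

```r
# Toy stand-ins for the two query results (hypothetical subset of the real columns)
M4M3 <- data.frame(
  gene_id = c("XLOC_000219", "XLOC_004272", "XLOC_000664"),
  value_1 = c(3.85465, 2.06687, 1.46853),
  stringsAsFactors = FALSE
)
M4F4 <- data.frame(
  gene_id = c("XLOC_000664", "XLOC_001809"),
  value_1 = c(1.46853, 0),
  stringsAsFactors = FALSE
)

# Keep only the rows of M4M3 whose gene_id does NOT occur in M4F4
M4M3_only <- M4M3[!(M4M3$gene_id %in% M4F4$gene_id), ]
print(M4M3_only)
```

In this toy case `XLOC_000664` occurs in both tables, so it is the row that should be dropped from M4M3. Presumably the same filter could also be expressed on the SQL side with a `NOT IN` subquery, but I have not tried that.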
Thanks for any suggestions!