The first data frame looks like this:

Sample,    chr,   start,    stop
549151236, chr20, 27100000, 28172124
549151236, chr10, 45479093, 47547144
549151236, chr11, 50366193, 51284544
549151236, chr11, 90945964, 91030487
549151236, chr5,  9954211,  9979517

The second data frame looks like this:

Sample,    event,   probe
549151236, CN gain, 3
549151236, CN gain, 20

It has more columns than this sample shows.

When I merge the two data frames with merge (there are 3850 rows in these two files):

testchop=merge(chop_result,newchop, by.x="Sample",by.y="Sample")

...it gives me about 315565 rows. How can I fix this?

1 Answer

In merge you have the options all.x and all.y, which might help...

# this gives you only the observations whose "Sample" value
# appears in both chop_result and newchop (intersection)
merge(chop_result, newchop, by="Sample")

# here all rows of chop_result are kept in the new data frame; if
# a "Sample" value has no match in newchop, the newchop
# columns are filled with NA
merge(chop_result, newchop, by="Sample", all.x=TRUE)

# same as above, but now all rows of newchop are kept
merge(chop_result, newchop, by="Sample", all.y=TRUE)
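
A small sketch with made-up data may make the row counts easier to see. The values below are hypothetical and only borrow the column names from the question. It also shows something the all.x / all.y options do not change: when a "Sample" value is repeated in both data frames, merge returns every pairing of the matching rows, which is probably why the result is so much larger than the inputs.

```r
# Toy data frames (hypothetical values, not the asker's files).
# "Sample" is repeated in both, as in the question.
chop_result <- data.frame(
  Sample = c(1, 1, 2, 3),
  chr    = c("chr20", "chr10", "chr11", "chr5")
)
newchop <- data.frame(
  Sample = c(1, 1, 1, 2, 4),
  probe  = c(3, 20, 7, 5, 9)
)

inner <- merge(chop_result, newchop, by = "Sample")
left  <- merge(chop_result, newchop, by = "Sample", all.x = TRUE)
right <- merge(chop_result, newchop, by = "Sample", all.y = TRUE)

nrow(inner)  # Sample 1 pairs 2 x 3 rows, Sample 2 pairs 1 x 1: 7 rows
nrow(left)   # the 7 rows above plus unmatched Sample 3: 8 rows
nrow(right)  # the 7 rows above plus unmatched Sample 4: 8 rows

# The inner-join size can be predicted from how often each key occurs.
nx <- table(chop_result$Sample)
ny <- table(newchop$Sample)
common <- intersect(names(nx), names(ny))
expected <- sum(as.numeric(nx[common]) * as.numeric(ny[common]))
expected     # 7, the same as nrow(inner)
```

If the rows should pair up one to one, "Sample" alone is likely not enough as a key; passing more columns to by (if the two data frames share any others), or removing duplicates first, would keep the result from growing like this.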

hth

answered 2013-08-08T08:17:07.950