How do I select the rows of a data frame that have a value over 1 in a specific column?

This is what my data looks like:

> dput(head(tbl_comp[,-1]))
structure(list(Meve_mean = c(7774.44229552491, 43374.1166119026, 
585562.72426545, 3866.54724117546, 320338.197537275, 918368.01990607
), Mmor_mean = c(39113.5325249635, 119476.157216344, 1296530.34384725, 
23511.2980313616, 209092.538981888, 577355.581852083), Mtot_mean = c(23443.9874102442, 
81425.1369141232, 941046.53405635, 13688.9226362685, 264715.368259581, 
747861.800879077), tot_meanMe = c(258492586.999527, NA, NA, NA, 
NA, NA), tot_meanMm = c(246665241.110832, NA, NA, NA, NA, NA), 
    tot_sdMe = c(35569170.0311164, NA, NA, NA, NA, NA), tot_sdMm = c(30522099.9189256, 
    NA, NA, NA, NA, NA), Wteve_mean = c(10752.4381084666, 53658.8435672746, 
    715547.921685567, 3422.17220367207, 335384.199178456, 1013708.18845339
    ), Wtmor_mean = c(29254.6414790837, 98804.8007431987, 1001344.20496027, 
    11541.8862121394, 217110.411645861, 571826.157099177), Wttot_mean = c(18681.9538387311, 
    73007.110928385, 838032.04308901, 6902.04963587237, 284695.433093058, 
    824330.175015869), tot_meanwte = c(278901499.672313, NA, 
    NA, NA, NA, NA), tot_meanwtm = c(235415566.775308, NA, NA, 
    NA, NA, NA), tot_sdwte = c(16743477.4011497, NA, NA, NA, 
    NA, NA), tot_sdwtm = c(3922418.43271348, NA, NA, NA, NA, 
    NA), diff_eve = c(0.72303994843767, 0.808331185101342, 0.818341730189196, 
    1.12985174650959, 0.955138012828161, 0.905949098928778), 
    diff_mor = c(1.33700262752933, 1.20921408998001, 1.29478988086689, 
    2.03704122525771, 0.963070068343606, 1.00966976533735), diff_tot = c(1.25490018938172, 
    1.11530419268331, 1.12292428650774, 1.98331269093204, 0.929819510568173, 
    0.907235745512628)), .Names = c("Meve_mean", "Mmor_mean", 
"Mtot_mean", "tot_meanMe", "tot_meanMm", "tot_sdMe", "tot_sdMm", 
"Wteve_mean", "Wtmor_mean", "Wttot_mean", "tot_meanwte", "tot_meanwtm", 
"tot_sdwte", "tot_sdwtm", "diff_eve", "diff_mor", "diff_tot"), row.names = c(NA, 
6L), class = "data.frame")

In this data there are three columns I am mostly interested in:

diff_eve  diff_mor  diff_tot
0.7230399 1.3370026 1.2549002
0.8083312 1.2092141 1.1153042
0.8183417 1.2947899 1.1229243
1.1298517 2.0370412 1.9833127
0.9551380 0.9630701 0.9298195
0.9059491 1.0096698 0.9072357

I would like to create new data frames selected by the value in each of those columns, 6 new data frames in total. Rows with values below 1 in the "diff_eve" column should go into one new data frame and rows with values above 1 into the next; the same goes for "diff_mor" and "diff_tot". Of course I would like to keep all of the columns from the data (tbl_comp).

Let me show an example of one of the new data frames, selected by values below 1 in the column diff_eve:

>newdata.frame
diff_eve  diff_mor  diff_tot  .. ....... .....  rest of the columns from tbl_comp.
0.7230399 1.3370026 1.2549002
0.8083312 1.2092141 1.1153042 
0.8183417 1.2947899 1.1229243
0.9551380 0.9630701 0.9298195
0.9059491 1.0096698 0.9072357

I hope some of you understand what I want to achieve.

2 Answers

Here is a solution that gives you six data frames, in the following order:

diff_eve <1
diff_mor <1
diff_tot <1
diff_eve >1
diff_mor >1
diff_tot >1

Code:

 cols <- c("diff_eve", "diff_mor", "diff_tot")
 c(lapply(cols, function(x) subset(tbl_comp, eval(parse(text = x)) < 1)),
   lapply(cols, function(x) subset(tbl_comp, eval(parse(text = x)) > 1)))

This gives you:

[[1]]
   Meve_mean  Mmor_mean Mtot_mean tot_meanMe tot_meanMm tot_sdMe tot_sdMm Wteve_mean Wtmor_mean Wttot_mean tot_meanwte tot_meanwtm tot_sdwte tot_sdwtm  diff_eve  diff_mor  diff_tot
1   7774.442   39113.53  23443.99  258492587  246665241 35569170 30522100   10752.44   29254.64   18681.95   278901500   235415567  16743477   3922418 0.7230399 1.3370026 1.2549002
2  43374.117  119476.16  81425.14         NA         NA       NA       NA   53658.84   98804.80   73007.11          NA          NA        NA        NA 0.8083312 1.2092141 1.1153042
3 585562.724 1296530.34 941046.53         NA         NA       NA       NA  715547.92 1001344.20  838032.04          NA          NA        NA        NA 0.8183417 1.2947899 1.1229243
5 320338.198  209092.54 264715.37         NA         NA       NA       NA  335384.20  217110.41  284695.43          NA          NA        NA        NA 0.9551380 0.9630701 0.9298195
6 918368.020  577355.58 747861.80         NA         NA       NA       NA 1013708.19  571826.16  824330.18          NA          NA        NA        NA 0.9059491 1.0096698 0.9072357

[[2]]
  Meve_mean Mmor_mean Mtot_mean tot_meanMe tot_meanMm tot_sdMe tot_sdMm Wteve_mean Wtmor_mean Wttot_mean tot_meanwte tot_meanwtm tot_sdwte tot_sdwtm diff_eve  diff_mor  diff_tot
5  320338.2  209092.5  264715.4         NA         NA       NA       NA   335384.2   217110.4   284695.4          NA          NA        NA        NA 0.955138 0.9630701 0.9298195

[[3]]
  Meve_mean Mmor_mean Mtot_mean tot_meanMe tot_meanMm tot_sdMe tot_sdMm Wteve_mean Wtmor_mean Wttot_mean tot_meanwte tot_meanwtm tot_sdwte tot_sdwtm  diff_eve  diff_mor  diff_tot
5  320338.2  209092.5  264715.4         NA         NA       NA       NA   335384.2   217110.4   284695.4          NA          NA        NA        NA 0.9551380 0.9630701 0.9298195
6  918368.0  577355.6  747861.8         NA         NA       NA       NA  1013708.2   571826.2   824330.2          NA          NA        NA        NA 0.9059491 1.0096698 0.9072357

[[4]]

  Meve_mean Mmor_mean Mtot_mean tot_meanMe tot_meanMm tot_sdMe tot_sdMm Wteve_mean Wtmor_mean Wttot_mean tot_meanwte tot_meanwtm tot_sdwte tot_sdwtm diff_eve diff_mor diff_tot
4  3866.547   23511.3  13688.92         NA         NA       NA       NA   3422.172   11541.89    6902.05          NA          NA        NA        NA 1.129852 2.037041 1.983313

[[5]]
   Meve_mean  Mmor_mean Mtot_mean tot_meanMe tot_meanMm tot_sdMe tot_sdMm  Wteve_mean Wtmor_mean Wttot_mean tot_meanwte tot_meanwtm tot_sdwte tot_sdwtm  diff_eve diff_mor  diff_tot
1   7774.442   39113.53  23443.99  258492587  246665241 35569170 30522100   10752.438   29254.64   18681.95   278901500   235415567  16743477   3922418 0.7230399 1.337003 1.2549002
2  43374.117  119476.16  81425.14         NA         NA       NA       NA   53658.844   98804.80   73007.11          NA          NA        NA        NA 0.8083312 1.209214 1.1153042
3 585562.724 1296530.34 941046.53         NA         NA       NA       NA  715547.922 1001344.20  838032.04          NA          NA        NA        NA 0.8183417 1.294790 1.1229243
4   3866.547   23511.30  13688.92         NA         NA       NA       NA    3422.172   11541.89    6902.05          NA          NA        NA        NA 1.1298517 2.037041 1.9833127
6 918368.020  577355.58 747861.80         NA         NA       NA       NA 1013708.188  571826.16  824330.18          NA          NA        NA        NA 0.9059491 1.009670 0.9072357

[[6]]
   Meve_mean  Mmor_mean Mtot_mean tot_meanMe tot_meanMm tot_sdMe tot_sdMm Wteve_mean Wtmor_mean Wttot_mean tot_meanwte tot_meanwtm tot_sdwte tot_sdwtm  diff_eve diff_mor diff_tot
1   7774.442   39113.53  23443.99  258492587  246665241 35569170 30522100  10752.438   29254.64   18681.95   278901500   235415567  16743477   3922418 0.7230399 1.337003 1.254900
2  43374.117  119476.16  81425.14         NA         NA       NA       NA  53658.844   98804.80   73007.11          NA          NA        NA        NA 0.8083312 1.209214 1.115304
3 585562.724 1296530.34 941046.53         NA         NA       NA       NA 715547.922 1001344.20  838032.04          NA          NA        NA        NA 0.8183417 1.294790 1.122924
4   3866.547   23511.30  13688.92         NA         NA       NA       NA   3422.172   11541.89    6902.05          NA          NA        NA        NA 1.1298517 2.037041 1.983313

If you want the selection-criterion columns at the front of the data frame, as in your example, you can use:

c(lapply(cols, function(x) subset(tbl_comp[, c(cols, setdiff(colnames(tbl_comp), cols))],
                                  eval(parse(text = x)) < 1)),
  lapply(cols, function(x) subset(tbl_comp[, c(cols, setdiff(colnames(tbl_comp), cols))],
                                  eval(parse(text = x)) > 1)))
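
The snippets in this answer rely on eval(parse(text = x)) to turn a column name into a column. A possible variant, shown here only as a sketch, indexes the column with [[ ]] instead and names the list elements so each data frame can be picked out by name. The tbl_comp built below is a reduced stand-in (the three diff_* columns as printed in the question plus a hypothetical id column), and the "_below" / "_above" names are my own choice, not something from the question.

```r
# Reduced stand-in for tbl_comp: the three diff_* columns as printed in the
# question, plus a hypothetical id column to show that other columns are kept.
tbl_comp <- data.frame(
  id       = 1:6,
  diff_eve = c(0.7230399, 0.8083312, 0.8183417, 1.1298517, 0.9551380, 0.9059491),
  diff_mor = c(1.3370026, 1.2092141, 1.2947899, 2.0370412, 0.9630701, 1.0096698),
  diff_tot = c(1.2549002, 1.1153042, 1.1229243, 1.9833127, 0.9298195, 0.9072357)
)

cols <- c("diff_eve", "diff_mor", "diff_tot")

# tbl_comp[[x]] fetches the column whose name is stored in x.
# which() skips NA comparisons, so rows with NA are dropped as subset() would.
below <- lapply(cols, function(x) tbl_comp[which(tbl_comp[[x]] < 1), , drop = FALSE])
above <- lapply(cols, function(x) tbl_comp[which(tbl_comp[[x]] > 1), , drop = FALSE])

# One named list holding all six data frames, same order as in the answer.
result <- c(
  setNames(below, paste0(cols, "_below")),
  setNames(above, paste0(cols, "_above"))
)

names(result)
result$diff_mor_below
```

With the names in place, result$diff_eve_above or result[["diff_tot_below"]] returns a single data frame without having to remember its position in the list.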
answered 2013-11-07T15:18:39.237

I don't know whether I understood your question correctly, but here is a possible solution:

diff_eve_below <- subset(tbl_comp, diff_eve < 1)
diff_eve_above <- subset(tbl_comp, diff_eve > 1)
diff_mor_below <- subset(tbl_comp, diff_mor < 1)
diff_mor_above <- subset(tbl_comp, diff_mor > 1)
diff_tot_below <- subset(tbl_comp, diff_tot < 1)
diff_tot_above <- subset(tbl_comp, diff_tot > 1)
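
If two data frames per column are enough, split() may be a more compact way to get each pair in one call. This is only a sketch under assumptions: tbl_comp is again a reduced stand-in with a hypothetical id column, and the labels "not_above" and "above" are made up for illustration. Note one difference from the subset() calls above: a row whose value is exactly 1 would land in "not_above" here, whereas subset() with < 1 and > 1 puts it in neither group. Rows with NA in the tested column are dropped by split().

```r
# Reduced stand-in for tbl_comp (hypothetical id column plus the diff_* values
# as printed in the question).
tbl_comp <- data.frame(
  id       = 1:6,
  diff_eve = c(0.7230399, 0.8083312, 0.8183417, 1.1298517, 0.9551380, 0.9059491),
  diff_mor = c(1.3370026, 1.2092141, 1.2947899, 2.0370412, 0.9630701, 1.0096698),
  diff_tot = c(1.2549002, 1.1153042, 1.1229243, 1.9833127, 0.9298195, 0.9072357)
)

# Label each row by whether diff_eve is above 1. Fixing the levels keeps both
# groups in the result even when one of them turns out to be empty.
eve_group <- factor(tbl_comp$diff_eve > 1,
                    levels = c(FALSE, TRUE),
                    labels = c("not_above", "above"))

# split() returns a named list with one data frame per group.
eve_groups <- split(tbl_comp, eve_group)

eve_groups$above
eve_groups$not_above
```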
answered 2013-11-07T14:46:44.167