I am having an issue with nesting and mapping that I am not sure how to get around. I have a tibble with nested dataframes, as follows:

> x
# A tibble: 18 × 3
   event.no               data dr.dur
      <dbl>             <list>  <int>
1         1   <tibble [7 × 4]>      7
2         4 <tibble [123 × 4]>    123
3         5   <tibble [9 × 4]>      9
4         7  <tibble [14 × 4]>     14
5        10  <tibble [19 × 4]>     19
6        11 <tibble [220 × 4]>    220
7        12 <tibble [253 × 4]>    253
8        14 <tibble [153 × 4]>    153
9        15  <tibble [28 × 4]>     28
10       17 <tibble [169 × 4]>    169
11       18   <tibble [7 × 4]>      7
12       19 <tibble [115 × 4]>    115
13       21 <tibble [109 × 4]>    109
14       25  <tibble [13 × 4]>     13
15       26 <tibble [249 × 4]>    249
16       28   <tibble [7 × 4]>      7
17       30  <tibble [26 × 4]>     26
18       31  <tibble [12 × 4]>     12
>
> x$data[[1]]
# A tibble: 7 × 4
  discharge threshold def.increase event.orig
      <dbl>     <dbl>        <dbl>      <dbl>
1     0.348     0.373       2160.0          1
2     0.348     0.373       2160.0          1
3     0.379     0.373       -518.4          0
4     0.379     0.373       -518.4          0
5     0.379     0.373       -518.4          0
6     0.379     0.373       -518.4          0
7     0.348     0.373       2160.0          2
> 

I need to find the sum of the def.increase column in each of the nested dataframes. I'm not sure of the best way to do this. This is what I've been trying:

> x %>%
+   mutate(dr.def = map(data, colSums)) %>%
+   unnest(dr.def)
# A tibble: 72 × 3
   event.no dr.dur    dr.def
      <dbl>  <int>     <dbl>
1         1      7     2.560
2         1      7     2.611
3         1      7  4406.400
4         1      7     4.000
5         4    123    45.739
6         4    123    45.879
7         4    123 12096.000
8         4    123   530.000
9         5      9     3.269
10        5      9     3.357
# ... with 62 more rows

Obviously the issue with this is that I end up with the sum from every column. This would be okay but it gets quite messy afterwards to select only the rows that I want. Is there a better way of finding the column sum for each of my def.increase columns? Thanks for your help :)

Edit: Not sure if I can copy/paste an object like my x so here is a link to the rds on wetransfer (if that's allowed): https://wetransfer.com/downloads/9697fff593f51c02136bc704adccbcc220170112161115/5be1fc

1 Answer

You just need to select the def.increase column first:

library(tidyverse)

x %>% 
  mutate(dr.def = map(data, "def.increase") %>% map_dbl(sum))

Or with just a single map call:

x %>% 
  mutate(dr.def = map_dbl(data, ~ sum(.x[["def.increase"]])))
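
Since the original x is only available via the download link, here is a minimal self-contained sketch of the same idea. The toy tibble and its values are made up for illustration and only mimic the shape of the real data (one nested tibble per event.no, each with a def.increase column):

```r
library(tidyverse)

# Hypothetical toy data standing in for the real `x` (values are invented)
toy <- tibble(
  event.no     = c(1, 1, 1, 4, 4),
  discharge    = c(0.348, 0.379, 0.348, 0.300, 0.310),
  def.increase = c(2160, -518.4, 2160, 100, 200)
) %>%
  group_by(event.no) %>%
  nest() %>%      # creates a list-column named `data`
  ungroup()

# Pull def.increase out of each nested tibble and sum it,
# returning one double per row instead of a list
result <- toy %>%
  mutate(dr.def = map_dbl(data, ~ sum(.x[["def.increase"]])))

result$dr.def
```

Because map_dbl returns a plain numeric vector, dr.def is an ordinary column and no unnest() step is needed afterwards.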
answered 2017-01-12T16:19:10.007