I'm having trouble computing row sums with dplyr.
After grouping and reshaping the data with
data <- data %>%
  group_by(location, category) %>%
  summarise(amount = sum(amount)) %>%
  spread(key = "category", value = "amount", fill = 0)
the output is:
# A tibble: 4,211 x 140
# Groups:   location [4,211]
   location          art books  cars
 * <chr>           <dbl> <dbl> <dbl>
 1 New York, NY        0    10     0
 2 Los Angeles, CA    12     0     2
...
Trying to add the row sums with rowSums then fails:
data %>% mutate(sum=rowSums(.))
Error in mutate_impl(.data, dots) :
Evaluation error: 'x' must be numeric.
> class(data)
[1] "grouped_df" "tbl_df" "tbl" "data.frame"
I tried changing the pivot step as shown below, but that did not help either:
data <- data %>%
  group_by(location, category) %>%
  summarise(amount = as.numeric(sum(amount))) %>% # Changed
  spread(key = "category", value = "amount", fill = 0)
str(data.frame(data))
'data.frame': 4211 obs. of 140 variables:
$ location : chr "New York, NY" "Los Angeles, CA" ... ...
$ art : num 0 0 0 0 0 0 0 0 0 0 ...
$ books : num 0 0 0 0 0 0 0 0 0 0 ...
$ cars : num 0 0 0 0 0 0 0 0 0 0 ...
...
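Any help here would be great.
After computing the sum of each row, I need to filter for the locations whose row sum is < 1000. Any idea how to do this, and whether dplyr
is the right approach for this in general, would also be great.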
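For reference, here is a minimal sketch of the kind of pipeline I have in mind, run on made-up toy data rather than my real table. It rests on an unconfirmed assumption: that the error comes from rowSums(.) also receiving the character location column, so the sketch ungroups first and sums only the remaining columns. The names toy, wide, total and result, and all the values, are invented for the illustration.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the real data (values are made up for illustration)
toy <- data.frame(
  location = c("New York, NY", "New York, NY", "Los Angeles, CA", "Los Angeles, CA"),
  category = c("books", "art", "art", "cars"),
  amount   = c(10, 5, 1200, 2),
  stringsAsFactors = FALSE
)

# Same group/summarise/spread steps as above, plus ungroup() so that
# location is an ordinary column and can be dropped by select() later
wide <- toy %>%
  group_by(location, category) %>%
  summarise(amount = sum(amount)) %>%
  ungroup() %>%
  spread(key = "category", value = "amount", fill = 0)

# Assumption: rowSums() fails because of the character column, so sum
# only the category columns, then keep the rows with a total below 1000
result <- wide %>%
  mutate(total = rowSums(select(., -location))) %>%
  filter(total < 1000)

print(result)
```

If that assumption holds, only the New York row should survive the filter in this toy case, since the Los Angeles amounts add up to more than 1000.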