I am looking for functionality similar to https://rdrr.io/github/bfgray3/cattonum/man/catto_freq.html
but implemented as a recipes::step_ function ( https://tidymodels.github.io/recipes/reference/index.html ).
Does anyone know of an implementation? :)
You can use step_count from the recipes package.
Here is an example using the Titanic dataset and the Embarked variable:
> library(tidymodels)
> titanic <-
    readr::read_csv(
      "https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv",
      col_types = readr::cols()
    )
> juiced <- recipe(Survived ~ ., data = titanic) %>%
    step_count(Embarked) %>%
    prep() %>% juice() %>% glimpse()
Rows: 891
Columns: 13
$ PassengerId <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, …
$ Pclass <dbl> 3, 1, 3, 1, 3, 3, 1, 3, 3, 2, 3, 1, 3, 3, 3, 2, 3, 2, 3, 3, 2, 2, 3, 1, 3, 3, 3…
$ Name <fct> "Braund, Mr. Owen Harris", "Cumings, Mrs. John Bradley (Florence Briggs Thayer)"…
$ Sex <fct> male, female, female, female, male, male, male, male, female, female, female, f…
$ Age <dbl> 22, 38, 26, 35, 35, NA, 54, 2, 27, 14, 4, 58, 20, 39, 14, 55, 2, NA, 31, NA, 35…
$ SibSp <dbl> 1, 1, 0, 1, 0, 0, 0, 3, 0, 1, 1, 0, 0, 1, 0, 0, 4, 0, 1, 0, 0, 0, 0, 0, 3, 1, 0…
$ Parch <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 2, 0, 1, 0, 0, 5, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 5, 0…
$ Ticket <fct> A/5 21171, PC 17599, STON/O2. 3101282, 113803, 373450, 330877, 17463, 349909, 3…
$ Fare <dbl> 7.2500, 71.2833, 7.9250, 53.1000, 8.0500, 8.4583, 51.8625, 21.0750, 11.1333, 30…
$ Cabin <fct> NA, C85, NA, C123, NA, NA, E46, NA, NA, NA, G6, C103, NA, NA, NA, NA, NA, NA, N…
$ Embarked <fct> S, C, S, S, S, Q, S, S, S, C, S, S, S, S, S, S, Q, S, S, C, S, S, Q, S, S, S, C…
$ Survived <dbl> 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0…
> juiced %>% count(Embarked)
# A tibble: 4 × 2
  Embarked     n
  <fct>    <int>
1 C          168
2 Q           77
3 S          644
4 NA           2
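
Note that in the output above Embarked is still a factor; the per-level frequencies only appear after calling count(). As far as I know, step_count counts regular-expression matches inside each string rather than how often each level occurs, so it may not be a drop-in replacement for catto_freq. If what you need is each level replaced by its frequency, here is a minimal sketch using plain dplyr instead of a recipe step. The toy data, the `Embarked_freq` column name, and the `encode_freq` helper are all made up for illustration and are not part of recipes or cattonum:

```r
library(dplyr)

# Toy data standing in for the Titanic `Embarked` column (illustrative values only).
train <- tibble(Embarked = c("S", "C", "S", "S", "Q", NA, "C", "S"))

# Learn the frequency table from the training data only.
# `Embarked_freq` is a hypothetical name for the new column.
freq_table <- train %>% count(Embarked, name = "Embarked_freq")

# Hypothetical helper: attach the learned frequencies to any data frame
# that has an `Embarked` column. Row order of `data` is preserved.
encode_freq <- function(data, freq_table) {
  left_join(data, freq_table, by = "Embarked")
}

encoded <- encode_freq(train, freq_table)
print(encoded)
```

Levels that were never seen in the training data get NA in `Embarked_freq` when you apply `encode_freq` to new data, so you would have to decide how to fill those. Dividing `Embarked_freq` by `nrow(train)` gives proportions instead of raw counts.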