
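For last week's TidyTuesday challenge, I wanted to make a streamgraph showing the mean of the average rating for the top 5 board game categories between 1990 and 2022. To get there, I did some data wrangling, shown below: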

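library(tidyverse)    # readr, dplyr, tidyr, stringr
library(streamgraph)

# Load the TidyTuesday board game data (2022-01-25)
ratings <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-01-25/ratings.csv')
details <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-01-25/details.csv')

# Attach the game details to each ratings row
board_games <-
  ratings %>%
  left_join(details, by = "id")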

# Drop the first two and last two characters of the category string, then remove the remaining quotes
board_games$boardgamecategory <- substring(board_games$boardgamecategory, 3, nchar(board_games$boardgamecategory) - 2)
board_games$boardgamecategory <- str_replace_all(board_games$boardgamecategory, c("'" = ""))

# Split the comma-separated categories into up to 14 columns
splitted_data <- separate(board_games, col = boardgamecategory,
                          into = c("categories1", "categories2", "categories3",
                                   "categories4", "categories5", "categories6",
                                   "categories7", "categories8", "categories9",
                                   "categories10", "categories11", "categories12",
                                   "categories13", "categories14"), sep = ",")

# Count how often each category value occurs, most frequent first
top_categories <- splitted_data %>%
  pivot_longer(cols = categories1:categories14, names_to = "topcategories", values_to = "categoriestype", values_drop_na = TRUE) %>%
  select(-c(topcategories)) %>%
  group_by(categoriestype) %>%
  summarise(count = n()) %>%
  arrange(desc(count))

top_categories_data <- splitted_data %>%
  pivot_longer(cols = categories1:categories14, names_to = "topcategories", values_to = "categoriestype", values_drop_na = TRUE) %>%
  select(-c(topcategories)) %>%
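  # Splitting on "," leaves a leading space on every category after the first, hence the mixed spellings
  filter(categoriestype %in% c("Card Game", " Wargame", " Fantasy", " Party Game", "Abstract Strategy")) %>%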
  select(categoriestype, average, yearpublished) %>%
  group_by(yearpublished, categoriestype) %>%
  mutate(mean_average = mean(average)) %>%
  select(-c(average)) %>%
  distinct(categoriestype, .keep_all = TRUE) %>%
  as.data.frame() %>%
  filter(yearpublished > 1989) %>%
  arrange(desc(yearpublished), categoriestype)

# Trim the leftover leading spaces and round the means to two decimals
top_categories_data$categoriestype <- trimws(top_categories_data$categoriestype)
top_categories_data$mean_average <- round(top_categories_data$mean_average, 2)

After the cleaning above, the final shape of my data looks like this:

      categoriestype yearpublished mean_average
1            Fantasy          2022         7.81
2         Party Game          2022         7.86
3            Wargame          2022         8.27
4  Abstract Strategy          2022         8.12
5          Card Game          2022         7.81
6            Fantasy          2021         7.66
7         Party Game          2021         7.03
8            Wargame          2021         8.13
9  Abstract Strategy          2021         7.00
10         Card Game          2021         7.27
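
To make the shape easier to check, here is a minimal base-R sketch that rebuilds just the ten rows printed above and verifies that each category/year pair appears exactly once. The names `sample_rows` and `pair_counts` are made up for this illustration, and the check only covers those ten rows, not the full `top_categories_data`.

```r
# First ten rows of top_categories_data, copied from the printed output
# (sample_rows is a hypothetical name used only for this check)
sample_rows <- data.frame(
  categoriestype = rep(c("Fantasy", "Party Game", "Wargame",
                         "Abstract Strategy", "Card Game"), times = 2),
  yearpublished  = rep(c(2022, 2021), each = 5),
  mean_average   = c(7.81, 7.86, 8.27, 8.12, 7.81,
                     7.66, 7.03, 8.13, 7.00, 7.27)
)

# Cross-tabulate category against year; every cell should be exactly 1
pair_counts <- table(sample_rows$categoriestype, sample_rows$yearpublished)
stopifnot(all(pair_counts == 1))

# The year column is a plain number here, not a Date
class(sample_rows$yearpublished)
```

On these rows the check passes, so at least for 2021 and 2022 there is one value per category per year.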

Now, when I try to plot the streamgraph with the following code:

pp <- streamgraph(top_categories_data, key="categoriestype", value="mean_average", date="yearpublished", 
                  height="300px", width="1000px")

the plot comes out looking nonsensical, as shown below:

[Image: the resulting streamgraph]

I can't work out where the problem is or why the chart is drawn in this shape. I would appreciate any help.
