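Many thanks in advance. I am new to R and am working on some charts.
I have three charts: a plot in random order, a plot ordered by frequency, and a plot with a Pareto overlay.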
```{r}
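library(dplyr)

# Drop rows without an end station, then count trips per station
df <- filter(df_clean_distances, end_station_name != "NA")
d <- df %>%
  select(end_station_name) %>%
  group_by(end_station_name) %>%
  summarize(freq = n())
head(d$freq)
dput(head(d))
# Sort stations by descending frequency
d2 <- d[order(-d$freq), ]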
d2
```
Plot in random order
```{r}
library(ggplot2)

ggplot(d2, aes(x = end_station_name, y = freq)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_blank()) +
  ylim(c(0, 40000))
```
Plot ordered by frequency
```{r}
ggplot(d2, aes(x = reorder(end_station_name, -freq), y = freq)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_blank()) +
  ylim(c(0, 40000)) +
  labs(title = "end station by freq", x = "Station Name")
```
Plot with Pareto overlay
```{r}
ggplot(d2, aes(x = reorder(end_station_name, -freq), y = freq)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_blank()) +
  ggQC::stat_pareto(point.color = "red", point.size = 0.5) +
  labs(title = "end station by freq", x = "Station Name")
```
Output of dput(head):
```{r}
> dput(head(d, n=20))
structure(list(end_station_name = c("2112 W Peterson Ave", "63rd St Beach",
"900 W Harrison St", "Aberdeen St & Jackson Blvd", "Aberdeen St & Monroe St",
"Aberdeen St & Randolph St", "Ada St & 113th St", "Ada St & Washington Blvd",
"Adler Planetarium", "Albany Ave & 26th St", "Albany Ave & Bloomingdale Ave",
"Albany Ave & Montrose Ave", "Archer (Damen) Ave & 37th St",
"Artesian Ave & Hubbard St", "Ashland Ave & 13th St", "Ashland Ave & 50th St",
"Ashland Ave & 63rd St", "Ashland Ave & 66th St", "Ashland Ave & 69th St",
"Ashland Ave & 73rd St"), freq = c(1032L, 2524L, 3836L, 8383L,
6587L, 6136L, 18L, 6281L, 12050L, 397L, 2833L, 1875L, 710L, 1879L,
2659L, 151L, 112L, 102L, 78L, 8L)), row.names = c(NA, -20L), class = c("tbl_df",
"tbl", "data.frame"))
```
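As you can see, the Pareto overlay works well against the right-hand scale, but the left-hand scale is badly out of kilter. The data has about 3 million rows, and the scaling of the y-axis has squashed the frequencies into a very small curve at the bottom of the plot that is hard to read against the left-hand axis.
How can I fix the left y-axis at around 40,000 so that the frequency curve shows up properly?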
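A rough illustration of the mismatch, using only the 20 sample rows from the `dput` output above. It assumes the overlay draws the cumulative total on the left axis; I have not confirmed how `ggQC::stat_pareto` scales its layers internally, so this is a guess at the cause rather than a diagnosis.

```r
# First 20 frequencies, copied from the dput() output above
freq <- c(1032L, 2524L, 3836L, 8383L, 6587L, 6136L, 18L, 6281L, 12050L, 397L,
          2833L, 1875L, 710L, 1879L, 2659L, 151L, 112L, 102L, 78L, 8L)

# Pareto order: largest first, then a running total
sorted_freq <- sort(freq, decreasing = TRUE)
cum_freq <- cumsum(sorted_freq)

max(sorted_freq)                  # height of the tallest bar
max(cum_freq)                     # top of the cumulative curve
max(cum_freq) / max(sorted_freq)  # how many times taller the curve is than any bar
```

Even on 20 stations the running total is several times the tallest bar. Over every station in a 3-million-row data set the ratio would be far larger, which would be consistent with the bars shrinking to a sliver.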
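One direction that might work, as a sketch only, is to build the overlay by hand instead of with `ggQC::stat_pareto`: rescale the cumulative share to the height of the tallest bar and label it with a secondary axis via `sec_axis()`. The names `d2_demo`, `pareto`, `station`, `cum_share` and `top` are made up for illustration, and `d2_demo` is a five-row stand-in for `d2` built from the sample values above.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for d2 (five stations taken from the sample output)
d2_demo <- data.frame(
  end_station_name = c("900 W Harrison St", "Adler Planetarium",
                       "Ada St & 113th St", "Aberdeen St & Jackson Blvd",
                       "Aberdeen St & Monroe St"),
  freq = c(3836L, 12050L, 18L, 8383L, 6587L)
)

pareto <- d2_demo %>%
  arrange(desc(freq)) %>%
  mutate(
    # Fix the x order to the sorted order
    station = factor(end_station_name, levels = end_station_name),
    # Cumulative share of all trips, from 0 to 1
    cum_share = cumsum(freq) / sum(freq)
  )

# Upper limit of the left axis: here the tallest bar
top <- max(pareto$freq)

p <- ggplot(pareto, aes(x = station)) +
  geom_col(aes(y = freq)) +
  # Cumulative share stretched so that 100% sits at the top of the left axis
  geom_line(aes(y = cum_share * top, group = 1), colour = "red") +
  geom_point(aes(y = cum_share * top), colour = "red", size = 0.5) +
  scale_y_continuous(
    name = "freq",
    limits = c(0, top),
    # Right axis converts left-axis units back to a percentage
    sec.axis = sec_axis(~ . / top * 100, name = "Cumulative %")
  ) +
  theme(axis.text.x = element_blank()) +
  labs(title = "end station by freq", x = "Station Name")
p
```

Setting `top <- 40000` instead would pin the left axis at 40,000, but any bar taller than that limit would be dropped by `limits`, so that value only suits data whose largest frequency is below it.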