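Update: this is now fixed after following the comments.
I am following the tutorial given here - https://www.tidytextmining.com/ngrams.html.
What I am trying to do is create a bigram network graph from the comment text in a CSV file.
Here is the link to the dataset - https://app.box.com/s/y6nsmji4ir7nbggmbncnhf21xla96nml.
Here is the code: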
library(dplyr)
library(tidyr)
library(tidytext)
library(ggplot2)
library(igraph)
library(ggraph)
library(stringr)
kjv <- read.csv(file.choose())
count_bigrams <- function(dataset) {
  dataset %>%
    unnest_tokens(bigram, commentText, token = "ngrams", n = 2) %>%
    separate(bigram, c("word1", "word2"), sep = " ") %>%
    filter(!word1 %in% stop_words$word,
           !word2 %in% stop_words$word) %>%
    count(word1, word2, sort = TRUE)
}
visualize_bigrams <- function(bigrams) {
  set.seed(2016)
  a <- grid::arrow(type = "closed", length = unit(.15, "inches"))
  bigrams %>%
    graph_from_data_frame() %>%
    ggraph(layout = "fr") +
    geom_edge_link(aes(edge_alpha = n), show.legend = FALSE, arrow = a) +
    geom_node_point(color = "lightblue", size = 5) +
    geom_node_text(aes(label = name), vjust = 1, hjust = 1) +
    theme_void()
}
kjv_bigrams <- kjv %>%
  count_bigrams()
At this point, I get the following error:
Error in summarise_impl(.data, dots) :
Evaluation error: argument
...should be a character vector (or an object coercible to).
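For anyone hitting the same message: one likely cause (an assumption on my part, since it depends on the R version and on how the file was read) is that read.csv() imported commentText as a factor rather than a character vector. Before R 4.0.0 the default was stringsAsFactors = TRUE. The sketch below uses a small made-up CSV string, not the real dataset, to show how to check the column type and two ways to get a character column.

```r
# Hypothetical two-row CSV standing in for the real file (illustration only).
csv_text <- "commentText\nthe quick brown fox\njumps over the lazy dog"

# Reading with stringsAsFactors = TRUE mimics the pre-4.0 default of read.csv().
as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# Reading with stringsAsFactors = FALSE keeps the text as character.
as_char <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Check what the tokenizer would actually receive.
print(class(as_factor$commentText))
print(class(as_char$commentText))

# If the data frame is already loaded, coerce the column in place instead.
fixed <- as_factor
fixed$commentText <- as.character(fixed$commentText)
print(class(fixed$commentText))
```

Under that assumption, the equivalent change in the code above would be kjv <- read.csv(file.choose(), stringsAsFactors = FALSE), or kjv$commentText <- as.character(kjv$commentText) before calling count_bigrams().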
This is what the dataset looks like:
Thanks for reading!