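For LDA topic modeling, I would like to add a scatter plot that maps out the relationships between topics. I know that LDAvis can do this, but if possible I would like to use an alternative, because my actual dataset may contain many topics, which makes LDAvis impractical.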
On the following site, https://towardsdatascience.com/visualizing-topic-models-with-scatterpies-and-t-sne-f21f228f7b02, I found some code and tried to adapt it, but I could not get it to work. I need a scatter plot or something similar for my analysis.
This is my adapted code:
top_terms %>%
  group_by(topic) %>%
  ggplot() +
  geom_scatterpie(aes(top_terms, beta, fill = factor(topic)), color = NA, alpha = 0.7) +
  coord_equal() +
  geom_label() +
  ggtitle("Scatterpie_Graph") +
  xlab("") + ylab("") +
  labs(subtitle = "t-SNE_Representation_of_Guided_LDA_Topics_Colored_and_Sized_by_Topic_Probability") +
  scale_fill_manual(values = colors) +
  theme_minimal() +
  theme(text = element_text(color = "white"),
        legend.position = "none",
        panel.background = element_rect(fill = "gray17", colour = "gray17"),
        plot.background = element_rect(fill = "gray17"),
        panel.grid.major = element_line(colour = "gray25"),
        panel.grid.minor = element_line(colour = "gray25"),
        axis.text = element_text(color = "white"))
Dummy dataset of top terms:
top_terms_struct <- structure(
  list(
    topic = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3,
              4, 4, 4, 4),
    term = c("book", "page", "chapter", "section", "sports", "soccer", "champions", "league",
             "music", "song", "dj", "release", "movie", "cinema", "actress", "story"),
    beta = c(0.9876, 0.9765, 0.9654, 0.9543, 0.8765, 0.8654, 0.8543, 0.8432, 0.9543, 0.8678,
             0.7231, 0.6382, 0.9846, 0.9647, 0.8878, 0.6523)),
  row.names = c(NA, -16L),
  class = c("tbl_df", "tbl", "data.frame"))
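The desired output is a scatter plot or something similar that maps the topics much as LDAvis does, preferably using a different technique. That said, I am open to alternatives.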
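To make the target more concrete, here is a minimal sketch of the kind of topic map I have in mind. It is not the scatterpie/t-SNE approach from the linked article: as an assumption on my part, it swaps in classical multidimensional scaling (base R `cmdscale`) on a topic-by-term matrix of beta values, and it sizes each point by the summed beta of its top terms as a rough stand-in for topic prominence. The names `topic_term`, `coords`, and `topic_map` are my own illustrative choices, and I am not sure this is the right way to do it.

```r
library(ggplot2)

# Dummy top-terms table (same values as the dataset above).
top_terms <- data.frame(
  topic = rep(1:4, each = 4),
  term = c("book", "page", "chapter", "section", "sports", "soccer", "champions", "league",
           "music", "song", "dj", "release", "movie", "cinema", "actress", "story"),
  beta = c(0.9876, 0.9765, 0.9654, 0.9543, 0.8765, 0.8654, 0.8543, 0.8432, 0.9543, 0.8678,
           0.7231, 0.6382, 0.9846, 0.9647, 0.8878, 0.6523)
)

# Topic-by-term matrix of beta values; terms absent from a topic get 0.
topic_term <- unclass(xtabs(beta ~ topic + term, data = top_terms))

# Euclidean distances between topics, projected to 2D with classical MDS
# (assumption: MDS is used here in place of t-SNE).
coords <- cmdscale(dist(topic_term), k = 2)

# One row per topic: 2D position plus summed beta as an illustrative weight.
topic_map <- data.frame(
  topic  = factor(rownames(coords)),
  x      = coords[, 1],
  y      = coords[, 2],
  weight = as.numeric(rowSums(topic_term))
)

# Scatter plot of topics, colored by topic and sized by the weight.
p <- ggplot(topic_map, aes(x, y)) +
  geom_point(aes(colour = topic, size = weight), alpha = 0.7) +
  geom_text(aes(label = topic), colour = "white", size = 4) +
  coord_equal() +
  theme_minimal()
print(p)
```

With the dummy data every topic has its own distinct terms, so the topics end up roughly equally far apart. On a real model, where topics share terms, I would expect related topics to land closer together.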
Thanks in advance.