
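For LDA topic modeling, I would like to add a scatterplot that maps out the relationships between topics. I know the LDAvis function can do this, but if possible I would like to use an alternative, because my actual dataset may contain many topics, which rules out using LDAvis.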

I found code at https://towardsdatascience.com/visualizing-topic-models-with-scatterpies-and-t-sne-f21f228f7b02 and tried to adapt it, but could not get it to work. I need a plot, either a scatterpie or something similar, for the analysis.

This is my adapted code:

library(dplyr)
library(ggplot2)
library(scatterpie)

top_terms %>%
    group_by(topic) %>%
    ggplot() + 
    geom_scatterpie(aes(top_terms, beta, fill = factor(topic)), color = NA, alpha = 0.7) + 
    coord_equal() + 
    geom_label() + 
    ggtitle("Scatterpie_Graph") + 
    xlab("") + ylab("") + labs(subtitle = "t-SNE_Representation_of_Guided_LDA_Topics_Colored_and_Sized_by_Topic_Probability") +
    scale_fill_manual(values = colors) + 
    theme_minimal() + 
    theme(text = element_text(color = "white"),
          legend.position = "none",
          panel.background = element_rect(fill = "gray17", colour = "gray17"), 
          plot.background = element_rect(fill = "gray17"),
          panel.grid.major = element_line(colour = "gray25"),
          panel.grid.minor = element_line(colour = "gray25"),
          axis.text = element_text(color = "white"))

Dummy top_terms dataset:

top_terms_struct <- structure(
  list(
    topic = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3,
              4, 4, 4, 4),
    term = c("book", "page", "chapter", "section", "sports", "soccer", "champions", "league",
      "music", "song", "dj", "release", "movie", "cinema", "actress", "story"),
    beta = c(0.9876, 0.9765, 0.9654, 0.9543,  0.8765, 0.8654, 0.8543, 0.8432, 0.9543, 0.8678,
      0.7231, 0.6382, 0.9846, 0.9647, 0.8878, 0.6523)),
  row.names = c(NA,-16L),
  class = c("tbl_df", "tbl", "data.frame")) 

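The desired output is a scatterpie or something similar that maps the topics much as LDAvis does, but preferably using a different technique. I am open to alternatives.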
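To make the target concrete, here is a minimal sketch of one possible kind of topic map. It is not the t-SNE scatterpie from the linked article: as an assumption, it swaps in Euclidean distances between the topics' beta vectors and classical multidimensional scaling (`cmdscale`), so only base R and ggplot2 are needed. The names `beta_mat`, `topic_dist`, `coords`, `topic_map`, and `weight` are made up for this illustration, and a real model would presumably use the full topic-term matrix rather than only the top terms.

```r
library(ggplot2)

# Dummy data in the same shape as the question's top_terms (topic, term, beta).
top_terms <- data.frame(
  topic = rep(1:4, each = 4),
  term = c("book", "page", "chapter", "section", "sports", "soccer", "champions", "league",
           "music", "song", "dj", "release", "movie", "cinema", "actress", "story"),
  beta = c(0.9876, 0.9765, 0.9654, 0.9543, 0.8765, 0.8654, 0.8543, 0.8432,
           0.9543, 0.8678, 0.7231, 0.6382, 0.9846, 0.9647, 0.8878, 0.6523)
)

# Topic-by-term matrix; a term that does not occur in a topic gets beta = 0.
beta_mat <- with(top_terms, tapply(beta, list(topic, term), sum))
beta_mat[is.na(beta_mat)] <- 0

# Assumption: Euclidean distance between topics, projected to 2D with classical MDS.
topic_dist <- dist(beta_mat)
coords <- cmdscale(topic_dist, k = 2)

# One row per topic: 2D position plus a size weight (sum of the topic's betas).
topic_map <- data.frame(
  topic = factor(rownames(coords)),
  x = coords[, 1],
  y = coords[, 2],
  weight = rowSums(beta_mat)
)

p <- ggplot(topic_map, aes(x, y)) +
  geom_point(aes(colour = topic, size = weight), alpha = 0.7) +
  geom_text(aes(label = topic), colour = "black") +
  coord_equal() +
  labs(title = "Topic map (classical MDS on beta)", x = NULL, y = NULL) +
  theme_minimal() +
  theme(legend.position = "none")

print(p)
```

Because each dummy term belongs to exactly one topic, the four topics end up roughly equidistant here; with overlapping vocabularies, related topics would be expected to sit closer together.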

Thanks in advance.
