I am trying to plot a network of the word distributions over topics (the relationships between topics), using this code [source]:

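library(topicmodels)
library(igraph)

# Posterior distributions of the fitted LDA model (terms per topic, topics per document)
post <- topicmodels::posterior(ldaOut)

# Topic-by-topic correlation of word probabilities; weak correlations are zeroed out
cor_mat <- cor(t(post[["terms"]]))
cor_mat[ cor_mat < .05 ] <- 0
diag(cor_mat) <- 0

# Undirected weighted graph built from the lower triangle of the matrix
graph <- graph.adjacency(cor_mat, weighted=TRUE, mode="lower")
graph <- delete.edges(graph, E(graph)[ weight < 0.05])

# Edge width scales with correlation, vertex size with total topic probability
E(graph)$edge.width <- E(graph)$weight*20
V(graph)$label <- paste("Topic", V(graph))
V(graph)$size <- colSums(post[["topics"]]) * 15

par(mar=c(0, 0, 3, 0))
set.seed(110)
plot.igraph(graph, edge.width = E(graph)$edge.width, 
    edge.color = "orange", vertex.color = "orange", 
    vertex.frame.color = NA, vertex.label.color = "grey30")
title("Strength Between Topics Based On Word Probabilities", cex.main=.8)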

A sample of cor_mat:

          1          2          3          4          5          6          7       ...
1  0.00000000 0.00000000 0.00000000 0.09612831 0.00000000 0.17248020 0.00000000
2  0.00000000 0.00000000 0.07206496 0.00000000 0.00000000 0.05755187 0.00000000
3  0.00000000 0.07206496 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
4  0.09612831 0.00000000 0.00000000 0.00000000 0.08459681 0.00000000 0.06895900
5  0.00000000 0.00000000 0.00000000 0.08459681 0.00000000 0.00000000 0.00000000
6  0.17248020 0.05755187 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
7  0.00000000 0.00000000 0.00000000 0.06895900 0.00000000 0.00000000 0.00000000
8  0.00000000 0.00000000 0.00000000 0.00000000 0.54849308 0.00000000 0.00000000
9  0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.09745720 0.00000000
10 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
11 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
12 0.00000000 0.00000000 0.00000000 0.10329825 0.00000000 0.14057310 0.00000000
13 0.14664201 0.00000000 0.00000000 0.00000000 0.05803984 0.00000000 0.00000000
14 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
15 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
16 0.00000000 0.00000000 0.10290656 0.00000000 0.00000000 0.00000000 0.06293238
17 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
18 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
19 0.00000000 0.00000000 0.00000000 0.00000000 0.33483481 0.00000000 0.00000000
20 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
21 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
22 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.27720724 0.00000000
23 0.12487435 0.14806837 0.00000000 0.10355990 0.00000000 0.05086977 0.00000000
24 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.06622769 0.00000000

Unfortunately, the plot looks like this: [image]

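Any ideas on how to make the topic network more elegant, so that it shows the links between topics instead of having the topics overlap each other?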

1 Answer

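The simple solution is to change the multipliers in weight*20 and colSums(post[["topics"]])*15 to smaller numbers, which avoids the overlap problem. The code might look like this: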

...    
E(graph)$edge.width <- E(graph)$weight* 5
V(graph)$label <- paste("Topic", V(graph))
V(graph)$size <- colSums(post[["topics"]]) * 2
...

The result: [image]
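
If picking multipliers by hand becomes tedious, one possible variation (not part of the answer above) is to map the raw values onto a fixed size range, so that vertex sizes and edge widths stay bounded whatever the data look like. The sketch below is only an illustration: rescale_range is a hypothetical helper, and topic_mass and edge_weight are made-up stand-ins for colSums(post[["topics"]]) and E(graph)$weight. The target ranges (5 to 20, 1 to 6) are arbitrary assumptions to tune by eye.

```r
# Hypothetical helper (not from igraph or topicmodels): linearly map a numeric
# vector onto [lo, hi]. A constant vector is mapped to the midpoint.
rescale_range <- function(x, lo, hi) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) return(rep((lo + hi) / 2, length(x)))
  lo + (x - rng[1]) / (rng[2] - rng[1]) * (hi - lo)
}

# Made-up stand-ins for colSums(post[["topics"]]) and E(graph)$weight.
topic_mass  <- c(12.4, 3.1, 40.8, 7.5)
edge_weight <- c(0.05, 0.17, 0.55)

vertex_size <- rescale_range(topic_mass, 5, 20)  # assumed size range
edge_width  <- rescale_range(edge_weight, 1, 6)  # assumed width range

# Plot a small toy graph only if igraph is installed.
if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::make_graph(c(1, 2, 2, 3, 3, 4), directed = FALSE)
  set.seed(110)
  plot(g,
       vertex.size = vertex_size,
       edge.width = edge_width,
       vertex.label = paste("Topic", 1:4),
       layout = igraph::layout_with_fr(g))
}
```

With the real model, the same helper could be applied to colSums(post[["topics"]]) and E(graph)$weight in place of the toy vectors.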

answered 2017-09-06T20:04:17.067