
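I have a bipartite list (posts, word categories) with 1000 vertices and want to use the fast greedy algorithm for community detection, but I'm not sure whether I have to run it on the bipartite graph or on a bipartite projection.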

My bipartite list looks like this:

   post word
1   66  2
2   312 1
3   432 7
4   433 7
5   434 1
6   435 5
7   436 1
8   437 4

When I run it without a projection, I run into problems with the clustering in the second step:

### Load bipartite list and create graph ###
bipartite_list <- read.csv("bipartite_list_tnf.csv", header = TRUE, sep = ";")
bipartite_graph <- graph.incidence(bipartite_list)
g<-bipartite_graph
fc <- fastgreedy.community(g) ## communities / clusters
set.seed(123)
l <- layout.fruchterman.reingold(g, niter=1000, coolexp=0.5) ## layout
membership(fc)
# 2. checking who is in each cluster
cl <- data.frame(name = fc$post, cluster = fc$membership, stringsAsFactors=F)
cl <- cl[order(cl$cluster),]
cl[cl$cluster==1,]

# 3. preparing data for plot
d <- data.frame(l); names(d) <- c("x", "y")
d$cluster <- factor(fc$membership)

# 4. plot with only nodes, colored by cluster
p <- ggplot(d, aes(x=x, y=y, color=cluster))
pq <- p + geom_point()
pq

Maybe I have to run the community detection on the projection? But then it always fails, because the projection is not a graph object:

bipartite_graph <- graph.incidence(bipartite_list)
#projection (both directions)
projection_word_post <- bipartite.projection(bipartite_graph)
fc <- fastgreedy.community(projection_word_post)
Error in fastgreedy.community(projection_word_post) : Not a graph object

I would be grateful for any help!


1 Answer


When you run it without a projection, the problem is in this line:

bipartite_graph <- graph.incidence(bipartite_list)

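graph.incidence() expects an incidence matrix (one vertex type in the rows, the other in the columns), not a two-column edge list, so you need to reshape "bipartite_list" before passing it to the function. Use the following command: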

tab <- table(bipartite_list)
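To see what the reshaping does, here is a small sketch using only the eight sample rows shown in the question (the real file has more rows, so the dimensions will differ). It assumes the data frame has exactly the two columns post and word:

```r
# Sample rows copied from the listing in the question (illustrative only)
bipartite_list <- data.frame(
  post = c(66, 312, 432, 433, 434, 435, 436, 437),
  word = c(2, 1, 7, 7, 1, 5, 1, 4)
)

# Cross-tabulate the two columns: rows are posts, columns are word
# categories, and each cell counts how often that pair occurs
tab <- table(bipartite_list)

dim(tab)          # 8 5 : eight posts, five distinct word categories
tab["312", "1"]   # 1  : post 312 is linked to word category 1
```

The result is the post-by-word matrix that graph.incidence() can interpret, with one row per post and one column per word category.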

The remaining steps are the same:

g <- graph.incidence(tab,mode=c("all"))
fc <- fastgreedy.community(g)
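As for the "Not a graph object" error on the projection: bipartite.projection() returns a list of two graphs rather than a single graph, so one element has to be picked before clustering. The following is only a sketch on the sample rows from the question; it assumes that proj1 holds the row vertices (the posts), which is worth confirming by inspecting the vertex names. Newer igraph releases have renamed these dotted functions, so they may emit deprecation warnings.

```r
library(igraph)

# Sample rows copied from the listing in the question (illustrative only)
bipartite_list <- data.frame(
  post = c(66, 312, 432, 433, 434, 435, 436, 437),
  word = c(2, 1, 7, 7, 1, 5, 1, 4)
)
tab <- table(bipartite_list)
g <- graph.incidence(tab, mode = c("all"))

# The projection is a list with two graph elements, not a graph itself
projections <- bipartite.projection(g)
names(projections)   # "proj1" "proj2"

# Assumption: proj1 is the post-post graph (posts sharing a word category)
post_graph <- projections$proj1
fc_posts <- fastgreedy.community(post_graph)

# Pair each vertex name with its cluster to see who is in which community
cl <- data.frame(name = V(post_graph)$name,
                 cluster = as.vector(membership(fc_posts)),
                 stringsAsFactors = FALSE)
cl[order(cl$cluster), ]
```

Whether to cluster the bipartite graph or a projection depends on the question being asked: the projection groups only posts (or only words), while the bipartite graph puts both vertex types into the same communities.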
Answered 2016-08-03T21:38:11.643