I don't understand part of the output of the lda.collapsed.gibbs.sampler command. Why does the same word have different counts in different topics? For example, the word "test" has a count of 4 in topic 2, while in topic 8 it has 37. Shouldn't the count of a given word be either the same integer in every topic or 0?
Or have I misunderstood something, and these numbers do not represent the number of times the word occurs in the topic?
$topics
      tests-loc fail  test testmultisendcookieget
 [1,]         0    0     0                      0
 [2,]         0    0     4                      0
 [3,]         0    0     0                      0
 [4,]         0    1     0                      0
 [5,]         0    0     0                      0
 [6,]         0    0     0                      0
 [7,]         0    0     0                      0
 [8,]         0    0    37                      0
 [9,]         0    0     0                      0
[10,]         0    0     0                      0
[11,]         0    0     0                      0
[12,]         0    2     0                      0
[13,]         0    0     0                      0
[14,]         0    0     0                      0
[15,]         0    0     0                      0
[16,]         0    0     0                      0
[17,]         0    0     0                      0
[18,]         0    0     0                      0
[19,]         0    0     0                      0
[20,]         0    0     0                      0
[21,]         0    0     0                      0
[22,]         0  361  1000                      0
[23,]         0    0     0                      0
[24,]         0    0     0                      0
[25,]         0    0     0                      0
[26,]         0    0     0                      0
[27,]         0    0     0                      0
[28,]         0 1904 12617                      0
[29,]         0    0     0                      0
[30,]         0    0     0                      0
[31,]         0    0     0                      0
[32,]         0 1255  3158                      0
[33,]         0    0     0                      0
[34,]         0    0     0                      0
[35,]         0    0     0                      0
[36,]         1    0     0                      1
[37,]         0    1     0                      0
[38,]         0    0     0                      0
[39,]         0    0     0                      0
[40,]         0    0     0                      0
[41,]         0    0     0                      0
[42,]         0    0     0                      0
[43,]         0    0     0                      0
[44,]         0    0     0                      0
[45,]         0    2     0                      0
[46,]         0    0     0                      0
[47,]         0    0     0                      0
[48,]         0    0     4                      0
[49,]         0    0     0                      0
[50,]         0    1     0                      0
Here is the code that I ran:
library(lda)
# Read the corpus (LDA-C format) and the vocabulary
data <- read.documents(filename = "data.ldac")
vocab <- read.vocab(filename = "words.csv")
K <- 100
num.iterations <- 100
alpha <- 1
eta <- 1
result <- lda.collapsed.gibbs.sampler(data, K, vocab, num.iterations, alpha, eta,
                                      initial = NULL, burnin = NULL,
                                      compute.log.likelihood = FALSE,
                                      trace = 0L, freeze.topics = FALSE)
# Raise the print limit so the whole result is shown
options(max.print = 100000000)
result
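To make my question concrete, here is a minimal sketch of one possible reading of the matrix, under the assumption that each entry of $topics counts how many individual occurrences (tokens) of a word were assigned to a topic. The tokens data frame below is made-up toy data, not my corpus, and it uses only base R rather than the lda package:

```r
# Toy data (hypothetical): one row per word occurrence, together with the
# topic that a sampler might have assigned to that occurrence.
tokens <- data.frame(
  word  = c("test", "test", "test", "fail", "test", "fail"),
  topic = c(2, 8, 8, 8, 8, 2)
)

# Cross-tabulate topic x word: entry [k, w] is the number of occurrences
# of word w that were assigned to topic k.
counts <- table(topic = tokens$topic, word = tokens$word)
print(counts)

# Column sums give each word's total number of occurrences over all topics.
word_totals <- colSums(counts)
print(word_totals)
```

If that reading is right, then on my real output I would expect colSums(result$topics) to equal each word's total frequency in data.ldac, and different counts for "test" in different topics would simply mean that its occurrences were split unevenly across topics.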
PS. Sorry for the long post and my bad English.