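So I am following the 10-fold cross-validation approach of Grün and Hornik ( http://www.jstatsoft.org/v40/i13/ ), computing the perplexity from the training and test sets of each of the 10 folds. However, I get an error when I create test_gibbs, which is shown at the end of the code below. Can anyone suggest how to fix this? Thanks in advance.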
R> dim(dtm)
[1] 546 1484
R> fold <- 1
R> range(col_sums(dtm))
[1] 1 192
R> set.seed(0908)
R> folding <-
+ sample(rep(seq_len(10),
+ ceiling(nrow(dtm)))[seq_len(nrow(dtm))])
R> testing <- which(folding == fold)
R> training <- which(folding != fold)
R> topics <- 10 * c(1:5, 10, 20)
R> train <- LDA(dtm[training,], k = k,
+ control = list(verbose = 100))
final e step document 491
R> test <- LDA(dtm[testing,], model = train,
+ control = list(estimate.beta = FALSE))
R> train_gibbs <- LDA(dtm[training,], k = k, method = "Gibbs",
+ control = list(burnin = 1000, thin = 100,
+ iter = 1000, best = FALSE))
R> # this is where the error occurs
R> test_gibbs <- LDA(dtm[testing,],
+ model = train_gibbs[[which.max(sapply, train_gibbs, logLik)]],
+ control = list(estimate.beta = FALSE, burnin = 1000,
+ thin = 100, iter = 1000, best = FALSE))
Error in which.max(sapply, train_gibbs, logLik) : unused arguments (train_gibbs, logLik)
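
For anyone hitting the same message: it most likely comes from a misplaced parenthesis. `which.max()` takes a single argument, so in `which.max(sapply, train_gibbs, logLik)` it receives the function `sapply` itself plus two extra arguments it cannot use. Below is a minimal sketch of the difference in base R only. The `lm` fits on the built-in `cars` data are just a stand-in for the list of fitted models (an assumption for illustration, not the topicmodels objects from the transcript).

```r
# Stand-in for a list of fitted models such as train_gibbs.
# These are plain lm fits, used only so the example runs without topicmodels.
fits <- list(
  lm(dist ~ 1, data = cars),
  lm(dist ~ speed, data = cars),
  lm(dist ~ poly(speed, 2), data = cars)
)

# Misplaced parenthesis: which.max() is handed three arguments and fails
# with "unused arguments", the same kind of error as in the question.
res <- try(which.max(sapply, fits, logLik), silent = TRUE)
failed <- inherits(res, "try-error")

# Intended form: sapply() runs first and returns one log-likelihood per fit,
# then which.max() returns the index of the largest value.
ll <- sapply(fits, logLik)
best <- which.max(ll)
best_fit <- fits[[best]]
```

Applied to the transcript, the `model =` argument would presumably become `train_gibbs[[which.max(sapply(train_gibbs, logLik))]]`, assuming `train_gibbs` is a list of fits (which is what `best = FALSE` should produce with the Gibbs sampler).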