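I am trying to run the AssociatedPress dataset from the tm package through the LDA implementation in text2vec.
The problem I am facing is an incompatibility of data types: AssociatedPress is a tm::DocumentTermMatrix, which in turn is a slam::simple_triplet_matrix. text2vec, however, expects the input x to text2vec::lda$fit_transform(x = ...) to be a Matrix::dgTMatrix.
Hence my question: is there a way to coerce a DocumentTermMatrix into something that text2vec accepts?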
Minimal (failing) example:
library('tm')
library('text2vec')
data("AssociatedPress", package="topicmodels")
dtm <- AssociatedPress[1:10, ]
lda_model = LDA$new(
  n_topics = 10,
  doc_topic_prior = 0.1,
  topic_word_prior = 0.01
)
doc_topic_distr =
  lda_model$fit_transform(
    x = dtm,
    n_iter = 1000,
    convergence_tol = 0.001,
    n_check_convergence = 25,
    progressbar = FALSE
  )
...which gives:
base::rowSums(x, na.rm = na.rm, dims = dims, ...) : 'x' must be an array of at least two dimensions
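
For reference, here is a rough sketch of the kind of coercion I have in mind. It rebuilds the triplet fields (i, j, v) of a slam::simple_triplet_matrix as a Matrix sparse matrix and converts the result to triplet form. The helper name stm_to_dgTMatrix is hypothetical (it is not part of tm, slam, or text2vec), and the toy matrix only stands in for AssociatedPress. I have not confirmed that this is the recommended route, or that every text2vec version accepts the result.

```r
library(Matrix)
library(slam)

# Hypothetical helper: not provided by tm, slam, or text2vec.
# Rebuilds a slam::simple_triplet_matrix (the class underlying
# tm::DocumentTermMatrix) as a Matrix sparse matrix in triplet form.
stm_to_dgTMatrix <- function(stm) {
  m <- Matrix::sparseMatrix(
    i = stm$i,
    j = stm$j,
    x = as.numeric(stm$v),          # force double storage ("d" in dgTMatrix)
    dims = c(stm$nrow, stm$ncol),
    dimnames = stm$dimnames
  )
  # sparseMatrix() returns a column-compressed matrix by default;
  # convert it to the triplet representation.
  as(m, "TsparseMatrix")
}

# Toy stand-in for a document-term matrix (3 documents, 3 terms).
toy <- slam::simple_triplet_matrix(
  i = c(1L, 1L, 2L, 3L),
  j = c(1L, 3L, 2L, 3L),
  v = c(2, 1, 4, 5),
  nrow = 3L, ncol = 3L,
  dimnames = list(Docs = c("d1", "d2", "d3"), Terms = c("a", "b", "c"))
)

x <- stm_to_dgTMatrix(toy)
class(x)
```

With the real data, the idea would be to call stm_to_dgTMatrix(dtm) and pass the result as x to lda_model$fit_transform, assuming the installed text2vec version accepts that class.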