After running an STM model based on a Quanteda dfm, I want to estimate my covariates' effects on certain topics.
Running the STM model went fine, producing the topics as expected, but when using estimateEffect
(in the final step in the script below) the R session is aborted, notifying there is a 'fatal error'.
How can I estimate my covariates' effects, when starting from a dfm? The STM manual advices on running an STM model from a dfm, but I couldn't find how to work with the covariates after this stage.
Here's the code:
# Read texts with Quanteda
texts <- (readtext("C:/Users/renswilderom/Documents/Stuff Im working on at the moment/Newspaper articles DJ/test data/*.txt",
docvarsfrom = "filenames", dvsep = "_",
docvarnames = c("Date of Publication", "Length LexisNexis", "source"),
encoding = "UTF-8-BOM"))
mycorpus <- corpus(texts)
tokens <- tokens(mycorpus, remove_punct = TRUE, remove_numbers = TRUE, ngrams = 1)
mydfm <- dfm(tokens, remove = stopwords("english"), stem = TRUE)
# Run the STM model - Metadata is called with 'data = docvars(mycorpus)'
stm_from_dfm <- stm(mydfm, K = 10, prevalence =~ Date.of.Publication + source, gamma.prior='L1', data = docvars(mycorpus))
# Estimate effects
prep <- estimateEffect(1:10 ~ Date.of.Publication + source, stm_from_dfm,
meta = docvars(mycorpus), uncertainty = "Global")
Alternatively, I made an STM corpus from my dfm corpus, using STMcorpus <- asSTMCorpus(mydfm)
. But then I couldn't run the STM model as it didn't recognized my meta data. Would it be better to follow this alternative strategy? (so I need to associate the meta data with the STMcorpus in some way after running STMcorpus <- asSTMCorpus(mydfm)
).