How can I use the text2vec package to create a tf-idf matrix with character n-gram features?
How about:
    library(text2vec)
    data("movie_review")
    it = itoken(movie_review$review, tolower, char_tokenizer)
    v = create_vocabulary(it, ngram = c(3, 3), sep_ngram = "_")
    dtm = create_dtm(it, vectorizer = vocab_vectorizer(v))
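The code above produces a document-term matrix of raw character 3-gram counts. Since the question asks for tf-idf weights, one more step is probably wanted. The following is a minimal sketch, assuming text2vec's `TfIdf` model class and the `fit_transform()` generic with their default settings; `progressbar = FALSE` and the name `dtm_tfidf` are my own additions for illustration, not part of the original answer.

```r
library(text2vec)
data("movie_review")

# Character-level tokens, lower-cased, same pipeline as in the answer
it <- itoken(movie_review$review, preprocessor = tolower,
             tokenizer = char_tokenizer, progressbar = FALSE)

# Keep only character 3-grams (ngram_min = ngram_max = 3)
v <- create_vocabulary(it, ngram = c(3L, 3L), sep_ngram = "_")
dtm <- create_dtm(it, vectorizer = vocab_vectorizer(v))

# Fit a TF-IDF model on the counts and reweight them in one step
tfidf <- TfIdf$new()
dtm_tfidf <- fit_transform(dtm, tfidf)

# Same shape as the count matrix: one row per review, one column per 3-gram
dim(dtm_tfidf)
```

If you later score new documents, it should be enough to build their dtm with the same vectorizer and call `transform(new_dtm, tfidf)`, so that the idf weights learned here are reused rather than refitted.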
P.S. In the future, please try to provide a reproducible example of the problem you are trying to solve.