r - Quanteda: document-feature matrix with a predefined set of features

Question

I am using quanteda to build two document-feature matrices:

library(quanteda)
DFM1 <- dfm("this is a rock")
#        features
# docs    this is a rock
#   text1    1  1 1    1
DFM2 <- dfm("this is music")
#        features
# docs    this is music
#   text1    1  1     1

However, I want DFM2 to have a specific set of features, namely the features of DFM1:

DFM2 <- dfm("this is music", *magicargument* = featnames(DFM1))
#        features
# docs    this is a rock
#   text1    1  1 0    0

Is there a magic argument I am missing? Or is there another efficient way to achieve this for large bags of words?

score 2 · Accepted Answer

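The magic argument is pattern, to which you supply a dfm whose features will be matched (including zeros for features that are not present in the target dfm):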

dfm_select(DFM2, pattern = DFM1)
# Document-feature matrix of: 1 document, 4 features (50% sparse).
# 1 x 4 sparse Matrix of class "dfmSparse"
#        features
# docs    this is a rock
#   text1    1  1 0    0
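
For more recent quanteda releases, the same alignment can probably be written with dfm_match(), which takes a character vector of feature names. The following is a minimal sketch, not a definitive answer: it assumes a quanteda version that provides dfm_match() and that builds a dfm from tokens() rather than from raw text. The name DFM2_matched is only illustrative.

```r
library(quanteda)

# Build each dfm from tokens (newer quanteda releases expect tokens, not raw text)
DFM1 <- dfm(tokens("this is a rock"))
DFM2 <- dfm(tokens("this is music"))

# Align DFM2 to the feature set of DFM1:
# features missing from DFM2 ("a", "rock") are added as zero columns,
# and features that are not in DFM1 ("music") are dropped
DFM2_matched <- dfm_match(DFM2, features = featnames(DFM1))

# Inspect the result: same features, in the same order, as DFM1
featnames(DFM2_matched)
as.matrix(DFM2_matched)
```

Matching on featnames(DFM1) rather than on the dfm object itself keeps the call explicit about what is being aligned, which is handy when the reference feature set comes from a training corpus and has to be applied to new documents.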
