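I am analysing a survey in RStudio, using the Bing sentiment lexicon from the tidytext package.

Some words don't carry the right meaning for my survey. In particular, "tender" is coded as positive, but my respondents use "tender" in a negative sense (pain). I know how to remove a word from the bing tibble and then add a new one, but how can I simply change a word's sentiment?

For example: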

structure(list(word = c("pain", "tender", "sensitive", "headaches", 
"like", "anxiety"), sentiment = c("negative", "positive", "positive", 
"negative", "positive", "negative"), n = c(351L, 305L, 279L, 
220L, 200L, 196L)), row.names = c(NA, 6L), class = "data.frame")

I would like it to look like:

structure(list(word = c("pain", "tender", "sensitive", "headaches", 
"like", "anxiety"), sentiment = c("negative", "negative", "positive", 
"negative", "positive", "negative"), n = c(351L, 305L, 279L, 
220L, 200L, 196L)), row.names = c(NA, 6L), class = "data.frame")

Thanks!

2 Answers

Running the line

df$sentiment <- ifelse(df$word == "tender", "negative", df$sentiment)

sets sentiment to "negative" in every row where word is "tender". All other rows are left as they are.

Note that if you also want to recode other words to a negative sentiment, you can do the following:

df$sentiment <- ifelse(df$word %in% c("tender", "anotherword", "etc"), "negative", df$sentiment)
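
If different words need different target sentiments, a named lookup vector may be easier to maintain than nested ifelse calls. The following is only a sketch in base R: the data frame is copied from the question, and the `overrides` vector (including the recoding of "sensitive") is a hypothetical illustration, not something taken from the survey.

```r
# Sample data copied from the question
df <- data.frame(
  word = c("pain", "tender", "sensitive", "headaches", "like", "anxiety"),
  sentiment = c("negative", "positive", "positive",
                "negative", "positive", "negative"),
  n = c(351L, 305L, 279L, 220L, 200L, 196L),
  stringsAsFactors = FALSE
)

# Hypothetical overrides: names are the words, values are the new sentiment
overrides <- c(tender = "negative", sensitive = "negative")

# Flag the rows whose word has an override, then overwrite only those rows
hit <- df$word %in% names(overrides)
df$sentiment[hit] <- unname(overrides[df$word[hit]])
```

Rows whose word is not named in `overrides` keep their original sentiment.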
answered 2020-08-05T14:57:57.440

The usual way to do this kind of recoding in the tidyverse (which tidytext is built on) is:

library(tidyverse)
  
df %>% 
  mutate(sentiment = case_when(
    word == "tender" ~ "negative",
    TRUE ~ sentiment # means leave if none of the conditions are met
  ))
#>        word sentiment   n
#> 1      pain  negative 351
#> 2    tender  negative 305
#> 3 sensitive  positive 279
#> 4 headaches  negative 220
#> 5      like  positive 200
#> 6   anxiety  negative 196

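case_when follows the same logic as ifelse: the left-hand side of ~ is the condition to evaluate, and the right-hand side is the value to use when that condition is met. You can set a default value, as shown in the last line inside case_when.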
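
The same recoding can probably be applied to the lexicon itself, so that every later join uses the corrected sentiment. This is a minimal sketch, assuming dplyr and tidytext are installed; `bing_custom`, `tokens` and `sentiment_counts` are hypothetical names, and the `tokens` table is made-up stand-in data rather than real survey responses.

```r
library(dplyr)
library(tidytext)

# Recode the lexicon once, before it is joined to any tokens
bing_custom <- get_sentiments("bing") %>%
  mutate(sentiment = case_when(
    word == "tender" ~ "negative",  # override for this survey's usage
    TRUE ~ sentiment                # keep every other entry unchanged
  ))

# Hypothetical tokenised responses, one word per row
tokens <- tibble(word = c("pain", "tender", "tender", "anxiety"))

# Join against the customised lexicon and count, as with the stock one
sentiment_counts <- tokens %>%
  inner_join(bing_custom, by = "word") %>%
  count(word, sentiment, sort = TRUE)
```

Doing the recode on the lexicon rather than on the counted output means the fix only has to be written once, however many summaries are built from it.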

answered 2020-08-05T16:29:09.407