
I have a dataset of D&D characters that looks like this:

Race   Class              Level   AC
Human  Fighter | Wizard    10     15
Elf    Wizard              8      10
Human  Rogue               6      12
Dwarf  Barbarian           15     18

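I want to split out the classes of multiclassed characters, which are indicated by "|". In addition, if a character has no second class, I want to put "NA" or "None" in that slot: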

Race   Primary_Class      Level   AC    Subclass   Multiclass
Human  Fighter             10     15    Wizard         1
Elf    Wizard              8      10    NA             0
Human  Rogue               6      12    NA             0
Dwarf  Barbarian           15     18    NA             0

Is there a clean way to do this?

3 Answers

We can use sub to remove everything after the "|", str_extract to extract everything after the "|", and str_detect to check whether a "|" is present in the data.

library(dplyr)
library(stringr)

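df1 %>%
  mutate(Primary_Class = trimws(sub('\\|.*', '', Class)),        # text before "|"
         Subclass = trimws(str_extract(Class, "(?<=\\|).*")),    # text after "|", NA if there is none
         Multiclass = +(str_detect(Class, "\\|"))) %>%           # 1 if "|" is present, otherwise 0
  select(-Class)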

#   Race Level AC Primary_Class Subclass Multiclass
#1 Human    10 15      Fighter   Wizard          1
#2   Elf     8 10       Wizard     <NA>          0
#3 Human     6 12        Rogue     <NA>          0
#4 Dwarf    15 18    Barbarian     <NA>          0
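
The lookbehind pattern `(?<=\\|).*` captures everything after the pipe, including the space that follows it in this data, so trimming the result is worth doing. A small check, using a made-up two-element vector `x` rather than the full data frame:

```r
library(stringr)

# Hypothetical sample values in the same format as the Class column
x <- c("Fighter | Wizard", "Wizard")

# Everything after the pipe; entries without a pipe become NA
raw <- str_extract(x, "(?<=\\|).*")

# trimws() strips the leading space and leaves NA untouched
trimmed <- trimws(raw)

print(raw)
print(trimmed)
```

Without the trim, a later comparison such as `Subclass == "Wizard"` would silently fail on the multiclass rows.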
answered 2020-03-15T23:59:18.630

You can use three ifelse clauses with grepl to match the pattern in question, and gsub with the backreferences \\1 and \\2 to manipulate the matches:

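# Keep the first capture group (the class before "|") when a pipe is present
df1$Primary_class <- ifelse(grepl("\\|", df1$Class), 
                            gsub("([A-Za-z]+)\\s\\|\\s([A-Za-z]+)", "\\1", df1$Class), df1$Class)

# Keep the second capture group (the class after "|"), otherwise the string "NA"
df1$Subclass <- ifelse(grepl("\\|", df1$Class), 
                            gsub("([A-Za-z]+)\\s\\|\\s([A-Za-z]+)", "\\2", df1$Class), "NA")

# Flag multiclassed characters
df1$Multiclass <- ifelse(grepl("\\|", df1$Class), 1, 0)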

df1
   Race            Class Level AC Primary_class Subclass Multiclass
1 Human Fighter | Wizard    10 15       Fighter   Wizard          1
2   Elf           Wizard     8 10        Wizard       NA          0
3 Human            Rogue     6 12         Rogue       NA          0
4 Dwarf        Barbarian    15 18     Barbarian       NA          0
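
The three ifelse calls above repeat the same grepl test. As a rough alternative sketch in base R (not the answerer's method), the column can be split once with strsplit and the pieces picked out by position. Note that this version yields a real missing value rather than the string "NA" for single-class characters:

```r
df1 <- data.frame(
  Race  = c("Human", "Elf", "Human", "Dwarf"),
  Class = c("Fighter | Wizard", "Wizard", "Rogue", "Barbarian"),
  Level = c(10L, 8L, 6L, 15L),
  AC    = c(15L, 10L, 12L, 18L),
  stringsAsFactors = FALSE
)

# Split each entry on the pipe, swallowing any surrounding whitespace
parts <- strsplit(df1$Class, "\\s*\\|\\s*")

# First piece is the primary class; the second piece is NA when absent
df1$Primary_class <- vapply(parts, function(p) p[1], character(1))
df1$Subclass      <- vapply(parts, function(p) p[2], character(1))

# A character is multiclassed when the split produced more than one piece
df1$Multiclass    <- as.integer(lengths(parts) > 1)

print(df1)
```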
answered 2020-03-15T21:37:30.017

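We can use separate to split 'Class' into two columns ('Primary_Class', 'Subclass') by specifying sep as zero or more spaces (\\s*), followed by |, followed by zero or more spaces (\\s*). We then create 'Multiclass' by checking which elements of 'Subclass' are NA.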

library(dplyr)
library(tidyr)
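# fill = 'right' pads rows without a "|" with NA and avoids the missing-pieces warning
separate(df1, Class, into = c('Primary_Class', 'Subclass'),
         sep = '\\s*\\|\\s*', extra = 'merge', fill = 'right') %>%
  mutate(Multiclass = +(!is.na(Subclass)))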
#   Race Primary_Class Subclass Level AC Multiclass
#1 Human       Fighter   Wizard    10 15          1
#2   Elf        Wizard     <NA>     8 10          0
#3 Human         Rogue     <NA>     6 12          0
#4 Dwarf     Barbarian     <NA>    15 18          0
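
The question also allows "None" as the placeholder. A minimal sketch of that variant, assuming tidyr's replace_na is acceptable here; the name `result` is only for illustration, and the data frame is rebuilt inline so the snippet runs on its own:

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(
  Race  = c("Human", "Elf", "Human", "Dwarf"),
  Class = c("Fighter | Wizard", "Wizard", "Rogue", "Barbarian"),
  Level = c(10L, 8L, 6L, 15L),
  AC    = c(15L, 10L, 12L, 18L)
)

result <- df1 %>%
  separate(Class, into = c("Primary_Class", "Subclass"),
           sep = "\\s*\\|\\s*", extra = "merge", fill = "right") %>%
  # Compute the flag first, while Subclass still holds real NAs,
  # then swap the NAs for the literal string "None"
  mutate(Multiclass = +(!is.na(Subclass)),
         Subclass = replace_na(Subclass, "None"))

print(result)
```

The order inside mutate matters: replacing the NAs before computing the flag would mark every row as multiclassed.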

Data

df1 <- structure(list(Race = c("Human", "Elf", "Human", "Dwarf"), 
   Class = c("Fighter | Wizard", 
"Wizard", "Rogue", "Barbarian"), Level = c(10L, 8L, 6L, 15L), 
    AC = c(15L, 10L, 12L, 18L)), class = "data.frame", row.names = c(NA, 
-4L))
answered 2020-03-15T20:54:54.127