I have a list of sample names:

name <- c("GOM_13M_TB-01_S.HM (Q)30",
"GOM_13M_PS-06_S.HM (Q)30",
"GOM_13O_PS-06_3C_HM (Q)30",
"GOM_14O_GI-02_B3 (Q)30",
"GOM_14O_PS-03_A3 (Q)30",
"GOM_12J_GI-01_MS (Q)30")

that I need to simplify to:

13M_TB-01_MS  (MS for consistency)
13M_PS-06_MS
13O_PS-06_3C  (I am not too concerned about the last 2 digits order)
14O_GI-02_B3
14O_PS-03_A3
12J_GI-01_MS

I have tried the following gsub() calls, but I am looking for a simpler solution.

x <- gsub("GOM_", "", name) 
x <- gsub("\\(Q\\)30", "", x)
x <- gsub("_S", "_MS", x)
x <- gsub(".HM", "", x)

Any suggestions?

1 Answer

Perhaps you could try the following:

gsub("GOM_(.*) .*", "\\1", gsub("S.HM", "MS", name))
# [1] "13M_TB-01_MS"    "13M_PS-06_MS"    "13O_PS-06_3C_HM" "14O_GI-02_B3"   
# [5] "14O_PS-03_A3"    "12J_GI-01_MS" 

Or perhaps:

## I think this matches what you're expecting...
substr(gsub("S.HM", "MS", name), 5, 16)
# [1] "13M_TB-01_MS" "13M_PS-06_MS" "13O_PS-06_3C" "14O_GI-02_B3"
# [5] "14O_PS-03_A3" "12J_GI-01_MS"
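
Note that the first call leaves "_HM" on the third name, and the substr() version only works while every name has exactly the same width. If that is a concern, a capture group that keeps just the three underscore-separated fields after "GOM_" might be more robust. This is only a sketch, assuming every name starts with "GOM_" and has at least three such fields; the variable names x and simplified are mine:

```r
name <- c("GOM_13M_TB-01_S.HM (Q)30",
          "GOM_13M_PS-06_S.HM (Q)30",
          "GOM_13O_PS-06_3C_HM (Q)30",
          "GOM_14O_GI-02_B3 (Q)30",
          "GOM_14O_PS-03_A3 (Q)30",
          "GOM_12J_GI-01_MS (Q)30")

# Normalise the "S.HM" suffix to "MS"; fixed = TRUE makes the dot literal.
x <- gsub("S.HM", "MS", name, fixed = TRUE)

# Keep the first three underscore-separated fields after "GOM_".
# The third field stops at the next underscore or space, so "_HM"
# and " (Q)30" are both dropped.
simplified <- sub("^GOM_([^_]+_[^_]+_[^_ ]+).*$", "\\1", x)
simplified
# [1] "13M_TB-01_MS" "13M_PS-06_MS" "13O_PS-06_3C" "14O_GI-02_B3"
# [5] "14O_PS-03_A3" "12J_GI-01_MS"
```

Because the pattern is anchored on the field separators rather than on character positions, it should keep working if a field such as the site code gets longer or shorter.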
answered 2015-12-17T16:53:41.947