I have a table; here is the start of it:

TargetID              SM_H1462  SM_H1463  SM_K1566  SM_X1567  SM_V1568  SM_K1534  SM_K1570  SM_K1571
ENSG00000000419.8          290       270       314       364       240       386       430       329
ENSG00000000457.8          252       230       242       220       106       234       343       321
ENSG00000000460.11         154       158       162       136        64       152       206       432
ENSG00000000938.7        20106     18664     19764     15640     19024     18508     45590     32113

I want to write code that goes through the column names (the SM_... ones) and looks only at the fourth character of each name. That character can be one of four letters: H, K, X or V, as in SM_H1462 or SM_K1571 in the table above. Names with H or K as the fourth character are Controls, and names with X or V as the fourth character are Cases.

I want the code to split the column names by that fourth character into two groups: Case and Control.

The data itself can be ignored for now; I just want to work with the column names first.

1 Answer

You can check the fourth character of each column name and pull the case and control columns into two separate data frames, if that helps:

# Toy data frame with the same column names as in the question
my.df <- data.frame(matrix(rep(seq(1, 8), 3), ncol = 8))
colnames(my.df) <- c('SM_H1462', 'SM_H1463', 'SM_K1566', 'SM_X1567', 'SM_V1568', 'SM_K1534', 'SM_K1570', 'SM_K1571')
my.df
# Fourth character of each column name
code <- substr(colnames(my.df), 4, 4)
# H or K = Control, X or V = Case; drop = FALSE keeps a data frame even if only one column matches
control <- my.df[, code %in% c('H', 'K'), drop = FALSE]
case <- my.df[, code %in% c('X', 'V'), drop = FALSE]
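
Since the question asks only for the column names rather than the data, the same fourth-character check can be applied to the names alone. The following is a minimal sketch, not the only way to do it; the variable names `sample_names`, `code`, `group` and `groups` are made up for illustration, and it assumes every sample name follows the `SM_` + letter pattern shown in the question.

```r
# Sample column names from the question (TargetID is left out, it is the row identifier)
sample_names <- c("SM_H1462", "SM_H1463", "SM_K1566", "SM_X1567",
                  "SM_V1568", "SM_K1534", "SM_K1570", "SM_K1571")

# Fourth character of each name
code <- substr(sample_names, 4, 4)

# H or K -> Control, X or V -> Case; any other letter becomes NA
group <- ifelse(code %in% c("H", "K"), "Control",
         ifelse(code %in% c("X", "V"), "Case", NA))

# Named list with one character vector of column names per group
# (names whose group is NA are dropped by split)
groups <- split(sample_names, group)

groups$Control
groups$Case
```

With a real table, `sample_names` could be taken from `colnames()` of the data frame after dropping the TargetID column, and `groups$Case` or `groups$Control` could then be used to index the columns.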
answered 2013-07-24T18:32:47.877