
I created the following matrix in R:

positions = cbind(seq(from = 20, to = 68, by = 4),seq(from = 22, to = 70, by = 4))

I also have the following string:

"SEQRES   1 L   36  THR PHE GLY SER GLY GLU ALA ASP CYS GLY LEU ARG PRO          "

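I am trying to use an apply function to build a list of substr(mystring, start.position, end.position) results, where the first index comes from positions[,1] and the second from positions[,2]. I can do this easily with a for loop, but I assumed apply would be faster.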

I can get it to work as follows, but I am wondering whether there is a cleaner way:

# Define the helper before calling it; apply() hands it each row as a character vector
get.AA.seqres <- function(items) {
  start.position <- as.numeric(items[1])
  end.position <- as.numeric(items[2])
  string <- items[3]
  substr(string, start.position, end.position)
}

# input holds the SEQRES string shown above
parse.me <- cbind(seq(from = 20, to = 68, by = 4), seq(from = 22, to = 70, by = 4), input)
apply(parse.me, MARGIN = 1, get.AA.seqres)
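
For context, here is a minimal sketch of why the helper needs as.numeric(): binding a string onto a numeric matrix with cbind() coerces every cell to character. One possible way around that, assuming the string can stay outside the matrix, is to iterate over the two position columns with mapply(). The variable name `residues` is only illustrative.

```r
# Same data as in the question; `input` is the SEQRES line.
input <- "SEQRES   1 L   36  THR PHE GLY SER GLY GLU ALA ASP CYS GLY LEU ARG PRO          "
positions <- cbind(seq(from = 20, to = 68, by = 4), seq(from = 22, to = 70, by = 4))

# Adding a character column turns the whole matrix into character,
# which is why the helper has to convert the positions back to numbers.
parse.me <- cbind(positions, input)
is.numeric(positions)   # TRUE
is.character(parse.me)  # TRUE

# mapply() walks the start and end columns in parallel, so the positions
# stay numeric and the string is never stored in the matrix.
residues <- mapply(function(s, e) substr(input, s, e),
                   positions[, 1], positions[, 2])
residues
```

This keeps the per-row function but drops the type conversion.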

2 Answers


Try this:

> substring(input, positions[, 1], positions[, 2])
 [1] "THR" "PHE" "GLY" "SER" "GLY" "GLU" "ALA" "ASP" "CYS" "GLY" "LEU" "ARG" "PRO"
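
As a small sketch of why this works: substring() is vectorized over its first and last arguments and recycles the text to match, so a single string can be cut at many positions in one call. The names `starts`, `ends`, and `residues` below are illustrative, not from the question.

```r
input <- "SEQRES   1 L   36  THR PHE GLY SER GLY GLU ALA ASP CYS GLY LEU ARG PRO          "

# Toy case: one string, three start/end pairs, three results.
substring("abcdef", 1:3, 3:5)  # "abc" "bcd" "cde"

# Same idea on the SEQRES line: each residue is 3 characters wide
# and a new one starts every 4 characters from column 20.
starts <- seq(from = 20, to = 68, by = 4)
ends <- starts + 2
residues <- substring(input, starts, ends)
length(residues)  # 13
```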
Answered 2012-05-28T18:29:49.660

I like Andrie's pragmatic suggestion, but if you need to go down this route for other reasons, your problem sounds like it could be solved with Vectorize():

#Your data
positions = cbind(seq(from = 20, to = 68, by = 4),seq(from = 22, to = 70, by = 4))
input <- "SEQRES   1 L   36  THR PHE GLY SER GLY GLU ALA ASP CYS GLY LEU ARG PRO          "

#Vectorize the function substr()
vsubstr <- Vectorize(substr, USE.NAMES = FALSE)
vsubstr(input, positions[,1], positions[,2])
#-----
[1] "THR" "PHE" "GLY" "SER" "GLY" "GLU" "ALA" "ASP" "CYS" "GLY" "LEU" "ARG" "PRO"

#Or, read the help page on ?substr about the bit for recycling in the first paragraph of details

substr(rep(input, nrow(positions)), positions[,1], positions[,2])
#-----
[1] "THR" "PHE" "GLY" "SER" "GLY" "GLU" "ALA" "ASP" "CYS" "GLY" "LEU" "ARG" "PRO"
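
Vectorize() is a wrapper around mapply(), so the same result can also be sketched by calling mapply() directly. The names `via_vectorize` and `via_mapply` are illustrative only.

```r
positions <- cbind(seq(from = 20, to = 68, by = 4), seq(from = 22, to = 70, by = 4))
input <- "SEQRES   1 L   36  THR PHE GLY SER GLY GLU ALA ASP CYS GLY LEU ARG PRO          "

# The Vectorize() route from above
vsubstr <- Vectorize(substr, USE.NAMES = FALSE)
via_vectorize <- vsubstr(input, positions[, 1], positions[, 2])

# Calling mapply() directly; the length-one input is recycled across all rows.
# USE.NAMES = FALSE stops mapply() from naming each result after the input string.
via_mapply <- mapply(substr, input, positions[, 1], positions[, 2],
                     USE.NAMES = FALSE)

# Both routes are expected to give the same character vector.
identical(via_vectorize, via_mapply)
```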
Answered 2012-05-28T18:07:09.050