r - R package ape: extracting the first two nucleotides of each codon

Question

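I have a fasta file containing a DNA sequence. I want to remove the third nucleotide of every codon. I figured I could do this in the subsetting step by selecting only the first two nucleotides.

I am working in R, using the ape and seqinr packages.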

>read.dna("test3", format="fasta")-> test3
>test3
1 DNA sequences in binary format stored in a matrix.

All sequences of same length: 888 

Labels: XX_00004 

Base composition:
    a     c     g     t 
0.223 0.222 0.293 0.262

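Using the function seq, I can select the first, second, or third nucleotide of each codon individually, but I cannot select the first and second together.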

>test3[seq(1, length(test3), by = 3)]
1 DNA sequence in binary format stored in a vector.

Sequence length: 296 

Base composition:
    a     c     g     t 
0.256 0.249 0.374 0.121
>test3[seq(1:2, length(test3), by = 3)]
Error in seq.default(1:2, length(test3), by = 3) : 
  'from' must be of length 1

> test3[seq(from=1, to=2, length(test3), by = 3)]
Error in seq.default(from = 1, to = 2, length(test3), by = 3) : 
  too many arguments

Any suggestions on how to do this?

score 2 · Accepted Answer

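You can select the first and second nucleotides by excluding the third. A negative index in R drops the listed positions and keeps everything else: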

test3[-seq(3, length(test3), by = 3)]
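To see why this works without loading the fasta file, here is a minimal base R sketch on a toy sequence. The vector `nuc` and the string "atggcctaa" are made up for illustration and stand in for a single sequence; the second option (filtering by index modulo 3) is an alternative to the negative index, not part of the original answer.

```r
# Toy sequence of three codons, one character per nucleotide (hypothetical data).
nuc <- strsplit("atggcctaa", "")[[1]]

# Option 1: drop every third position with a negative index.
first_two <- nuc[-seq(3, length(nuc), by = 3)]

# Option 2: keep the positions whose index is not a multiple of 3.
first_two_alt <- nuc[seq_along(nuc) %% 3 != 0]

# Both options keep codon positions 1 and 2 only.
paste(first_two, collapse = "")      # "atgcta"
identical(first_two, first_two_alt)  # TRUE
```

The attempts in the question fail because seq() builds a single arithmetic progression, so `from` must be one number and cannot be `1:2`. If the object holds several aligned sequences as a matrix, subsetting the columns instead, for example `test3[, -seq(3, ncol(test3), by = 3)]`, should keep the sequences separate; this assumes the alignment starts in reading frame 1.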
