Questions tagged [fasta]


804 questions

0 votes

1 answer

405 views

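split - Read a protein FASTA file, split the sequences at arginine (R), and then BLAST the peptides to get matches?

I have the following FASTA file:

I want to iterate over the FASTA file and split each protein sequence at every "R" it encounters, which will generate peptides, and then BLAST the peptides. I then want to take the results from blastp and store them in a separate file for each protein ID in the FASTA file. I'm not particular about which language to use. I'd like to understand how to do this so that I can build more functionality on top of it. Thanks!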
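The parsing and splitting steps can be sketched in plain Python. This is a minimal sketch, not the asker's or any answerer's code: the names `read_fasta` and `split_at_arginine` are hypothetical, the example record is made up, and it assumes each cut falls immediately after an R so that the R stays at the end of its peptide. Running blastp on the peptides is left out.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_at_arginine(sequence):
    """Cut after every R, keeping the R at the end of each peptide (assumption)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue == "R":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Keep whatever follows the last R as a final peptide.
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


# Made-up example record, not the asker's data.
example = [">prot1 hypothetical", "MKRAAR", "LLRGG"]
for header, seq in read_fasta(example):
    protein_id = header.split()[0]  # first word of the header as the ID
    print(protein_id, split_at_arginine(seq))
```

Each peptide list is keyed by the protein ID, so the per-protein output files the question asks for could be named after `protein_id`.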

2013-06-07T23:42:25.430

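0 votes

1 answer

160 views

python - Repeatedly accessing a LARGE fasta file. Most efficient way?

I'm using Biopython to open a large single-entry FASTA file (514 megabases) so that I can extract DNA sequences from specific coordinates. Returning the sequences is rather slow, and I'm wondering whether there is a faster way to do this that I haven't thought of. Speed wouldn't be an issue for just one or two hits, but I'm iterating through a list of 145,000 coordinates and it is taking days :/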
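The excerpt does not show the asker's code, so the cause of the slowness is unknown. One common approach, sketched here without Biopython, is to read the single record into memory once and then slice that string for every coordinate pair. The function names are hypothetical, the coordinates are assumed to be 1-based and inclusive, and the machine is assumed to have enough memory to hold the whole sequence.

```python
import os
import tempfile


def load_single_record(path):
    """Read a single-entry FASTA file into one string, skipping the header."""
    chunks = []
    with open(path) as handle:
        for line in handle:
            if not line.startswith(">"):
                chunks.append(line.strip())
    return "".join(chunks)


def extract(sequence, coordinates):
    """Slice out each (start, end) pair; assumed 1-based and inclusive."""
    return [sequence[start - 1:end] for start, end in coordinates]


# Tiny made-up file standing in for the large genome.
with tempfile.NamedTemporaryFile("w", suffix=".fa", delete=False) as tmp:
    tmp.write(">chr_demo\nACGTAC\nGTTTGA\n")
genome = load_single_record(tmp.name)  # parse once, outside any loop
os.remove(tmp.name)

print(extract(genome, [(1, 4), (5, 8)]))
```

The point of the design is that the file is parsed exactly once; each of the 145,000 lookups is then only a string slice.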

python performance biopython fasta dna-sequence

2013-06-10T04:39:37.077

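0 votes

4 answers

124 views

regex - Remove characters in lines that begin with a unique pattern

I have a file consisting of many entries that look like this:

That is, a header line beginning with > and many sequence lines, followed by another header line. I'm trying to write a sed script that goes only to the lines beginning with > (not the sequence lines) and removes all of the digits except the first 10.

There are many similar questions, but I can't figure it out. I've been trying variations on this code:

But obviously I'm not doing it right..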
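The asker's sample data and sed attempt are not shown, so the following is only a Python sketch of one reading of the requirement: on lines starting with `>`, keep the first 10 digit characters and drop every later digit, while leaving non-digit characters and sequence lines untouched. The function name `trim_header_digits` and the sample header are hypothetical.

```python
def trim_header_digits(line, keep=10):
    """On a '>' header line, drop every digit after the first `keep` digits."""
    if not line.startswith(">"):
        return line  # sequence lines pass through unchanged
    out, seen = [], 0
    for ch in line:
        if ch.isdigit():
            seen += 1
            if seen > keep:
                continue  # skip digits beyond the first `keep`
        out.append(ch)
    return "".join(out)


# Made-up lines, not the asker's data.
for line in [">123456789012345 sample", "ACGTACGT"]:
    print(trim_header_digits(line))
```

If the intent was instead to keep only the first 10 characters of the header, a plain slice of the header line would do.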

regex sed awk fasta

2013-06-10T20:53:06.233

0 votes

3 answers

306 views

regex - Grep word in one file, and use that word to match in FASTA file, adding the FASTA sequence to the first file

I want to grep several words in file1, and use each word to grep what follows its match in file2.fasta. Then I want to write each word, together with whatever followed its match, into file03, so that file03 contains information from both files. Parts of the files I have are:

file1:

And a Fasta file (file2) like this:

The output I want is for this example:

As you can see, I simply want to add the FASTA sequence - which is contained in file2 – to file1. If anyone knows how to do this I would greatly appreciate it!
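Since the sample files are not shown, here is only a rough Python sketch of the lookup described above. It assumes file1 holds one word per line, that a FASTA record's ID is the first word after `>`, and that the output is the word and its sequence separated by a tab; the names `fasta_to_dict` and `annotate` and all data are made up.

```python
def fasta_to_dict(lines):
    """Map each record's ID (first word after '>') to its full sequence."""
    records, current = {}, None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            parts = line[1:].split()
            current = parts[0] if parts else ""
            records[current] = ""
        elif line and current is not None:
            records[current] += line  # join multi-line sequences
    return records


def annotate(words, records):
    """Pair each word with its sequence; unknown words get an empty field."""
    return [f"{word}\t{records.get(word, '')}" for word in words]


# Made-up stand-ins for file1 and file2.
file1 = ["geneA", "geneC"]
file2 = [">geneA desc", "ACGT", "TTGA", ">geneB", "GGCC", ">geneC", "AAAA"]
for row in annotate(file1, fasta_to_dict(file2)):
    print(row)
```

Building the dictionary once avoids rescanning the FASTA file for every word in file1.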

regex perl awk grep fasta

2013-06-13T15:55:53.740

0 votes

1 answer

844 views