Questions tagged [blast]



228 questions

0 votes

1 answer

132 views

xml - Is there any way to get (a lot of) values from a file to filter a big xml using xmlstarlet?

Hi everyone. I'm trying to filter a big XML file (from a BLAST run) to keep only some <Iteration> nodes, selected by a list of <Iteration_iter-num> values that I read from a file. Here is a simplified example (the real Blast.xml has more than 80000 iterations):

And I have a file with the iterations to keep (saved as keep_iter):

For this small-scale problem I managed to do the filtering with xmlstarlet, first creating a version of the file that stores the string for the comparison (saved as filter):

This works like a charm with:

Basically, I removed all the Iteration nodes that were not in the filter file, obtaining:

The problem is that my real keep_iter file has 20000 values to filter on. When I create the filter file and run the xmlstarlet command above, the argument list is too long.

Any suggestions for filtering such a Blast.xml file to keep only those Iteration nodes whose iteration number is listed in the keep_iter file (with 20k values)? I want to keep the original XML structure.
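One way around the argument-length limit is to do the filtering in a script rather than on the command line. The following is a minimal sketch, not the accepted answer: it assumes keep_iter holds one iteration number per line and that the <Iteration> elements sit inside <BlastOutput_iterations>, as in standard BLAST XML output. The function name filter_iterations is made up for illustration.

```python
import xml.etree.ElementTree as ET


def filter_iterations(xml_in, keep_file, xml_out):
    """Drop every <Iteration> whose <Iteration_iter-num> is not listed in keep_file.

    Returns the number of removed iterations.
    """
    # Assumption: one iteration number per line; a set gives fast lookups.
    with open(keep_file) as handle:
        keep = {line.strip() for line in handle if line.strip()}

    tree = ET.parse(xml_in)
    removed = 0
    # Materialize the parents first so the tree is not modified while it is walked.
    for parent in list(tree.getroot().iter("BlastOutput_iterations")):
        for iteration in list(parent.findall("Iteration")):
            number = iteration.findtext("Iteration_iter-num", "").strip()
            if number not in keep:
                parent.remove(iteration)
                removed += 1

    tree.write(xml_out, encoding="utf-8", xml_declaration=True)
    return removed
```

Note that ElementTree does not preserve the DOCTYPE line of the input, so the header of the output file may differ from the original even though the element structure is kept.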

2013-10-08T09:55:37.940

0 votes

1 answer

777 views

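apache - Web Blast Apache configuration (error 403)

I am trying to run Web Blast 2.2.28+ locally through EasyPHP Dev Server 13.1 (Apache 2.2, Windows 7). When I click Search after opening it within my site, I get ERROR 403; when I use the blast.html page directly, I get the following message (the contents of blast.cgi):

Apache is installed in "C:\Program Files (x86)\EasyPHP-DevServer-13.1VC9\binaries\apache\bin" and configured as follows (httpd.conf):

The blast.html file is located in "C:\Program Files (x86)\EasyPHP-DevServer-13.1VC9\data\localweb\original\cgi-bin".

In the Apache access log I found:

And in the Apache error log:

I really don't know much about Apache, but haven't I enabled ExecCGI in httpd.conf? I'm confused.

I have tried every combination I can think of (changing paths, file names, the order of the Apache options), but nothing works.

Can anyone help me? I really need this; I have been trying to get it running since 2012.

Thanks everyone, and sorry for any language mistakes =). Dimitrius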

apache blast

2013-10-08T23:34:33.330

0 votes

2 answers

190 views

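python - Processing files line by line

I am working with a large BLAST file and a large FASTA file, and I need to load a few lines of FASTA (let's say one line) for each BLAST block.

I expected that in the second iteration of the BLAST loop (the next line) it would continue from the line after the last processed FASTA line, but it loads the same FASTA lines every time. Why? How can I load the next line? Is it really necessary to add some kind of index?

The FASTA file has the typical format:

For each line of the BLAST file I need the next sequence.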
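The asker's code is not shown, so the cause can only be guessed: a common reason for seeing the same lines again is reopening or re-reading the FASTA file inside the BLAST loop. A minimal sketch of the alternative is to open both files once and pull from the FASTA iterator with next(), which resumes where the previous call stopped. The function name pair_blast_with_fasta and the one-line-per-BLAST-line pairing are assumptions for illustration.

```python
def pair_blast_with_fasta(blast_path, fasta_path):
    """Yield (blast_line, fasta_line) pairs, advancing the FASTA file
    by one line for every BLAST line."""
    # Both files are opened once, outside the loop, so each keeps its position.
    with open(blast_path) as blast, open(fasta_path) as fasta:
        for blast_line in blast:
            # next() continues after the last line read; None marks end of file.
            fasta_line = next(fasta, None)
            if fasta_line is None:
                break
            yield blast_line.rstrip("\n"), fasta_line.rstrip("\n")
```

No explicit index is needed here because the file object itself remembers how far it has been read.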

python fasta blast

2013-11-06T21:54:20.350

0 votes

4 answers

5396 views

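windows - blastdbcmd - Too many positional arguments <1>, the offending value: %f

I am trying to use blastdbcmd. When I type the following in cmd:

the following error pops up:

The %f I entered stands for FASTA format. Even after spending a lot of time on the internet trying to figure it out, I still don't know how to fix this error. Can you help me?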

windows cmd bioinformatics fasta blast

2013-11-20T02:39:20.667

0 votes

1 answer

120 views

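search - Finding sequences that do not match a target sequence

Rnaer on Biostar asked an interesting question:

I want to find unique DNA/protein sequences of a given length (e.g. 30 nt) that do not match any region of the C. elegans genome. Is there any tool that can do this?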
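For DNA, the simplest reading of "does not match" is "has no exact occurrence on either strand". Under that assumption the idea can be sketched with a set of k-mers, as below. This is an illustration only: it ignores near matches (which a BLAST search would report), and holding every 30-mer of a real genome in a Python set would need a lot of memory. All function names are hypothetical.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def genome_kmers(genome, k):
    """Collect every k-mer found on either strand of the genome string."""
    genome = genome.upper()
    kmers = set()
    for i in range(len(genome) - k + 1):
        kmer = genome[i:i + k]
        kmers.add(kmer)
        kmers.add(reverse_complement(kmer))
    return kmers


def absent_kmer(genome, k, rng=None, max_tries=10000):
    """Return a random k-mer with no exact match in the genome, or None."""
    rng = rng or random.Random()
    seen = genome_kmers(genome, k)
    for _ in range(max_tries):
        candidate = "".join(rng.choice("ACGT") for _ in range(k))
        if candidate not in seen:
            return candidate
    return None
```

Because there are 4^30 possible 30-mers and a genome contains only a tiny fraction of them, almost any random candidate passes the exact-match test; the harder part of the real problem is ruling out close matches.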

search bioinformatics dna-sequence blast ncbi

2013-11-26T02:49:59.703

0 votes

2 answers

2453 views

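python - How to upload multiple sequences to BLAST using Biopython?

I am trying to run a BLASTN search for multiple sequences from a single FASTA file. I can easily query a single sequence from the file, but I am struggling to query all of the sequences in one file. Since these are relatively short reads, I would rather not split the file into individual sequences and query each one separately.

Here is what I have tried so far:

Does anyone have any ideas?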
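The attempted code is not shown. One approach that is commonly suggested, sketched here under the assumption that Biopython's NCBIWWW.qblast accepts a multi-record FASTA string as its sequence argument, is to pass the whole file contents in a single call and then iterate over the parsed records. The helper names read_fasta_text and blast_all_records are made up, and the second function needs Biopython and network access.

```python
def read_fasta_text(path):
    """Return the whole FASTA file as one string, checking that it looks like FASTA."""
    with open(path) as handle:
        text = handle.read()
    if not text.lstrip().startswith(">"):
        raise ValueError("expected FASTA input starting with '>'")
    return text


def blast_all_records(path, program="blastn", database="nt"):
    """Send every record in the file to NCBI in one request.

    Requires Biopython and network access; returns one parsed record per query.
    """
    # Imported here so read_fasta_text works even without Biopython installed.
    from Bio.Blast import NCBIWWW, NCBIXML

    result_handle = NCBIWWW.qblast(program, database, read_fasta_text(path))
    # NCBIXML.parse yields one BLAST record per query sequence.
    return list(NCBIXML.parse(result_handle))
```

For large numbers of reads, a local BLAST+ installation is usually a better fit than the web service, since NCBI limits the load from remote queries.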

python sequences biopython fasta blast

2013-12-06T23:21:31.103

0 votes

3 answers

120 views

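linux - How to replace numbers in file1 with the entries that have the same number in file2

I have a list of queries and hit gi numbers in one file (file1). I have another file with the full hit names (file2), and I want to replace each hit gi in file1 with the full hit name from file2. Each gi should be replaced, next to its corresponding query, by the entry with the same gi.

File 1

File 2

Desired output: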
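The sample files are not shown, so the layout below is an assumption: file1 has "query<TAB>gi" per line and file2 has "gi<TAB>full hit name" per line. Under that assumption the replacement is a dictionary lookup, sketched here with a hypothetical function name.

```python
def replace_gi_with_names(file1, file2, sep="\t"):
    """Return file1 rows as (query, name), with each gi swapped for its
    full name from file2."""
    # Build the gi -> full name table from file2 (assumed layout: gi, name).
    names = {}
    with open(file2) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue
            gi, _, full_name = line.partition(sep)
            names[gi] = full_name

    rows = []
    with open(file1) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue
            query, _, gi = line.partition(sep)
            # Keep the original gi when file2 has no entry for it.
            rows.append((query, names.get(gi, gi)))
    return rows
```

Reading file2 into a dictionary first means each file is scanned only once, regardless of how many queries share the same gi.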

linux bioinformatics biopython bioperl blast

2013-12-30T07:44:16.047

0 votes

1 answer

169 views

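perl - Perl's system() "pauses". Caused by $ARGV[]?

I got stuck when combining BLAST commands into a Perl script. The problem is that the command line pauses when the second part starts.

PART I trims the FASTA sequences. PART II runs BLAST on the file produced by PART I. Each part runs fine on its own, but when combined I hit the "pause" problem.

My guess is that $ARGV[1] and $ARGV[3], which are produced in PART I, cannot be used in PART II. I don't know how to solve this, although I have tried a lot.

Thanks!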

perl system argv blast

2014-01-05T04:38:59.690

0 votes

3 answers

1440 views

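perl - Perl script to select the best hit lines from BLAST results

I have huge files with BLAST output, and for each query I need to select the query ID, subject gi, and frame (basically the whole line) where the e-value is lowest, omitting duplicate lines (all other lines with higher e-values). This is what the file looks like:

In this case, the expected output should contain only these two lines, because they have the smallest e_value:

I have written some code, but it doesn't seem to work. Could you help me solve this? I really appreciate your time and help. Here is what I have so far: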
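Neither the sample file nor the Perl attempt is shown. The selection logic itself can be sketched as a single pass that remembers the best line seen so far for each query. The sketch below is in Python rather than Perl, and the column positions are assumptions: the defaults follow the standard tabular BLAST layout (query ID in column 1, e-value in column 11), and a custom format would need different values.

```python
def best_hits(path, query_col=0, evalue_col=10, sep="\t"):
    """Return {query_id: line}, keeping the line with the smallest e-value
    for each query."""
    best = {}
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            # Skip blank lines and comment lines.
            if not line or line.startswith("#"):
                continue
            fields = line.split(sep)
            query = fields[query_col]
            evalue = float(fields[evalue_col])
            # Strict '<' keeps the first line seen when e-values tie.
            if query not in best or evalue < best[query][0]:
                best[query] = (evalue, line)
    return {query: line for query, (evalue, line) in best.items()}
```

Only one line per query is held in memory, so this works on large files as long as the number of distinct queries is manageable.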

perl output bioinformatics blast

2014-01-08T18:45:48.723

0 votes

1 answer

636 views

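xml - What does it mean to have multiple HSPs under one hit?

I am new to bioinformatics. I am looking at a BLAST XML output file and trying to understand why there are multiple HSPs under each BLAST hit. I know HSP stands for High-Scoring Segment Pair, but I don't really understand how and why multiple HSPs are assigned to a single hit.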
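In BLAST output a hit is one subject sequence, and each HSP is a separate local alignment between the query and that subject. Several HSPs appear under one hit when different regions of the two sequences align independently. Listing the query coordinates of each HSP usually makes this visible. The sketch below assumes the standard BLAST XML element names (Hit, Hit_id, Hsp, Hsp_query-from, Hsp_query-to, Hsp_evalue); the function name is made up.

```python
import xml.etree.ElementTree as ET


def hsps_per_hit(xml_path):
    """Return [(hit_id, [(query_from, query_to, evalue), ...]), ...],
    with one entry per <Hit>."""
    results = []
    for hit in ET.parse(xml_path).getroot().iter("Hit"):
        hsps = [
            (
                int(hsp.findtext("Hsp_query-from")),
                int(hsp.findtext("Hsp_query-to")),
                float(hsp.findtext("Hsp_evalue")),
            )
            # Each <Hsp> under this hit is one local alignment to the same subject.
            for hsp in hit.iter("Hsp")
        ]
        results.append((hit.findtext("Hit_id"), hsps))
    return results
```

If the HSPs of a hit cover different, non-overlapping query ranges, they are distinct aligned segments of the same subject sequence rather than duplicates.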

xml bioinformatics hit blast

2014-01-17T23:35:03.340
