Questions tagged "qsub" - Stack Overflow Chinese site

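0 votes

2 answers

4914 views

mpi - Determining the total CPU count after qsub within a PBS script

For a PBS script invoked with qsub, I want to know how many CPUs were actually allocated, in case the number defined in the PBS file is overridden on the command line. For example, with the following PBS script file:

jobscript.pbs:

The script can be run with only 16 CPUs (instead of 32) using the following command line:

So I would like a reliable way to determine, from within the script, the number of CPUs actually available.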
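One possible way to check the allocation from inside the job, sketched in Python under a stated assumption: on Torque-style PBS systems the file named by the PBS_NODEFILE environment variable usually holds one line per allocated processor slot, so counting its lines gives the slot count. Some PBS variants write the file differently, so this should be checked against the site's documentation. The function name count_allocated_cpus is made up for this illustration.

```python
import os


def count_allocated_cpus(nodefile_path=None):
    """Count the processor slots listed in a PBS node file.

    Assumption: the scheduler writes one line per allocated slot, so a
    host that contributes 8 slots appears on 8 lines.
    """
    if nodefile_path is None:
        # PBS_NODEFILE is set by PBS/Torque inside a running job.
        nodefile_path = os.environ["PBS_NODEFILE"]
    with open(nodefile_path) as handle:
        # Ignore blank lines so a trailing newline does not inflate the count.
        return sum(1 for line in handle if line.strip())
```

Called with no argument inside a job, the function reads the path from the environment; outside a job it raises KeyError because PBS_NODEFILE is not set.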

2013-07-23T08:03:08.277

0 votes

1 answer

219 views

python - Parallel Processing using subprocess in python2.4

I want to calculate a statistic over all pairwise combinations of the columns of a very large matrix. I have a python script, called jaccard.py that accepts a pair of columns and computes this statistic over the matrix.

On my work machine, each calculation takes about 10 seconds, and I have about 95000 of these calculations to complete. However, all these calculations are independent from one another and I am looking to use a cluster we have that uses the Torque queueing system and python2.4. What's the best way to parallelize this calculation so it's compatible with Torque?

I have made the calculations themselves compatible with python2.4, but I am at a loss how to parallelize these calculations using subprocess, or whether I can even do that because of the GIL.

The main idea I have is to keep a constant pool of subprocesses going; when one finishes, read the output and start a new one with the next pair of columns. I only need the output once the calculation is finished, then the process can be restarted on a new calculation.

My idea was to submit the job this way

myjob.sh would invoke a main python file that looks like the following:

Any advice on how best to do this? I have never used Torque and am unfamiliar with subprocessing in this way. I tried using multiprocessing.Pool on my workstation and it worked flawlessly with Pool.map, but since the cluster uses python2.4, I'm not sure how to proceed.
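The "constant pool of subprocesses" idea described above can be sketched roughly as follows. This is an illustration written for a current Python 3 interpreter, not for python2.4; the name run_pool and its parameters are invented here. Each child's output goes to a temporary file rather than a pipe, so a child with a lot of output cannot block while it waits to be read.

```python
import subprocess
import tempfile
import time


def run_pool(commands, pool_size=4, poll_interval=0.05):
    """Run each command as a subprocess, keeping at most pool_size alive.

    Returns the captured stdout of every command, in input order.
    """
    pending = list(enumerate(commands))
    pending.reverse()  # pop() then yields commands in their original order
    running = {}       # index -> (process, output file)
    results = [None] * len(commands)

    while pending or running:
        # Top up the pool with new work.
        while pending and len(running) < pool_size:
            index, command = pending.pop()
            out = tempfile.TemporaryFile(mode="w+")
            running[index] = (subprocess.Popen(command, stdout=out), out)
        # Collect the output of every process that has finished.
        for index in list(running):
            proc, out = running[index]
            if proc.poll() is not None:
                out.seek(0)
                results[index] = out.read()
                out.close()
                del running[index]
        if running:
            time.sleep(poll_interval)
    return results
```

The GIL is not a concern with this pattern, because each calculation runs in its own interpreter process and the parent only polls and collects output.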

EDIT: Actually, on second thought, I could just write multiple qsub scripts, each only working on a single chunk of the 95000 calculations. I could submit something like 16 different jobs, each doing about 5,940 calculations (95,000 / 16). It's essentially the same thing.

python subprocess python-2.4 qsub torque

2013-08-01T21:43:15.543

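0 votes

2 answers

815 views

sed - awk inside qsub

I have a bash script containing several qsub calls. Each one waits for the previous qsub to finish before starting.

My first qsub sends the files in a certain directory to a Perl program and writes the output files to a new directory. At the end, I echo an array with all my job names. This part of the script works as expected.

My second qsub is meant to sort all the files I previously produced with my Perl script into a new outfile, and it starts once all of those jobs (about 100) have finished, using depend=afterany. Again, this part works fine.

My problem is that my sorted file has several columns I want to remove (2 through 6), so I came up with the last line, using awk piped to sed and another depend=afterany.

This last step creates final_file.txt but leaves it empty. I added SED= before the echo because otherwise it gives me Command not found.

I tried it without the pipe, so that it would just print everything. Unfortunately, it printed nothing. I assume it is not opening my sorted file, which is why my final file is empty after the sed. If so, why doesn't awk read it?

In my script I use variables to define my directories and files (with the correct paths). I know my problem isn't in locating the files or directories, because they are defined correctly at the start and used throughout the script. I tried writing out the full paths instead of the variables and got the same result.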
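The awk and sed commands from the question are not shown, so the following is only a stand-in for the column-removal step described above, written in Python instead of awk. It assumes whitespace-separated columns; the function name drop_columns is invented for this sketch.

```python
def drop_columns(line, first=2, last=6):
    """Remove columns first..last (1-based, inclusive) from one line.

    Assumption: columns are separated by whitespace; the result is
    re-joined with single spaces.
    """
    fields = line.split()
    kept = fields[:first - 1] + fields[last:]
    return " ".join(kept)
```

Running the transformation as a small script that the job invokes avoids nesting awk's quotes and dollar signs inside an echo that is piped to qsub, which is a common place for the shell to expand things earlier than intended.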

sed awk echo qsub

2013-08-02T19:34:18.153

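0 votes

3 answers

9479 views

linux - How to use pipes or redirection in a qsub command?

I want to use qsub (SGE 8.1.3, CentOS 5.9) to run some commands on a grid, and those commands need to use pipes (|) or redirection (>). For example, suppose I have to parallelize the command

(obviously a simplified example: in reality I might need to redirect the output of a program like bowtie directly into samtools). If I do this:

the resulting contents of hello.txt look like

Similarly, if I use a pipe (echo "hello world" | myprogram), then that message is passed to myprogram, not the actual standard output.

I know I could write small bash scripts, each containing a command with the pipe/redirection, and then run qsub ./myscript.sh. However, I'm trying to use a script to run many parallelized jobs at once, so I'd have to write many such bash scripts, each with a slightly different command. Scripting this solution can start to feel very hackish. An example of such a script in Python:

This is frustrating for a few reasons, among them that my program can't even delete the many jobXX.sh scripts to clean up after itself, because I don't know how long a job will wait in the queue, and the script has to be present when the job starts.

Is there a way to give my full echo 'hello world' > hello.txt command to qsub without having to create another file containing that command?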
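One approach that avoids temporary script files relies on qsub reading the job script from standard input when no script file is named. A minimal Python sketch of that idea follows; the helper name submit_via_stdin and its submit_command parameter are assumptions made for this example (the parameter exists so the helper can be tried without a scheduler).

```python
import subprocess


def submit_via_stdin(command_line, submit_command=("qsub",)):
    """Send a one-line shell command to the submit program on stdin.

    The pipe or redirection stays inside the job script text, so it is
    interpreted on the execution host rather than by the submitting shell.
    Returns whatever the submit program prints (normally the job id line).
    """
    script = "#!/bin/bash\n" + command_line + "\n"
    completed = subprocess.run(
        list(submit_command),
        input=script,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return completed.stdout
```

For example, submit_via_stdin("echo 'hello world' > hello.txt") would hand the whole line, redirection included, to qsub as the job body, leaving no jobXX.sh files to clean up.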

linux pipe qsub grid-computing sungridengine

2013-08-19T20:12:14.800

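0 votes

1 answer

2535 views

qsub - How to avoid this error? Unable to run job: error: no suitable queues

I have the following command to submit multiple jobs

When I try to submit 2 or 3 jobs, it works fine. But when the number of jobs is greater than 30, it fails during submission with the following error

Unable to run job: error: no suitable queues.

This may be caused by the grid setup; what should I do in that case?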

qsub

2013-08-23T12:12:29.493

0 votes

2 answers

1842 views

linux - pass variable script argument to another script then qsub to program

After reading through numerous bash script threads and help sites, I cannot find a solution that works.

I want to pass a variable argument 'i' from a script to another script $i, then qsub this to a program "$1". In the program I read the variable from the argument vector (**argv) and then use this variable to modify the name of output files as *_0, *_1, *_2, ..., *_n.

The idea is so I can have a unique output file for each instance of a program. The program is parallel but due to limitations of the computing resources, I need to submit one job for a maximum of four computing nodes, or it will never pass through the queue. So, I'd like to spin off 64 4-node jobs.

So far I have read topics on:

"-C option" Passing arguments to /bin/bash via a bash script
"pass arguments" http://linux.about.com/od/Bash_Scripting_Solutions/a/How-To-Pass-Arguments-To-A-Bash-Script.htm
"start with argument $0 or #1" How can I pass a file argument to my bash script using a Terminal command in Linux?
"pass arguments" http://how-to.wikia.com/wiki/How_to_read_command_line_arguments_in_a_bash_script
the exact same question I am asking but the answers don't really work for my case Using a loop variable in a bash script to pass different command-line args

After reading these, I feel comfortable with the concept, but it is still confusing how exactly the -C and -S options are used, or if they are used at all; most examples exclude these.

This is my spinoff pre-script

side info: what is qsub

And this is my script

So, the spinoff works fine, and generates the files. And, the script works fine passing a constant ./daedalus_linux_1.3_64 1 1 but passing the variable does not work. I do not know if the prescript correctly passes variable i to the script. I don't know how to write to an error file from a script, or whether that is even how I should check that the variable is passed. The computing resource has no user interface, so once a job is in the queue I must rely on error file outputs.
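The pre-script and job script are not shown above, so the following is only a sketch of one common pattern: pass the loop index into the job's environment with qsub's -v option and let the job script forward it to the program. The variable name RUN_INDEX, the script name job.sh, and the helper build_submit_commands are all hypothetical.

```python
def build_submit_commands(job_script, n_jobs, variable="RUN_INDEX"):
    """Build one qsub argument list per job, each exporting its own index.

    -v NAME=value asks qsub to set NAME in the job's environment; the job
    script can then read $RUN_INDEX and pass it to the program as an argument.
    """
    return [
        ["qsub", "-v", "%s=%d" % (variable, index), job_script]
        for index in range(n_jobs)
    ]
```

Each list can be handed to subprocess.call to submit it. To check what actually arrived, the job script can echo the variable, since a job's standard output and error are written to files once it runs.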

Thank you in advance for your help.

linux shell argv hpc qsub

2013-08-25T04:17:42.163

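0 votes

1 answer

709 views

bash - bash script PBS_ARRAYID variable argument does not qsub to job

I want to pass PBS_ARRAYID through qsub to the main argument vector (argv), but after reading every hit on the Google results page, I can't get it to work. A constant argument qsubs fine.

I took the array code from the solution given here: Using a loop variable in a bash script to pass different command-line args

From everything I've read, this should work. It does work, except for var1=$(echo "$PBS_ARRAYID" -l)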
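For comparison, here is how a job could read the array index directly from its environment, sketched in Python. It assumes a Torque-style system that sets PBS_ARRAYID inside each array sub-job (other PBS variants use a different variable name); array_output_name is an invented helper.

```python
import os


def array_output_name(prefix, environ=None):
    """Build a per-sub-job file name such as 'out_7' from PBS_ARRAYID.

    environ defaults to the real process environment; passing a dict
    makes the function easy to try outside a job.
    """
    if environ is None:
        environ = os.environ
    array_id = environ.get("PBS_ARRAYID")
    if array_id is None:
        raise RuntimeError("PBS_ARRAYID is not set; not inside a job array?")
    # int() rejects stray text such as an accidental trailing ' -l'.
    return "%s_%d" % (prefix, int(array_id))
```

The int() conversion is a deliberate guard: echo "$PBS_ARRAYID" -l prints the index followed by the literal text -l, and a value like that would be rejected here instead of ending up in a file name.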

bash argv hpc qsub

2013-08-25T08:55:40.463

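0 votes

1 answer

6378 views

shell - Wait for all of a user's jobs to finish before submitting subsequent jobs to a PBS cluster

I am trying to adapt some bash scripts so that they run on a (pbs) cluster.

The individual tasks are performed by several scripts that are started by a main script. So far, this main script starts multiple scripts in the background (by appending &), making them run in parallel on one multi-core machine. I want to replace these calls with qsubs to distribute the load across the cluster nodes.

However, some jobs need others to finish before they can start. So far, this has been achieved with wait statements in the main script. But what is the best way to do this using Grid Engine?

I have already found this question as well as the -W after:jobid[:jobid...] documentation in the qsub man page, but I hope there is a better way. We are talking about several thousand jobs that run in parallel first, and then another set of the same size that runs simultaneously after the last of those has finished. This would mean I'd have to queue a lot of jobs, each depending on a lot of jobs.

I could work around this by putting a dummy job in between, one that does nothing but depend on the first group of jobs, and on which the second group can depend. That would cut the number of dependencies from millions to thousands, but still: it feels wrong, and I'm not even sure the shell would accept such a long command line.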
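The dummy-job workaround described above comes down to building one long dependency string. A small Python sketch of that step follows; the helper name depend_option is invented, and the job ids in the usage note are made up. It uses the depend=afterok:... form of the -W option.

```python
def depend_option(job_ids, kind="afterok"):
    """Return the value for 'qsub -W', e.g. 'depend=afterok:101:102'.

    job_ids are the identifiers printed by earlier qsub calls.
    """
    if not job_ids:
        raise ValueError("at least one job id is required")
    return "depend=%s:%s" % (kind, ":".join(str(job_id) for job_id in job_ids))
```

Building the argument list in Python and passing it to subprocess (for example ["qsub", "-W", depend_option(ids), "gate.sh"], where gate.sh is a hypothetical do-nothing script) bypasses the shell, although the operating system's own limit on argument length still applies.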

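Is there a way to wait for all of my jobs to finish (something like qwait -u <user>)?
Or for all jobs submitted from this script (something like qwait [-p <PID>])?

Of course it would be possible to write something like this using qstat and sleep in a while loop, but I would guess this use case is important enough to have a built-in solution, and I simply can't figure it out.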
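The qstat-and-sleep loop mentioned above might look roughly like this in Python. It is a fallback sketch, not a built-in scheduler feature. The way count_user_jobs reads qstat output is a crude assumption: it counts lines that contain the user name, which can miscount if the scheduler truncates names or keeps completed jobs in the listing for a while.

```python
import subprocess
import time


def count_user_jobs(user):
    """Crude job count: lines of 'qstat -u USER' that mention the user.

    Assumption: every listed job line contains the user name verbatim.
    """
    output = subprocess.run(
        ["qstat", "-u", user],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    ).stdout
    return sum(1 for line in output.splitlines() if user in line)


def wait_for_jobs(count_jobs, poll_seconds=30, sleep=time.sleep):
    """Block until count_jobs() returns 0; return how many times it slept."""
    polls = 0
    while count_jobs() > 0:
        polls += 1
        sleep(poll_seconds)
    return polls
```

A call such as wait_for_jobs(lambda: count_user_jobs("alice")) would block until the listing is empty, where "alice" is a placeholder user name. Passing the counter in as a callable keeps the loop independent of any particular scheduler's output format.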

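What would you recommend/use in such a situation?

Addendum I:

Since it was requested in the comments:

Perhaps it also helps to identify the exact pbs system:

Since the comments so far point to job arrays, I searched the qsub man page, with the following results:

Addendum II:

I have tried the torque solution given by Dmitri Chubarov, but it does not work as described.

Without job arrays it works as expected:

However, with a job array the second job does not start:

I guess this is because the job id returned by the first qsub lacks an array indication:

As you can see, there is no ...[] indicating that this is a job array. Also, in the qsub output there are no ...[]s, but ...-1 and ...-2 indicating the array.

So the remaining question is how to format -W depend=afterok:... so that a job depends on the specified job array.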

shell cluster-computing wait pbs qsub

2013-08-26T10:43:29.880

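0 votes

2 answers

1199 views

qsub - Determining the load status in qsub

I am writing a Python script that submits multiple jobs to qsub, but we need to determine the load on qsub. If there are more jobs in the queue or the load on qsub is high, I need to notify the user and run the jobs in the local environment. I have checked the command's man page but could not find any useful information.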

qsub

2013-09-14T12:52:05.537

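0 votes

1 answer

3996 views

cpu-usage - Can the load of an SGE node exceed the number of CPUs?

I am running jobs on a Sun Grid Engine (now called Oracle Grid Engine) cluster. To see whether my jobs are slow because the nodes are overloaded, I tried checking the status of the nodes:

and

Now, load_avg is 103.41 while NCPU is only 64. Is this supposed to happen? Are some jobs using more CPUs than the slots allocated to them?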
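As a quick way to read those two numbers together: the load average counts runnable tasks, not scheduler slots, so dividing it by the CPU count shows how oversubscribed a node is. A tiny Python sketch, with an invented helper name:

```python
def load_per_cpu(load_avg, ncpu):
    """Return the load average normalised by the number of CPUs.

    A value above 1.0 means more runnable tasks than CPUs on average.
    """
    if ncpu <= 0:
        raise ValueError("ncpu must be positive")
    return load_avg / float(ncpu)
```

With the figures quoted above, load_per_cpu(103.41, 64) is about 1.62, meaning roughly 1.6 runnable tasks per CPU. That pattern is consistent with multithreaded jobs that each occupy a single slot, though the numbers alone do not prove it.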

Update: In response to a query, the configuration has been uploaded to http://pastebin.com/hLnJBetS.

cpu-usage administration slots qsub sungridengine

2013-09-16T01:51:58.853
