21

Well, I am trying to run serial MPI jobs masked as a single job on our supercomputer. The main submission script basically looks like this:

#!/bin/bash -l
#PBS -l nodes=4:ppn=8,walltime=24:00:00

cat $PBS_NODEFILE | uniq | tr '\012' ' ' > tmp-$PBS_JOBID
read -a NODE < tmp-$PBS_JOBID
rm tmp-$PBS_JOBID

inode=-1
ijob=0

for ((K=1;K<=8;K++))
do
        [ $((ijob++ % 2)) -eq 0 ] && ((inode++))
        ssh ${NODE[inode]} _somepath_/RUN$K/sub.script &
done
wait
exit 0

Each sub.script looks like:

#!/bin/bash -l
#PBS -l walltime=24:00:00,nodes=1:ppn=4

module load intel
module load ompi
export FORT_BUFFERED=1

*run executable* 

wait
exit 0

Sometimes I get the following error from every sub.script (the jobs die immediately):

/bin/bash: -
: invalid option
Usage:  /bin/bash [GNU long option] [option] ...
        /bin/bash [GNU long option] [option] script-file ...
*etc.*

The most interesting thing is that the error is random: if I run the same script a second (or third, etc.) time, it runs without any problems. Sometimes I'm lucky, sometimes I'm not... Removing -l doesn't help, because then the modules cannot be loaded and mpirun won't work. Any suggestions on how to fix it?

Thanks a lot in advance!

4

2 Answers

27

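Your script probably has characters in it that you can't see. Perhaps it was copied and pasted with the wrong character-set translation, or it is in DOS format. For the latter, you can fix it with the tofrodos or dos2unix package.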
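To illustrate why an invisible character would produce exactly this message: if the shebang line ends in a carriage return, bash receives "-l" plus that CR as a single option string, accepts the l, and rejects the CR. The sketch below reproduces that by passing the same argument by hand. It assumes bash is on the PATH; the variable name out is just for illustration, and whether this is really what happens on the poster's cluster is a guess.

```shell
#!/bin/bash
# Simulate what the kernel passes to bash for a shebang of "#!/bin/bash -l<CR>":
# the option string is "-l" followed by a carriage return.
# $'...' is bash ANSI-C quoting, so \r here is a real CR byte.
out=$(bash $'-l\r' -c true 2>&1)

# cat -v makes the CR visible as ^M; on a plain terminal the CR makes
# the message look split in two, like the error in the question.
printf '%s\n' "$out" | head -n 1 | cat -v
```

The first line of the message should contain "invalid option", followed by the same usage text as in the question.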

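In either case, you can open it in "vi" or another application that usually displays strange characters such as ^@ or ^M. You could also try cat -v filename, which may help you spot these oddities. Failing that, try hexdump (or hd, or od).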
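A minimal sketch of that check, in case it helps. The function name has_cr_shebang and the throwaway demo file are made up for illustration; point the function at your real sub.script instead.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the first line of a file ends in a CR.
has_cr_shebang() {
    # head keeps only the shebang line; $'\r$' is a regex for "CR at end of line"
    head -n 1 "$1" | grep -q $'\r$'
}

# Throwaway file with DOS line endings, standing in for sub.script
demo=$(mktemp)
printf '#!/bin/bash -l\r\nmodule load intel\r\n' > "$demo"

if has_cr_shebang "$demo"; then
    echo "shebang line ends in a carriage return"
fi

head -n 1 "$demo" | cat -v   # a CR shows up as ^M at the end of the line
head -n 1 "$demo" | od -c    # a CR shows up as \r just before \n

rm -f "$demo"
```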

answered 2013-06-16T19:23:45.277
4

I just ran into this; I had invalid line endings. I changed them from CRLF to LF and that fixed it!
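If dos2unix isn't installed, the same conversion can be sketched with sed. The helper name strip_cr and the demo file are made up for illustration; this assumes bash (for the $'...' quoting) and is only one way to do it.

```shell
#!/bin/bash
# Hypothetical helper: remove a trailing CR from every line of a file, in place.
strip_cr() {
    local file=$1 tmp
    tmp=$(mktemp) || return 1
    # $'...' hands sed a literal CR byte, so this does not depend on sed
    # understanding the \r escape itself.
    # Writing back with cat (rather than mv) keeps the file's mode bits,
    # so an executable script stays executable.
    sed $'s/\r$//' "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Throwaway file with CRLF endings, standing in for sub.script
demo=$(mktemp)
printf '#!/bin/bash -l\r\nmodule load intel\r\n' > "$demo"

strip_cr "$demo"
od -c "$demo" | head -n 3   # no \r should remain in the dump

rm -f "$demo"
```

Usage on a real file would just be strip_cr _somepath_/RUN$K/sub.script for each run directory.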

answered 2019-05-11T02:49:38.997