
I have two questions that to me seem related:

First, is it necessary to explicitly terminate Matlab in my sbatch command? I have looked through several online slurm tutorials, and in some cases the authors include an exit command:

http://www.umbc.edu/hpcf/resources-tara-2013/how-to-run-matlab.html

And in some they don't:

http://www.buffalo.edu/ccr/support/software-resources/compilers-programming-languages/matlab/PCT.html

Second, when creating a parallel pool in a job, I almost always get the following warning:

Warning: Found 4 pre-existing communicating job(s) created by pool that are running, and 2 communicating job(s) that are pending or queued. You can use 'delete(myCluster.Jobs)' to remove all jobs created with profile local. To create 'myCluster' use 'myCluster = parcluster('local')'

Why is this happening, and is there any way to avoid it happening to myself and to others because of me?

2 Answers

It depends on how you launch Matlab. Note that your two examples use different methods to run a Matlab script; the first uses the -r option

matlab -nodisplay -r "matrixmultiply, exit"

while the second uses stdin redirection from a file

matlab < runjob.m

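In the first solution, the Matlab process keeps running after the script finishes, which is why the exit command is needed. In the second solution, the Matlab process terminates as soon as stdin closes, that is, when the end of the file is reached.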

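If you do not end the Matlab process, Slurm will kill it when the maximum allocation time is reached, as defined by the --time option in your submission script or by the default cluster (or partition) value.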
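Putting these points together, a submission script for the -r approach might look like the sketch below. The job name, the time limit, and the try/catch wrapper are illustrative assumptions and do not come from the linked tutorials; matlab_r_arg is a hypothetical helper name. The wrapper is only there so that exit is still reached if matrixmultiply throws an error.

```shell
#!/bin/bash
#SBATCH --job-name=matmul        # assumed job name
#SBATCH --time=00:10:00          # Slurm kills the job after this limit
#SBATCH --ntasks=1

# Build the string passed to -r so that MATLAB exits whether or not
# the script succeeds (hypothetical helper, not part of MATLAB or Slurm).
matlab_r_arg() {
    printf 'try, %s, catch err, disp(getReport(err)), exit(1), end, exit(0)' "$1"
}

# Launch only when MATLAB is actually available on this node.
if command -v matlab >/dev/null 2>&1; then
    matlab -nodisplay -r "$(matlab_r_arg matrixmultiply)"
fi
```

With the stdin form (matlab < runjob.m) no such wrapper is needed for termination, since Matlab stops when the file has been read.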

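To avoid the warning you mention, make sure to systematically use matlabpool close at the end of your job. If you have several Matlab instances running on the same node and you have a shared home directory, you will probably get the warning anyway, as I believe the information about open Matlab pools is stored in a hidden folder in your home directory. Rebooting will probably not help, but finding those files and removing them will (be careful, though, and ask the system administrator).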
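As a rough sketch of closing the pool systematically, the script below feeds Matlab its commands on stdin and deletes the pool as the last step. It assumes a release where parpool and gcp are available (they replace matlabpool in newer releases). The per-job JobStorageLocation is my own assumption, intended to keep concurrent jobs from sharing one storage folder. matlab_pool_script is a hypothetical helper name, and matrixmultiply is the script from the first example.

```shell
#!/bin/bash
#SBATCH --time=00:30:00          # assumed time limit
#SBATCH --cpus-per-task=4        # assumed to match the pool size below

# Print the MATLAB commands for this job (hypothetical helper).
matlab_pool_script() {
    cat <<'EOF'
jobDir = tempname;               % assumption: a private storage folder per job
mkdir(jobDir);
c = parcluster('local');
c.JobStorageLocation = jobDir;
parpool(c, 4);
matrixmultiply
delete(gcp('nocreate'));         % close the pool at the end of the job
EOF
}

# Launch only when MATLAB is actually available on this node.
if command -v matlab >/dev/null 2>&1; then
    matlab_pool_script | matlab -nodisplay
fi
```

Because the commands arrive on stdin, Matlab terminates when the input ends, so no explicit exit is needed here.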

answered 2014-09-03T19:41:23.970

To avoid the warning, you must delete the directory

.matlab/local_cluster_jobs/
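A cautious way to do this from the shell is sketched below. It assumes the directory sits under your home directory and that no Matlab session of yours is still running, since deleting the folder discards the records of any live jobs. clean_local_cluster_jobs is a hypothetical helper name.

```shell
#!/bin/bash

# Remove stale job records left by the 'local' profile.
# Argument 1 (optional): home directory to clean; defaults to $HOME.
clean_local_cluster_jobs() {
    local jobs_dir="${1:-$HOME}/.matlab/local_cluster_jobs"
    if [ -d "$jobs_dir" ]; then
        rm -rf -- "$jobs_dir"
        echo "removed $jobs_dir"
    else
        echo "nothing to remove at $jobs_dir"
    fi
}

# Usage (uncomment to run against your own home directory):
# clean_local_cluster_jobs
```

The function is deliberately not called automatically, so sourcing the file never deletes anything by itself.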

answered 2015-05-22T12:48:00.707