I'm working on a compute cluster and I'm seeing very strange behavior from /usr/bin/env: it runs extremely slowly. On the head node:

$ time /usr/bin/env which
<which output>

real    0m0.025s
user    0m0.001s
sys     0m0.001s

On a compute node:

$ qsub -I                                                                                                                
qsub: waiting for job 176620.scyld.localdomain to start
qsub: job 176620.scyld.localdomain ready

-bash-3.2$ time which
<which output>

real    0m0.003s
user    0m0.000s
sys     0m0.003s

-bash-3.2$ time /usr/bin/env /usr/bin/which

<which output>
real    0m0.003s
user    0m0.000s
sys     0m0.003s


-bash-3.2$ time /usr/bin/env which
<which output>

real    5m0.003s
user    0m0.001s
sys     0m0.001s

ps ax reports this:

12884 pts/3    S+     0:00 /usr/bin/env which

It takes 5 minutes to print the usage banner. Any ideas why this happens?

Edit 1:

Additional information about which:

-bash-3.2$ type -a which
which is /usr/bin/which
-bash-3.2$ file /usr/bin/which
/usr/bin/which: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs), stripped
-bash-3.2$ echo $PATH
/bin:/usr/bin:/home/gusev/.rvm/bin:/home/gusev/bin

Edit 2:

I ran strace on /usr/bin/env which, and it gets stuck at

execve("/bin/which", ["which"], [/* 47 vars */]

Now, running a plain

/bin/which

also hangs, even though that file does not exist:

-bash-3.2$ ls /bin/which
ls: /bin/which: No such file or directory

/bin is mounted over NFS:

-bash-3.2$ mount | grep bin
10.54.0.1:/bin on /bin type nfs (nolock,nonfatal)
10.54.0.1:/usr/bin on /usr/bin type nfs (nolock,nonfatal)

So this is probably a network issue...

Edit 3:

which which works fine:

-bash-3.2$ time which which
/usr/bin/which

real    0m0.002s
user    0m0.000s
sys     0m0.002s

Output of strace -e trace=execve /usr/bin/env which:

execve("/usr/bin/env", ["/usr/bin/env", "which"], [/* 47 vars */]) = 0
execve("/bin/which", ["which"], [/* 47 vars */]) = -1 ENOENT (No such file or directory)
execve("/usr/bin/which", ["which"], [/* 47 vars */]) = 0
<which output>

Edit 4:

The hang always lasts 5 minutes. It looks like some kind of default timeout.

3 Answers

The problem may be caused by the which command rather than by env.

Since you see very different results for

time /usr/bin/env /usr/bin/which

versus

time /usr/bin/env which

you may have another which command in your $PATH, perhaps in /usr/local/bin or in $HOME/bin. What does type -a which tell you? What does your $PATH look like?

Note that which can be either a shell script or an executable. If it is a shell script, try making a copy of it and adding set -x to see what it is doing.
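
The two checks suggested above (what else named which is on the path, and whether each one is a script) could be combined roughly as follows. This is only a sketch: list_candidates is a made-up helper name, not an existing tool, and it assumes bash plus the usual head and grep utilities.

```shell
#!/bin/bash
# list_candidates NAME (hypothetical helper): print every executable file
# called NAME found in $PATH, in lookup order, tagged "script" or "binary".
list_candidates() {
    local name=$1 dir candidate
    local -a dirs
    IFS=: read -r -a dirs <<< "$PATH"
    for dir in "${dirs[@]}"; do
        candidate="${dir:-.}/$name"   # an empty PATH entry means the current directory
        [ -f "$candidate" ] && [ -x "$candidate" ] || continue
        # A script starts with the two bytes "#!"; anything else is reported as binary.
        if head -c 2 "$candidate" 2>/dev/null | LC_ALL=C grep -q '^#!'; then
            echo "script $candidate"
        else
            echo "binary $candidate"
        fi
    done
}

list_candidates which
```

Any line tagged script points at a file worth copying and running with set -x.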

answered 2012-12-17T09:38:13.207

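The problem described here and in your previous question appears to be caused by execve taking a very long time to return on your compute nodes. The fact that the directories in your path are NFS-mounted is probably a contributing factor.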

Running the command under strace shows that env calls execve repeatedly to probe each directory in the path for the command:

[me@home]$ echo $PATH
/home/me/bin:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/home/me/work/bin

[me@home]$ strace -e execve /usr/bin/env which
execve("/usr/bin/env", ["/usr/bin/env", "which"], [/* 53 vars */]) = 0
execve("/home/me/bin/which", ["which"], [/* 53 vars */]) = -1 ENOENT (No such file or directory)
execve("/usr/lib/lightdm/lightdm/which", ["which"], [/* 53 vars */]) = -1 ENOENT (No such file or directory)
execve("/usr/local/sbin/which", ["which"], [/* 53 vars */]) = -1 ENOENT (No such file or directory)
execve("/usr/local/bin/which", ["which"], [/* 53 vars */]) = -1 ENOENT (No such file or directory)
execve("/usr/sbin/which", ["which"], [/* 53 vars */]) = -1 ENOENT (No such file or directory)
execve("/usr/bin/which", ["which"], [/* 53 vars */]) = 0

As you confirmed in the comments above, which which does not suffer from the same problem. That is because it probes the path with stat rather than execve:

[me@home]$ strace -e execve,stat /usr/bin/which which
execve("/usr/bin/which", ["/usr/bin/which", "which"], [/* 53 vars */]) = 0
stat("/home/me", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat(".", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/home/me/bin/which", 0x7fff79ae8760) = -1 ENOENT (No such file or directory)
stat("/usr/lib/lightdm/lightdm/which", 0x7fff79ae8760) = -1 ENOENT (No such file or directory)
stat("/usr/local/sbin/which", 0x7fff79ae8760) = -1 ENOENT (No such file or directory)
stat("/usr/local/bin/which", 0x7fff79ae8760) = -1 ENOENT (No such file or directory)
stat("/usr/sbin/which", 0x7fff79ae8760) = -1 ENOENT (No such file or directory)
stat("/usr/bin/which", {st_mode=S_IFREG|0755, st_size=946, ...}) = 0
/usr/bin/which
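
To see whether the stat-style lookup is slow for any particular directory, something like the following could be run on a compute node. It is a rough sketch, not a definitive diagnostic: probe_path is a hypothetical helper name, it assumes bash, and it only times an existence check, so it says nothing directly about execve.

```shell
#!/bin/bash
# probe_path NAME (hypothetical helper): test each $PATH directory for NAME
# with an existence check and print how many whole seconds each check took.
probe_path() {
    local name=$1 dir start status
    local -a dirs
    IFS=: read -r -a dirs <<< "$PATH"
    for dir in "${dirs[@]}"; do
        dir=${dir:-.}                 # an empty PATH entry means the current directory
        start=$SECONDS                # bash's built-in elapsed-seconds counter
        if [ -e "$dir/$name" ]; then status=found; else status=missing; fi
        printf '%s\t%ss\t%s\n' "$status" "$((SECONDS - start))" "$dir"
    done
}

probe_path which
```

If every directory answers quickly here while /usr/bin/env which still stalls, that would be consistent with the delay being specific to execve on the NFS mounts.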

I'm afraid I can't offer any advice on fixing the root cause, but in the meantime you can work around the problem by:

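  1. Using full paths to commands instead of letting env resolve them for you.
  2. If you really want to use env, reordering $PATH where possible to minimize the search. For example: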

    PATH=/usr/bin:$PATH /usr/bin/env which   # place most likely path first
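
The first workaround can be automated by letting the shell resolve the name once and then running the result by its path. The following is a minimal sketch that assumes bash (type -P is a bash builtin option); run_resolved is a made-up name. Whether bash's own lookup avoids the slow execve probes on this cluster is an assumption, though the fast plain `time which` in the question suggests it does.

```shell
#!/bin/bash
# run_resolved NAME [ARGS...] (hypothetical helper): resolve NAME with bash's
# own PATH lookup, then run it by path so the kernel is asked to execute
# exactly one file instead of one per PATH directory.
run_resolved() {
    local name=$1 path
    shift
    path=$(type -P "$name") || { echo "run_resolved: $name not found" >&2; return 127; }
    "$path" "$@"
}

run_resolved ls /dev/null
```

On the cluster this would be called as, for example, run_resolved which ls.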
    
answered 2012-12-17T11:00:41.133

In the end, I found that I had a very long PATH environment variable, and it probably somehow affected the execve calls on the NFS shares.

So I moved a bunch of executables into a single directory and replaced many of the PATH entries with that one entry. I haven't had any problems since.
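
For anyone wanting to try the same consolidation without moving files, a variant using symlinks might look like the sketch below. Note that this is not what I actually did (I moved the files), and I have not checked whether symlinks behave as well on NFS. consolidate_bins and the allbin directory are made-up names; the source directories should be given as absolute paths so the links resolve.

```shell
#!/bin/bash
# consolidate_bins TARGET DIR... (hypothetical helper): symlink every
# executable file found in the given directories into TARGET, so that
# TARGET alone can stand in for all of them in $PATH.
consolidate_bins() {
    local target=$1 dir file link
    shift
    mkdir -p "$target" || return 1
    for dir in "$@"; do
        for file in "$dir"/*; do
            [ -f "$file" ] && [ -x "$file" ] || continue
            link="$target/${file##*/}"
            # Keep the first match only, mirroring PATH lookup order.
            if [ ! -e "$link" ] && [ ! -L "$link" ]; then
                ln -s "$file" "$link"
            fi
        done
    done
}

# Hypothetical usage with the directories from the question's PATH:
#   consolidate_bins "$HOME/allbin" "$HOME/.rvm/bin" "$HOME/bin"
#   PATH=/bin:/usr/bin:$HOME/allbin
```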

answered 2013-01-28T10:17:04.323