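Your original code runs the child processes sequentially rather than concurrently because you call wait() inside the loop.
You don't need to copy the program name. You can use argv[1] directly (or simply assign it to nameExec), or skip the first couple of characters with nameExec = &argv[1][2]; instead.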
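It is quite tricky to understand what the loop in your code does; it had me screaming a couple of times as I tried to wrap my brain around it. I'm simply going to write the code from scratch, in two variants.
Variant 1
The easier variant to understand is the one where the parent (initial) process launches one child for each counter and then waits until it has no children left. It reports the PID and exit status of each child as it exits; it would also be feasible to simply collect the corpses without printing an 'in memoriam'.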
/* SO 6021-0236 */
/* Variant 1: Original process forks children and waits for them to complete */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    assert(argc > 2);

    /* Launch children */
    for (int i = 2; i < argc; i++)
    {
        if (fork() == 0) // child process
        {
            execl(argv[1], argv[1], argv[i], (char *)0);
            fprintf(stderr, "failed to execute %s\n", argv[1]);
            exit(EXIT_FAILURE);
        }
    }

    /* Wait for children */
    int corpse;
    int status;
    while ((corpse = wait(&status)) > 0)
    {
        printf("%d: PID %d exited with status 0x%.4X\n",
               (int)getpid(), corpse, status);
    }
    return 0;
}
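I renamed your counter program, so the source file is count23.c and the program is count23; the only other significant change was removing the space before the colon in the printf() output.
I called the source code above multiple43.c, compiled to multiple43.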
$ multiple43 count23 1
54251: start
54251: 1
54251: done
54250: PID 54251 exited with status 0x0000
$ multiple43 count23 3 4 5
54261: start
54261: 5
54260: start
54260: 4
54259: start
54259: 3
54261: 4
54260: 3
54259: 2
54261: 3
54260: 2
54259: 1
54261: 2
54260: 1
54259: done
54258: PID 54259 exited with status 0x0000
54261: 1
54260: done
54258: PID 54260 exited with status 0x0000
54261: done
54258: PID 54261 exited with status 0x0000
$
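In the run with three children, you can see that all three produce output concurrently.
This is the variant I think you should use unless there is an explicit requirement to do something else.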
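Variant 2
The other variant more or less approximates your code (though the approximation isn't very good) in that the original process itself also executes the counter program. Consequently, if the original process has fewer cycles than the others, it terminates before the others have finished (see the difference between the 3 4 5 and 5 4 3 examples). It does run the counters concurrently, though.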
/* SO 6021-0236 */
/* Variant 2: Original process launches children, then execs itself */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    assert(argc > 2);

    /* Launch children */
    for (int i = 3; i < argc; i++)
    {
        if (fork() == 0) // child process
        {
            execl(argv[1], argv[1], argv[i], (char *)0);
            fprintf(stderr, "failed to execute %s\n", argv[1]);
            exit(EXIT_FAILURE);
        }
    }

    /* The original process runs the counter for the first time value */
    execl(argv[1], argv[1], argv[2], (char *)0);
    fprintf(stderr, "failed to execute %s\n", argv[1]);
    return(EXIT_FAILURE);
}
This code is multiple53.c, compiled to multiple53.
$ multiple53 count23 3 4 5
54269: start
54268: start
54267: start
54269: 5
54268: 4
54267: 3
54269: 4
54268: 3
54267: 2
54268: 2
54267: 1
54269: 3
54268: 1
54267: done
54269: 2
$ 54268: done
54269: 1
54269: done
$ multiple53 count23 5 4 3
54270: start
54272: start
54270: 5
54272: 3
54271: start
54271: 4
54270: 4
54272: 2
54271: 3
54272: 1
54270: 3
54271: 2
54271: 1
54272: done
54270: 2
54270: 1
54271: done
54270: done
$
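I pressed return after the first run (which produced a blank line in the terminal): the prompt had already appeared three lines earlier, but it was followed by more output from 54268 and 54269. I think this is less likely to be what's wanted.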
Instrumented variant 0
To try to understand the original code, I instrumented it after making some minor changes (saved as multiple31.c and compiled to multiple31):
/* SO 6021-0236 */
/* Original algorithm with instrumentation */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    assert(argc > 2);
    char *nameExec = argv[1];
    char *time;
    int number = argc - 2;

    if (number == 1)
    {
        printf("%d: name = %s; time = %s\n", (int)getpid(), nameExec, argv[2]);
        execl(argv[1], nameExec, argv[2], (char *)NULL);
    }
    else
    {
        for (int i = 2; i <= number; i++)   // Equivalent change in condition
        {
            printf("%d: i = %d; number = %d\n", (int)getpid(), i, number);
            pid_t kid = fork();
            if (kid == 0)
            {
                /* Child: note its time and carry on round the loop */
                time = argv[i];
                printf("%d: i = %d; time = %s; ppid = %d\n",
                       (int)getpid(), i, time, (int)getppid());
            }
            else
            {
                /* Parent: blocks here until a child dies */
                time = argv[i + 1];
                printf("%d: i = %d; time = %s; waiting for %d\n",
                       (int)getpid(), i, time, (int)kid);
                int status;
                int corpse = wait(&status);
                printf("%d: i = %d; time = %s; PID %d exited with status 0x%.4X\n",
                       (int)getpid(), i, time, corpse, status);
            }
        }
        printf("%d: name = %s; time = %s\n", (int)getpid(), nameExec, time);
        execl(argv[1], nameExec, time, (char *)NULL);
    }
    printf("%d: this should not be reached!\n", (int)getpid());
    return 0;
}
When run with four time values, it produces output such as:
$ multiple31 count23 5 4 3 2
54575: i = 2; number = 4
54575: i = 2; time = 4; waiting for 54576
54576: i = 2; time = 5; ppid = 54575
54576: i = 3; number = 4
54576: i = 3; time = 3; waiting for 54577
54577: i = 3; time = 4; ppid = 54576
54577: i = 4; number = 4
54577: i = 4; time = 2; waiting for 54578
54578: i = 4; time = 3; ppid = 54577
54578: name = count23; time = 3
54578: start
54578: 3
54578: 2
54578: 1
54578: done
54577: i = 4; time = 2; PID 54578 exited with status 0x0000
54577: name = count23; time = 2
54577: start
54577: 2
54577: 1
54577: done
54576: i = 3; time = 3; PID 54577 exited with status 0x0000
54576: i = 4; number = 4
54576: i = 4; time = 2; waiting for 54579
54579: i = 4; time = 3; ppid = 54576
54579: name = count23; time = 3
54579: start
54579: 3
54579: 2
54579: 1
54579: done
54576: i = 4; time = 2; PID 54579 exited with status 0x0000
54576: name = count23; time = 2
54576: start
54576: 2
54576: 1
54576: done
54575: i = 2; time = 4; PID 54576 exited with status 0x0000
54575: i = 3; number = 4
54575: i = 3; time = 3; waiting for 54580
54580: i = 3; time = 4; ppid = 54575
54580: i = 4; number = 4
54580: i = 4; time = 2; waiting for 54581
54581: i = 4; time = 3; ppid = 54580
54581: name = count23; time = 3
54581: start
54581: 3
54581: 2
54581: 1
54581: done
54580: i = 4; time = 2; PID 54581 exited with status 0x0000
54580: name = count23; time = 2
54580: start
54580: 2
54580: 1
54580: done
54575: i = 3; time = 3; PID 54580 exited with status 0x0000
54575: i = 4; number = 4
54575: i = 4; time = 2; waiting for 54582
54582: i = 4; time = 3; ppid = 54575
54582: name = count23; time = 3
54582: start
54582: 3
54582: 2
54582: 1
54582: done
54575: i = 4; time = 2; PID 54582 exited with status 0x0000
54575: name = count23; time = 2
54575: start
54575: 2
54575: 1
54575: done
$
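Tracing through why this is the output is ghastly. I started to write an explanation, but found that my explanation didn't match the actual output, yet again. However, I do understand in general terms what is happening. One of the key points (simplifying slightly) is that everything is waiting for a child to die except for one child, which is doing its countdown. Running the test with 1, 2 or 3 times instead of 4 is consistent with this, but simpler (simultaneously less confusing and more confusing). Using 5 times increases the volume of output but doesn't really provide more enlightenment.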