linux - Is the order in which tee prints to stdout guaranteed?

Question

On Linux, the tee command can be used to split a pipeline, as follows:

printf "line1\nline2\nline3\n" | tee >(wc -l ) | (awk '{print "this is awk: "$0}')

This produces the output:

this is awk: line1
this is awk: line2
this is awk: line3
this is awk: 3

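My question is: is the printing order guaranteed? Will the branch of the tee split that counts the lines always print last? Is there a way to always print it at the start? Or does tee make no guarantee about the order of printing?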

score 2 · Accepted Answer

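It is not defined by tee, but as Daenyth says, wc won't finish until tee has finished passing it the data, so usually tee will have passed the data to awk by then as well. In this case, it is better to have awk do the counting.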

echo -ne {one,two,three,four}\\n | \
awk '{print "awk processing line " NR ": "$0} END {print "Awk saw " NR " lines"}'

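The downside is that it won't know the number until it has finished (knowing it earlier requires buffering the data). In your example, both tee and wc have their standard output connected to the same pipe (awk's standard input), but the order in which their output arrives is undefined. cat (and most other pipeline tools) can be used to assemble files in a known order.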
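As a minimal sketch of assembling the output in a known order, the input can be buffered in a temporary file so that the count is emitted before the lines. The helper name count_first is hypothetical, and the sketch assumes the input is small enough to store on disk:

```shell
# Hypothetical helper: print the line count first, then the lines themselves.
count_first() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"                  # buffer all of stdin in a temporary file
    wc -l < "$tmp" | tr -d ' '    # count first (tr strips the padding some wc versions add)
    cat "$tmp"                    # then replay the buffered lines in their original order
    rm -f "$tmp"
}

printf 'line1\nline2\nline3\n' | count_first | awk '{print "this is awk: " $0}'
```

Because the count and the lines are written one after the other by a single shell function, their relative order no longer depends on scheduling.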

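More advanced piping techniques are available, such as bash coprocesses (coproc) or named pipes (mkfifo or mknod p). The latter give you names in the filesystem that can be passed to other processes, but you have to clean them up and avoid name collisions. tempfile or $$ may be useful for that. Pipes are not meant for buffering data, because they are usually of limited size and will simply block the writer.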
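A minimal sketch of the named-pipe approach, assuming mktemp -d is available to avoid name collisions. The helper name count_last is hypothetical. Here wc writes its result to a regular file rather than into the shared pipe, so the count can be printed at a point the script chooses:

```shell
# Hypothetical helper: pass stdin through unchanged, then print the line count last.
count_last() {
    dir=$(mktemp -d) || return 1          # private directory, so no name collisions
    mkfifo "$dir/wcin"
    wc -l < "$dir/wcin" > "$dir/count" &  # wc reads the FIFO, writes to a regular file
    tee "$dir/wcin"                       # copy stdin to the FIFO and to stdout
    wait "$!"                             # wc has now seen EOF and written the count
    printf 'count: %s\n' "$(tr -d ' ' < "$dir/count")"
    rm -rf "$dir"                         # clean up the FIFO and the count file
}

printf 'line1\nline2\nline3\n' | count_last | awk '{print "this is awk: " $0}'
```

The count comes after the lines because the function only prints it once tee has finished and wc has exited, not because of how the processes happen to be scheduled.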

An example where pipes are the wrong solution:

mkfifo wcin wcout
wc -l < wcin > wcout &
yes | dd count=1 bs=8M | tee wcin | cat -n wcout - | head

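The problem here is that tee gets stuck trying to write to cat, while cat wants to finish with wcout first. There is simply too much data for the pipe from tee to cat to hold.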

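Edit regarding dmckee's answer: yes, the order may be repeatable, but it is not guaranteed. It is a matter of scale, scheduling and buffer sizes. On this GNU/Linux box, the example starts breaking up after a few thousand lines: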

seq -f line%g 20000 | tee >(awk '{print "*" $0 "*"}' ) | \
(awk '{print "this is awk: "$0}') | less
this is awk: line2397
this is awk: line2398
this is awk: line2*line1*
this is awk: *line2*
this is awk: *line3*

score 1 · Accepted Answer

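I suspect that in this case wc is waiting for EOF, so it won't return (or print output) until the first command has finished sending input, while awk works line by line and therefore always prints first. I don't know whether the order is defined when sending to other kinds of processes.

Why not have awk count the lines itself before printing them?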
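One possible reading of this suggestion, sketched under the assumption that the whole input fits in memory: awk stores each line in an array and prints the count in its END block before replaying the lines. The helper name count_then_lines is hypothetical:

```shell
# Hypothetical helper: buffer every line, then print the count before the lines.
count_then_lines() {
    awk '{ lines[NR] = $0 }          # remember each line by its line number
         END {
             print "Awk saw " NR " lines"
             for (i = 1; i <= NR; i++) print "this is awk: " lines[i]
         }'
}

printf 'line1\nline2\nline3\n' | count_then_lines
```

Only one process writes the output, so the count comes first regardless of pipe buffering or scheduling.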

score 0 · Accepted Answer

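~~I don't think you can count on it. Here wc runs in a separate process, so there is no synchronization.~~ My trial runs suggest that the order may be guaranteed (at least in bash). As Daenyth explains, this particular case is special, but try using grep -o line instead of wc and see what you get.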

That said, on my MacBook, I get:

$ printf "line1\nline2\nline3\nline4\nline5\n" | tee >(grep -o line ) | (awk '{print "this is awk: "$0}')
this is awk: line1
this is awk: line2
this is awk: line3
this is awk: line4
this is awk: line5
this is awk: line
this is awk: line
this is awk: line
this is awk: line
this is awk: line

Very consistently. I would have to read the bash man page very carefully to be sure.

Similarly:

$ printf "line1\nline2\nline3\nline4\nline5\n" | tee >(awk '{print "*" $0 "*"}' ) | (awk '{print "this is awk: "$0}')
this is awk: line1
this is awk: line2
this is awk: line3
this is awk: line4
this is awk: line5
this is awk: *line1*
this is awk: *line2*
this is awk: *line3*
this is awk: *line4*
this is awk: *line5*

every time... and

$ printf "line1\nline2\nline3\nline4\nline5\n" | tee >(awk '{print "*" $0 "*"}' ) | (grep line)
line1
line2
line3
line4
line5
*line1*
*line2*
*line3*
*line4*
*line5*
