83

I want to redirect the standard output of process proc1 to two processes, proc2 and proc3:

         proc2 -> stdout
       /
 proc1
       \ 
         proc3 -> stdout

I tried

 proc1 | (proc2 & proc3)

but it doesn't seem to work; i.e.,

 echo 123 | (tr 1 a & tr 1 b)

 b23

is sent to stdout, instead of

 a23
 b23

6 Answers

130

Editor's note:
>(…) is an output process substitution, a nonstandard shell feature of some POSIX-compatible shells: bash, ksh, zsh.
- As originally posted, this answer also accidentally piped the output of one process substitution into the other command:
  echo 123 | tee >(tr 1 a) | tr 1 b
- The output of the process substitutions will be interleaved unpredictably, and, except in zsh, the pipeline may terminate before the commands inside >(…) do.

On Unix (or Mac), use the tee command:

$ echo 123 | tee >(tr 1 a) >(tr 1 b) >/dev/null
b23
a23

Normally you would use tee to redirect output to multiple files, but with >(...) you can redirect to another process. So, in general,

$ proc1 | tee >(proc2) ... >(procN-1) >(procN) >/dev/null

will do what you want.

On Windows, I don't think the built-in shell has an equivalent. Microsoft's Windows PowerShell does have a tee command, though.

Answered 2008-09-13T22:37:53.527
22
Answered 2008-10-10T10:43:40.147
11

Unix (bash, ksh, zsh)

dF.'s answer contains the seed of an answer based on tee and output process substitutions
(>(...)) that may or may not work, depending on your requirements:

Note that process substitutions are a nonstandard feature that (mostly) POSIX-features-only shells such as dash (which acts as /bin/sh on Ubuntu, for instance), do not support. Shell scripts targeting /bin/sh should not rely on them.

echo 123 | tee >(tr 1 a) >(tr 1 b) >/dev/null

The pitfalls of this approach are:

  • unpredictable, asynchronous output behavior: the output streams from the commands inside the output process substitutions >(...) interleave in unpredictable ways.

  • In bash and ksh (as opposed to zsh - but see exception below):

    • output may arrive after the command has finished.
    • subsequent commands may start executing before the commands in the process substitutions have finished - bash and ksh do not wait for the output process substitution-spawned processes to finish, at least by default.
    • jmb puts it well in a comment on dF.'s answer:

be aware that the commands started inside >(...) are dissociated from the original shell, and you can't easily determine when they finish; the tee will finish after writing everything, but the substituted processes will still be consuming the data from various buffers in the kernel and file I/O, plus whatever time is taken by their internal handling of data. You can encounter race conditions if your outer shell then goes on to rely on anything produced by the sub-processes.

  • zsh is the only shell that does by default wait for the processes run in the output process substitutions to finish, except if it is stderr that is redirected to one (2> >(...)).

  • ksh (at least as of version 93u+) allows use of argument-less wait to wait for the output process substitution-spawned processes to finish.
    Note that in an interactive session that could result in waiting for any pending background jobs too, however.

  • bash v4.4+ can wait for the most recently launched output process substitution with wait $!, but argument-less wait does not work, making this unsuitable for a command with multiple output process substitutions.

  • However, bash and ksh can be forced to wait by piping the command to | cat, but note that this makes the command run in a subshell. Caveats:

    • ksh (as of ksh 93u+) doesn't support sending stderr to an output process substitution (2> >(...)); such an attempt is silently ignored.

    • While zsh is (commendably) synchronous by default with the (far more common) stdout output process substitutions, even the | cat technique cannot make them synchronous with stderr output process substitutions (2> >(...)).

  • However, even if you ensure synchronous execution, the problem of unpredictably interleaved output remains.

The following command, when run in bash or ksh, illustrates the problematic behaviors (you may have to run it several times to see both symptoms): The AFTER will typically print before output from the output substitutions, and the output from the latter can be interleaved unpredictably.

printf 'line %s\n' {1..30} | tee >(cat -n) >(cat -n) >/dev/null; echo AFTER
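As a sketch of the | cat technique described above: appending | cat makes bash (and ksh) wait for the substituted processes, so AFTER is only printed once both tr commands have finished (their relative output order remains unpredictable):

```shell
# The >(...) processes inherit the pipe to cat as their stdout;
# cat exits only after all writers have closed the pipe, so the
# shell waits for them before running the next command.
echo 123 | tee >(tr 1 a) >(tr 1 b) >/dev/null | cat
echo AFTER
```

Note that, as stated above, this makes the tee command run in a subshell.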

In short:

  • Guaranteeing a particular per-command output sequence:

    • Neither bash nor ksh nor zsh support that.
  • Synchronous execution:

    • Doable, except with stderr-sourced output process substitutions:
      • In zsh, they're invariably asynchronous.
      • In ksh, they don't work at all.

If you can live with these limitations, using output process substitutions is a viable option (e.g., if all of them write to separate output files).
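For instance, a sketch of the separate-output-files case (out-a.txt and out-b.txt are illustrative names): interleaving is a non-issue here, because each command writes to its own file, though in bash and ksh you would still need one of the synchronization techniques above before reading those files back.

```shell
# Each substituted command captures its output in its own file,
# so unpredictable interleaving of their streams doesn't matter.
echo 123 | tee >(tr 1 a > out-a.txt) >(tr 1 b > out-b.txt) >/dev/null
```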


Note that tzot's much more cumbersome, but potentially POSIX-compliant solution also exhibits unpredictable output behavior; however, by using wait you can ensure that subsequent commands do not start executing until all background processes have finished.
See bottom for a more robust, synchronous, serialized-output implementation.


The only straightforward bash solution with predictable output behavior is the following, which, however, is prohibitively slow with large input sets, because shell loops are inherently slow.
Also note that this alternates the output lines from the target commands.

while IFS= read -r line; do 
  tr 1 a <<<"$line"
  tr 1 b <<<"$line"
done < <(echo '123')

Unix (using GNU Parallel)

Installing GNU parallel enables a robust solution with serialized (per-command) output that additionally allows parallel execution:

$ echo '123' | parallel --pipe --tee {} ::: 'tr 1 a' 'tr 1 b'
a23
b23

parallel by default ensures that output from the different commands doesn't interleave (this behavior can be modified - see man parallel).

Note: Some Linux distros come with a different parallel utility, which won't work with the command above; use parallel --version to determine which one, if any, you have.


Windows

Jay Bazuzi's helpful answer shows how to do it in PowerShell. That said: his answer is the analog of the looping bash answer above, it will be prohibitively slow with large input sets and also alternates the output lines from the target commands.



bash-based, but otherwise portable Unix solution with synchronous execution and output serialization

The following is a simple, but reasonably robust implementation of the approach presented in tzot's answer that additionally provides:

  • synchronous execution
  • serialized (grouped) output

While not strictly POSIX compliant, because it is a bash script, it should be portable to any Unix platform that has bash.

Note: You can find a more full-fledged implementation released under the MIT license in this Gist.

If you save the code below as script fanout, make it executable, and put it in your PATH, the command from the question would work as follows:

$ echo 123 | fanout 'tr 1 a' 'tr 1 b'
# tr 1 a
a23
# tr 1 b
b23

fanout script source code:

#!/usr/bin/env bash

# The commands to pipe to, passed as a single string each.
aCmds=( "$@" )

# Create a temp. directory to hold all FIFOs and captured output.
kTHIS_NAME=${BASH_SOURCE##*/}  # this script's filename
tmpDir="${TMPDIR:-/tmp}/$kTHIS_NAME-$$-$(date +%s)-$RANDOM"
mkdir "$tmpDir" || exit
# Set up a trap that automatically removes the temp dir. when this script
# exits.
trap 'rm -rf "$tmpDir"' EXIT 

# Determine the number padding for the sequential FIFO / output-capture names, 
# so that *alphabetic* sorting, as done by *globbing* is equivalent to
# *numerical* sorting.
maxNdx=$(( $# - 1 ))
fmtString="%0${#maxNdx}d"

# Create the FIFO and output-capture filename arrays
aFifos=() aOutFiles=()
for (( i = 0; i <= maxNdx; ++i )); do
  printf -v suffix "$fmtString" $i
  aFifos[i]="$tmpDir/fifo-$suffix"
  aOutFiles[i]="$tmpDir/out-$suffix"
done

# Create the FIFOs.
mkfifo "${aFifos[@]}" || exit

# Start all commands in the background, each reading from a dedicated FIFO.
for (( i = 0; i <= maxNdx; ++i )); do
  fifo=${aFifos[i]}
  outFile=${aOutFiles[i]}
  cmd=${aCmds[i]}
  printf '# %s\n' "$cmd" > "$outFile"
  eval "$cmd" < "$fifo" >> "$outFile" &
done

# Now tee stdin to all FIFOs.
tee "${aFifos[@]}" >/dev/null || exit

# Wait for all background processes to finish.
wait

# Print all captured stdout output, grouped by target command, in sequence.
cat "${aOutFiles[@]}"
Answered 2017-05-11T04:51:11.043
5

Since @dF. mentioned that PowerShell has tee, I thought I'd show a way to do this in PowerShell.

PS > "123" | % { 
    $_.Replace( "1", "a"), 
    $_.Replace( "2", "b" ) 
}

a23
1b3

Note that each object coming out of the first command is processed before the next object is created. This can allow scaling to very large inputs.

Answered 2008-09-14T23:16:45.080
1

You can also save the output in a variable and use that for the other processes:

out=$(proc1); echo "$out" | proc2; echo "$out" | proc3

However, that works only if

  1. proc1 terminates at some point :-)
  2. proc1 doesn't produce too much output (don't know what the limits are there but it's probably your RAM)

But it is easy to remember and leaves you with more options on the output you get from the processes you spawned there, e. g.:

out=$(proc1); echo $(echo "$out" | proc2) / $(echo "$out" | proc3) | bc

I had difficulties doing something like that with the | tee >(proc2) >(proc3) >/dev/null approach.

Answered 2020-06-04T06:37:01.980
-1

Another way to do it would be:

 eval `echo '&& echo 123 |'{'tr 1 a','tr 1 b'} | sed -n 's/^&&//gp'`

output:

a23
b23

No need to create a subshell here.
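To see why this works, it may help to look at what the brace expansion produces before eval runs it; the sed is only there to strip the leading && from the first fragment (a sketch; assumes a shell with brace expansion, such as bash):

```shell
# The brace expansion yields one '&& echo 123 |<cmd>' fragment per command:
echo '&& echo 123 |'{'tr 1 a','tr 1 b'}
# -> && echo 123 |tr 1 a && echo 123 |tr 1 b

# Stripping the leading '&&' leaves a valid &&-chained command line for eval:
echo '&& echo 123 |'{'tr 1 a','tr 1 b'} | sed -n 's/^&&//gp'
```

Note that, unlike the tee-based answers, this runs the commands sequentially, re-echoing the input for each one.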

Answered 2013-05-14T11:23:48.783