3

For example, suppose I want to count the lines of 10 BIG files and print a total.

for f in $files
do
    # this creates a background process for each file
    wc -l "$f" | awk '{print $1}' &
done

I am trying something like this:

for f in $files
do
    # this does not work :/
    n=$( expr $(wc -l "$f" | awk '{print $1}') + $n ) &
done

echo $n

3 Answers

3

I finally found a working solution using anonymous pipes and bash:

#!/bin/bash

# This runs ./a.sh in a separate process and opens a new pipe whose
# reading end is fd 3 in our shell and whose writing end is the stdout
# (and, via 2>&1, the stderr) of the other process. Note that you don't
# need the background operator (&): the process substitution <(...)
# already runs the command asynchronously.
exec 3< <(./a.sh 2>&1)


# ... do other stuff


# write the contents of the pipe to a variable. If the other process
# hasn't already terminated, cat will block.
output=$(cat <&3)
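
Applied to the line-counting question, the same idea could look roughly like the sketch below. It is only an illustration: `total_lines_fd` is a hypothetical helper name, and the `{fd}` syntax for automatically allocating a descriptor assumes bash 4.1 or newer. One descriptor stays open per file, so this is meant for a small number of files.

```shell
#!/bin/bash
# Hypothetical helper: print the total number of lines in the files given as arguments.
total_lines_fd() {
    local fds=() fd f n total=0
    for f in "$@"; do
        # The process substitution starts wc right away, in the background.
        # {fd} asks bash (4.1+) to pick a free descriptor and store it in $fd.
        exec {fd}< <(wc -l < "$f")
        fds+=("$fd")
    done
    for fd in "${fds[@]}"; do
        # read blocks until the matching wc has written its count.
        read -r n <&"$fd"
        total=$((total + ${n:-0}))
        exec {fd}<&-    # close the descriptor again
    done
    echo "$total"
}
```

Called as `total_lines_fd big1.log big2.log` (placeholder file names), it prints a single number once every `wc` has finished.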
answered 2013-08-09T00:04:43.753
1

You should probably use GNU parallel:

find . -maxdepth 1 -type f | parallel --gnu 'wc -l' | awk 'BEGIN {n=0} {n += $1} END {print n}'

Otherwise, use xargs in parallel mode:

find . -maxdepth 1 -type f | xargs -n1 -P4 wc -l | awk 'BEGIN {n=0} {n += $1} END {print n}'
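
Both pipelines above split file names on whitespace, so a name containing a space would be passed to `wc` in pieces. A possible variant with NUL-delimited names is sketched here; `total_lines_xargs` is a hypothetical wrapper name, and `-print0`, `-0` and `-P` are extensions found in GNU and BSD `find`/`xargs` rather than POSIX options.

```shell
#!/bin/bash
# Hypothetical helper: sum the line counts of the regular files directly inside directory $1.
total_lines_xargs() {
    # -print0 / -0 separate file names with NUL bytes, so spaces in names are safe.
    # -n1 -P4 runs up to four wc processes at a time, one file each.
    find "$1" -maxdepth 1 -type f -print0 |
        xargs -0 -n1 -P4 wc -l |
        awk '{n += $1} END {print n + 0}'
}
```

Each `wc -l FILE` prints the count followed by the file name, so summing the first field in `awk` is enough.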

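If this doesn't suit your needs, another option is to write to temp files. If you don't want to write to disk, just write to /dev/shm, which is a RAM-backed filesystem (tmpfs) on most Linux systems.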

#!/bin/bash

declare -a temp_files

count=0
for f in *
do
  if [[ -f "$f" ]]; then
    temp_files[$count]="$(mktemp "/dev/shm/${f}-XXXXXX")"
    ((count++))
  fi
done

count=0
for f in *
do
  if [[ -f "$f" ]]; then
    wc -l < "$f" > "${temp_files[$count]}" &
    ((count++))
  fi
done

wait

cat "${temp_files[@]}" | awk 'BEGIN {n=0} {n += $1} END {print n}'

for tf in "${temp_files[@]}"
do
  rm "$tf"
done

By the way, this can be seen as a map-reduce, with wc doing the map and awk doing the reduce.
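
The map and reduce steps can also be packed into one function. This is a minimal sketch, not a drop-in replacement: `total_lines_mapreduce` is a hypothetical name, a single temporary directory replaces the per-file `mktemp` calls, and the fallback to the default temp location when /dev/shm is missing is an added assumption.

```shell
#!/bin/bash
# Hypothetical helper: map (one background wc per file), then reduce (awk sum).
total_lines_mapreduce() {
    local dir f i=0 pids=()
    # Prefer /dev/shm when it exists and is writable, else use the default temp dir.
    if [[ -d /dev/shm && -w /dev/shm ]]; then
        dir=$(mktemp -d /dev/shm/wc-XXXXXX) || return 1
    else
        dir=$(mktemp -d) || return 1
    fi
    for f in "$@"; do
        wc -l < "$f" > "$dir/$i" &    # map: one result file per input file
        pids+=("$!")
        i=$((i + 1))
    done
    wait "${pids[@]}"                 # barrier: wait for the map jobs
    # reduce: add up all partial counts
    cat "$dir"/* 2>/dev/null | awk '{n += $1} END {print n + 0}'
    rm -rf "$dir"
}
```

Numbering the result files avoids building temp file names from the input names, which may contain characters that are awkward in a `mktemp` template.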

answered 2013-08-09T05:20:51.037
0

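You can write the output to a file or, better, listen on a fifo so that you get the data as soon as it arrives.

Here is a small example of how fifos work: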

# create the fifo
mkfifo test

# listen to it
while true; do if read -r line <test; then echo "$line"; fi; done

# in another shell
echo 'hi there' > test

# notice 'hi there' being printed in the first shell

So you could do

for f in $files
do
    # this creates a background process for each file
    wc -l "$f" | awk '{print $1}' > fifo &
done

and read the line counts from the fifo.
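
A complete version of this idea might look like the sketch below; `total_lines_fifo` is a hypothetical name. One detail differs from the loop above: the fifo is opened once for the whole group of jobs instead of once per job. If every job opened it separately, the reader could see end-of-file as soon as the first writer closed and before the later ones had opened the fifo.

```shell
#!/bin/bash
# Hypothetical helper: sum the line counts of the files given as arguments via a named pipe.
total_lines_fifo() {
    local dir fifo f
    dir=$(mktemp -d) || return 1
    fifo="$dir/counts"
    mkfifo "$fifo" || return 1
    {
        for f in "$@"; do
            wc -l < "$f" &    # each job inherits the already-open fifo as stdout
        done
        wait
    } > "$fifo" &
    # The reader gets end-of-file only after every writer has closed the fifo.
    awk '{n += $1} END {print n + 0}' < "$fifo"
    rm -rf "$dir"
}
```

Each `wc` writes a single short line, so the counts from different jobs do not get mixed within a line.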

answered 2013-08-09T00:13:30.397