bash - Writing output to multiple files in the shell

Question

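I have 135 documents stored as 135 lines in File_A (so each line is one long text), and 15 phrases in File_B. From each document in File_A, I need to extract every sentence that contains a phrase from File_B, together with the sentence before it. The sentences extracted from line 1 of File_A should be written to a new file, File_1; those from line 2 should be written to File_2, and so on, until the matching sentences have been extracted from every line. I tried to do this with the code below: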

i=1
while read line; do
 while read row; do
   cat "$line" | sed 's/\./.\n/g' | grep -i -B 1 "$row"  | tr -d '\n' |  sed 's/--/\n/g'    >> file_$i
 done < $2 
 $i = $i+1;
done < $1

The problem is that the output is printed to the console instead of to the new files. Can someone help me spot my mistake?

Thanks

score 1 · Accepted Answer

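Fixing the problems mentioned previously (the broken increment of i and the misuse of cat) leads to the script below. Note that the line date > file_$i is there for debugging, to make sure each output file starts fresh at the beginning of a test run. The : command is a no-op; here it only serves to evaluate the arithmetic expansion $((i++)). The <<< form introduces a here-string (not a here-document, which is written with <<). If the content of $line is a file name rather than a document as specified in the question, use <"$line" instead of <<<"$line".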

#!/bin/bash
i=1
while read line; do
    date > "file_$i"
    while read row; do
        sed 's/\./.\n/g' <<< "$line" | grep -iB1 "$row" | tr -d '\n' | sed 's/--/\n/g' >> "file_$i"
    done < "$2"
    : $((i++))
done < "$1"

Given splitdoc.data with the following contents:

This is doc 1.  I am 1 fine.  How are you, 1.? Ok. Hello 1.--  Go away now.
This is doc 2.  I am 2 fine.  How are you, 2.? Ok. Hello 2.--  Go away now.
This is doc 3.  I am 3 fine.  How are you, 3.? Ok. Hello 3.--  Go away now.
This is doc 4.  I am 4 fine.  How are you, 4.? Ok. Hello 4.--  Go away now.

and splitdoc.tags with the following contents:

How are you
Go away now

the command

./splitdoc.sh splitdoc.data splitdoc.tags ; head file_*

produces:

==> file_1 <==
Fri Oct 26 19:42:00 MDT 2012
  I am 1 fine.  How are you, 1. Hello 1.
  Go away now.
==> file_2 <==
Fri Oct 26 19:42:00 MDT 2012
  I am 2 fine.  How are you, 2. Hello 2.
  Go away now.
==> file_3 <==
Fri Oct 26 19:42:00 MDT 2012
  I am 3 fine.  How are you, 3. Hello 3.
  Go away now.
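
The per-document pipeline can also be tried in isolation on one document and one phrase, which makes each stage easier to follow. This is a minimal sketch assuming bash and GNU sed (other sed implementations may not turn \n in the replacement into a newline); the names doc, phrase, and result are illustrative and not part of the original script.

```bash
#!/bin/bash
# Illustrative input: one document and one phrase (assumed, not read from files).
doc='This is doc 1.  I am 1 fine.  How are you, 1.? Ok.'
phrase='how are you'

# Stage 1: sed puts each sentence on its own line (GNU sed understands \n here).
# Stage 2: grep keeps the matching sentence plus the one before it (-B 1), ignoring case (-i).
# Stage 3: tr joins the kept lines back into a single line.
result=$(sed 's/\./.\n/g' <<< "$doc" | grep -i -B 1 "$phrase" | tr -d '\n')

printf '%s\n' "$result"
```

The final sed 's/--/\n/g' stage of the script is left out here because this input contains no -- separator.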

score 1 · Accepted Answer

I think this will work:

i=1
while read line; do
  while read row; do
    echo "$line" | sed 's/\./.\n/g' | grep -i -B 1 "$row" | tr -d '\n' | sed 's/--/\n/g' >> "file_$i"
  done < "$2"
  i=$((i+1))
done < "$1"

a=0
while read line; do
  a=$((a+1))
  while read row; do
    echo "$line" | sed 's/\./.\n/g' | grep -i -B 1 "$row" | tr -d '\n' | sed 's/--/\n/g' >> "file_$a"
  done < "$2"
done < "$1"

score 1 · Accepted Answer

This is not how you increment a variable in the shell:

$i = $i + 1

Instead, it tries to run a command whose name is the current value of $i. You want this:

let i=i+1

Or, more concisely,

let i+=1
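
For comparison, a few equivalent ways to increment a counter are sketched below. The first form is plain arithmetic expansion and should also work in a POSIX sh; the other two are bash features. The starting value is arbitrary.

```bash
#!/bin/bash
i=1

i=$((i + 1))   # arithmetic expansion plus assignment
(( i += 1 ))   # bash arithmetic command
let i+=1       # bash builtin, same effect as the line above

echo "i is now $i"   # prints: i is now 4
```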

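The broken increment may not be the cause of the missing output, but it is a bug, and it can lead to strange behavior.

The only other thing I see is the missing quotes around your file names ("$1", "$2").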

Also, if each line is a file name, you don't need cat; just do

<"$line" sed ...

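If each line is the content of a document rather than a file name, then cat is completely wrong, because it tries to find a file whose name is that long piece of text. You can use this instead: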

<<<"$line" sed ...
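
The difference between the two redirections can be seen side by side in a small sketch. It uses a temporary file and a simple substitution purely for illustration; the variable names are assumptions, not part of the original script.

```bash
#!/bin/bash
tmp=$(mktemp)                      # scratch file for the demonstration
printf 'First. Second.\n' > "$tmp"

line_as_name=$tmp                  # case 1: the line holds a file name
line_as_text='First. Second.'      # case 2: the line holds the document itself

from_file=$(sed 's/\./!/g' < "$line_as_name")    # < reads the named file
from_text=$(sed 's/\./!/g' <<< "$line_as_text")  # <<< feeds the string itself to stdin

echo "$from_file"
echo "$from_text"
rm -f "$tmp"
```

Both commands see the same input on standard input, so sed itself does not need to change between the two cases.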

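Edit: Also, if fileB does not have that many lines, you can probably avoid reading it over and over for each document (or file) listed in fileA. Just read all of fileB into memory once: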

IFS=$'\n' rows=($(<"$2"))
let i=0
while read line; do
  for row in "${rows[@]}"; do
    <<<"$line" sed 's/\./.\n/g' | grep -i -B 1 "$row"  | 
             tr -d '\n' |  sed 's/--/\n/g' >> file_$i
  done 
  let i+=1
done < "$1"
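
One possible alternative for loading the phrases is the mapfile builtin (bash 4 or later), which leaves IFS untouched and does not subject the lines to glob expansion. This is only a sketch: the temporary file stands in for fileB, and its two phrases are sample data.

```bash
#!/bin/bash
phrases=$(mktemp)                          # stand-in for fileB
printf '%s\n' 'How are you' 'Go away now' > "$phrases"

mapfile -t rows < "$phrases"               # one array element per line, trailing newlines removed

echo "loaded ${#rows[@]} phrases"
for row in "${rows[@]}"; do
  echo "phrase: $row"
done
rm -f "$phrases"
```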

In fact, you can even do it all with a single grep:

pat=''
while read row; do
  pat="${pat:+$pat|}$row"
done <"$2"

let i=0
while read line; do
  <<<"$line" sed 's/\./.\n/g' | grep -E -i -B 1 "$pat" |
             tr -d '\n' | sed 's/--/\n/g' >"file_$i"
  let i+=1
done < "$1"
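
Another option is to let grep read the patterns straight from the phrase file with -f, which removes the need to build the alternation by hand. The sketch below assumes GNU sed and uses a temporary file as a stand-in for fileB. Adding -F is an assumption about intent: it makes grep treat every phrase as a literal string, so regex metacharacters in the phrases lose their special meaning.

```bash
#!/bin/bash
phrases=$(mktemp)                          # stand-in for fileB
printf '%s\n' 'How are you' 'Go away now' > "$phrases"
doc='This is doc 1.  I am 1 fine.  How are you, 1.? Ok. Hello 1.  Go away now.'

# -f FILE: one pattern per line; -F: fixed strings; -i: ignore case;
# -B 1: also print the line before each match.
out=$(sed 's/\./.\n/g' <<< "$doc" | grep -i -B 1 -F -f "$phrases")

printf '%s\n' "$out"
rm -f "$phrases"
```

As with the hand-built pattern, grep prints a -- line between non-adjacent groups of matches, which the later sed 's/--/\n/g' stage is meant to handle.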

score 1 · Accepted Answer

Is this clear? If not, leave a comment and I will edit it. Bash output redirection examples:

echo "some text" >file.txt;
#here we add on to the end of the file instead of overwriting the file
echo "some additional text" >>file.txt;
#put something in two files and output it
echo "two files and console" | tee file1.txt | tee file2.txt;
#put something in two files and output nothing
echo "just two files" | tee file1.txt >file2.txt;
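
tee also accepts several file names at once, so chaining two tee processes is not strictly needed. A short sketch, using a temporary directory so that no existing files are touched (the file names are illustrative):

```bash
#!/bin/bash
dir=$(mktemp -d)                                   # scratch directory

# Write the same text to two files and to the console.
echo "two files and console" | tee "$dir/file1.txt" "$dir/file2.txt"

# Append (-a) to both files and discard tee's copy on standard output.
echo "appended quietly" | tee -a "$dir/file1.txt" "$dir/file2.txt" > /dev/null

content=$(cat "$dir/file2.txt")
cmp -s "$dir/file1.txt" "$dir/file2.txt" && same=yes || same=no
rm -rf "$dir"
```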
