
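I have two sets of text files. The first set is in the AA folder and the second set is in the BB folder. The file ff.txt from the first set (AA folder) looks like this.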

Name        number     marks
john            1         60
maria           2         54
samuel          3         62
ben             4         63

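If marks > 60, I want to print the second column (number) from this file. The output would be 3,4. Next, I want to read ff.txt in the BB folder and delete the lines that contain the numbers 3,4.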

The file in the BB folder looks like this. The second column is the number.

 marks       1      11.824  24.015  41.220  1.00 13.65
 marks       1      13.058  24.521  40.718  1.00 11.82
 marks       3      12.120  13.472  46.317  1.00 10.62
 marks       4      10.343  24.731  47.771  1.00  8.18

I used the following code. It works for a single file.

gawk 'BEGIN {getline} $3>60{print $2}' AA/ff.txt | while read number; do gawk -v number=$number '$2 != number' BB/ff.txt > /tmp/ff.txt; mv /tmp/ff.txt BB/ff.txt; done

But when I run this code with multiple files, I get an error.

gawk 'BEGIN {getline} $3>60{print $2}' AA/*.txt | while read number; do gawk -v number=$number '$2 != number' BB/*.txt > /tmp/*.txt; mv /tmp/*.txt BB/*.txt; done

error:-
mv: target `BB/kk.txt' is not a directory

I asked this question two days ago. Please help me solve this error.


3 Answers


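Both the redirection > /tmp/*.txt and the command mv /tmp/*.txt BB/*.txt are wrong. The shell expands each glob into a list of file names before the command runs, so with several files in BB, mv receives more than two arguments and then requires the last one to be a directory. That is what the error message says.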
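
One way to see what goes wrong with mv is to print the arguments the shell would pass to it after expanding the globs. The following is only a sketch: the scratch directory, the file names in it, and the show_args helper are assumptions made for illustration and are not part of the question.

```shell
#!/bin/sh
# Illustrative scratch tree: two files in BB, one in a local tmp directory.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/BB" "$demo_dir/tmp"
touch "$demo_dir/BB/ff.txt" "$demo_dir/BB/kk.txt" "$demo_dir/tmp/ff.txt"

# show_args (hypothetical helper) prints each argument on its own line.
show_args() { printf '%s\n' "$@"; }

cd "$demo_dir" || exit 1
# The shell expands both globs before the command runs, so a command written
# as "mv tmp/*.txt BB/*.txt" would receive these three file names.
expanded=$(show_args tmp/*.txt BB/*.txt)
printf '%s\n' "$expanded"
```

With three arguments, mv treats the last one (BB/kk.txt here) as the target directory, which matches the reported error.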


For a single file:

awk 'NR>1 && $3>60{print $2}' AA/ff.txt > idx.txt

awk 'NR==FNR{a[$0]; next}; !($2 in a)' idx.txt BB/ff.txt

For multiple files:

awk 'FNR>1 && $3>60{print $2}' AA/*.txt >idx.txt

cat BB/*.txt | awk 'NR==FNR{a[$0]; next}; !($2 in a)' idx.txt -
answered 2012-05-13T12:13:07.213

This builds one index from all files in the AA folder and checks all files in the BB folder against it:

cat AA/*.txt | awk 'FNR==NR { if ($3 > 60) array[$2]; next } !($2 in array)' - BB/*.txt

This compares the files pairwise, assuming they have the same names in the AA and BB folders:

ls AA/*.txt | sed "s%AA/\(.*\)%awk 'FNR==NR { if (\$3 > 60) array[\$2]; next } !(\$2 in array)' & BB/\1 %" | sh

HTH

EDIT

This should help :-)

ls AA/*.txt | sed "s%AA/\(.*\)%awk 'FNR==NR { if (\$3 > 60) array[\$2]; next } !(\$2 in array)' & BB/\1 > \1_tmp \&\& mv \1_tmp BB/\1 %" | sh
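
The same pairwise idea can also be written as a plain shell loop, which avoids generating commands with sed and parsing the output of ls. This is a sketch rather than the code from this answer: the function name filter_pairs and the use of mktemp are assumptions, and it additionally skips the header line of each AA file with FNR > 1.

```shell
#!/bin/sh
# filter_pairs AA_DIR BB_DIR (hypothetical helper name)
# For every AA_DIR/x.txt with a same-named BB_DIR/x.txt, remove from the BB
# file each line whose 2nd column is a "number" with marks > 60 in the AA file.
filter_pairs() {
    aa_dir=$1
    bb_dir=$2
    for aa_file in "$aa_dir"/*.txt; do
        # Skip unmatched globs and empty files; with an empty first file the
        # FNR == NR test would wrongly treat the BB file as the index.
        [ -s "$aa_file" ] || continue
        name=${aa_file##*/}                 # strip the directory part
        bb_file=$bb_dir/$name
        [ -f "$bb_file" ] || continue       # no counterpart in BB
        tmp_file=$(mktemp) || return 1
        # First file: collect numbers (header skipped). Second file: filter.
        if awk 'FNR == NR { if (FNR > 1 && $3 > 60) drop[$2]; next }
                !($2 in drop)' "$aa_file" "$bb_file" > "$tmp_file"
        then
            # Copy back instead of mv so the BB file keeps its permissions.
            cat "$tmp_file" > "$bb_file"
        fi
        rm -f "$tmp_file"
    done
}
```

It would be called as filter_pairs AA BB. A BB file is only overwritten when awk exits successfully, so a failed run leaves it unchanged.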

answered 2012-05-13T12:31:22.810

A perl solution:

use warnings;
use strict;
use File::Spec;

## Hash to save data to delete from files of BB folder.
## key -> file name.
## value -> string with numbers of second column. They will be
## joined separated with '-...-', like: -2--3--1-. And it will be easier to
## search for them using a regexp.
my %delete;

## Check arguments:
## 1.- They are two.
## 2.- Both are directories.
## 3.- Both have same number of regular files and with identical names.
die qq[Usage: perl $0 <dir_AA> <dir_BB>\n] if
        @ARGV != 2 ||
        grep { ! -d } @ARGV;

{
        my %h;
        for ( glob join q[ ], map { qq[$_/*] } @ARGV ) {
                next unless -f;
                my $file = ( File::Spec->splitpath( $_ ) )[2] or next;
                $h{ $file }++;
        }

        for ( values %h ) {
                if ( $_ != 2 ) {
                        die qq[Different files in both directories\n];
                }
        }
}

## Get files from dir 'AA'. Process them, print to output the lines that
## match the condition and save the information in the %delete hash.
for my $file ( glob( shift . qq[/*] ) ) {
        open my $fh, q[<], $file or do { warn qq[Couldn't open file $file\n]; next };
        $file = ( File::Spec->splitpath( $file ) )[2] or do {
                warn qq[Couldn't get file name from path\n]; next };
        ## Make sure every file has an entry, even when no line has marks > 60.
        ## Otherwise the sanity check below would skip the matching file in 'BB'
        ## and the in-place edit would leave it empty.
        $delete{ $file } = q[] unless exists $delete{ $file };
        while ( <$fh> ) {
                next if $. == 1;
                chomp;
                my @f = split;
                next unless @f >= 3;
                if ( $f[ $#f ] > 60 ) {
                        $delete{ $file } .= qq/-$f[1]-/;
                        printf qq[%s\n], $_;
                }
        }
}

## Process files found in dir 'BB'. For each line, print it if not found in
## file from dir 'AA'.
{
        @ARGV  = glob( shift . qq[/*] );
        $^I = q[.bak];
        while ( <> ) {

                ## Sanity check. Shouldn't occur.
                my $filename = ( File::Spec->splitpath( $ARGV ) )[2];
                if ( ! exists $delete{ $filename } ) {
                        close ARGV;
                        next;
                }

                chomp;
                my @f = split;
                if ( $delete{ $filename } =~ m/-$f[1]-/ ) {
                        next;
                }

                printf qq[%s\n], $_;
        }
}

exit 0;

A test

Assume the following file tree. Command:

ls -R1

Output:

.:
AA
BB
script.pl

./AA:
ff.txt
gg.txt

./BB:
ff.txt
gg.txt

And the following file contents. Command:

head AA/*

Output:

==> AA/ff.txt <==
Name        number     marks
john            1         60
maria           2         54
samuel          3         62
ben             4         63
==> AA/gg.txt <==
Name        number     marks
john            1         70
maria           2         54
samuel          3         42
ben             4         33

Command:

head BB/*

Output:

==> BB/ff.txt <==
 marks       1      11.824  24.015  41.220  1.00 13.65
 marks       1      13.058  24.521  40.718  1.00 11.82
 marks       3      12.120  13.472  46.317  1.00 10.62
 marks       4      10.343  24.731  47.771  1.00  8.18
==> BB/gg.txt <==
 marks       1      11.824  24.015  41.220  1.00 13.65
 marks       2      13.058  24.521  40.718  1.00 11.82
 marks       3      12.120  13.472  46.317  1.00 10.62
 marks       4      10.343  24.731  47.771  1.00  8.18

Run the script like this:

perl script.pl AA/ BB

It prints the following to the screen:

samuel          3         62
ben             4         63
john            1         70

And the files in the BB directory are modified as follows:

head BB/*

Output:

==> BB/ff.txt <==
 marks       1      11.824  24.015  41.220  1.00 13.65
 marks       1      13.058  24.521  40.718  1.00 11.82

==> BB/gg.txt <==
 marks       2      13.058  24.521  40.718  1.00 11.82
 marks       3      12.120  13.472  46.317  1.00 10.62
 marks       4      10.343  24.731  47.771  1.00  8.18

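So the lines with numbers 3 and 4 have been deleted from ff.txt, as well as the line with number 1 in gg.txt, because all of them have a value greater than 60 in the last column of the corresponding AA file. I think this is what you wanted to achieve. I hope it helps, although it is not awk.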

answered 2012-05-13T14:41:04.457