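I know that comparing two files is a classic problem and that it has been discussed many times. But while processing text files I ran into a somewhat different problem: I have two text files that may differ in their number of lines. I want to compare the two files, find the lines that differ, and then mark all the differences in both files. For example, here are the contents of my files: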

File1.txt:

This is the first line.
This line is just appeared in File1.txt.
you can see this line in both files.
this line is also appeared in both files.
this line and,
this one are mereged in File2.txt.

File2.txt:

This is the first line.
you can see this line in both files.
this line is also appeared in both files.
this line and, this one are mereged in File2.txt.

After processing, I want the two files to look like this:

File1.txt:

This is the first line.
<Diff>This line is just appeared in File1.txt.</Diff>
you can see this line in both files.
this line is also appeared in both files.
<Diff>this line and,</Diff>
<Diff>this one are mereged in File2.txt.</Diff>

File2.txt:

This is the first line.
<Diff></Diff>
you can see this line in both files.
this line is also appeared in both files.
<Diff>this line and, this one are mereged in File2.txt.</Diff>
<Diff></Diff>

How can I do this? I know that tools such as diff can help, but how can I convert their output into this format?

Thanks in advance.

2 Answers

If you use diff from GNU diffutils, you can try its --old-line-format and --new-line-format options.

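# NewFile1.txt: lines found only in File1.txt are wrapped in <Diff>;
# lines found only in File2.txt become empty <Diff></Diff> placeholders.
diff --old-line-format "<Diff>%l</Diff>%c'\012'" \
     --new-line-format "<Diff></Diff>%c'\012'" \
     File1.txt File2.txt > NewFile1.txt

# NewFile2.txt: the reverse; File2.txt's own lines are wrapped,
# and lines found only in File1.txt become placeholders.
diff --old-line-format "<Diff></Diff>%c'\012'" \
     --new-line-format "<Diff>%l</Diff>%c'\012'" \
     File1.txt File2.txt > NewFile2.txt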

See the man page for details; search for "LTYPE-line-format" and "GTYPE-group-format".
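As a rough sketch of how those line and group formats can be combined, the script below recreates the two sample files from the question and writes two marked copies. It assumes GNU diff; the output names Marked1.txt and Marked2.txt are made up for illustration. An empty --new-line-format drops the lines that exist only in File2.txt, and --changed-group-format='%>%<' prints the File2.txt lines of a changed group before the placeholders for the File1.txt lines.

```shell
#!/bin/sh
# Sketch only: needs GNU diff (the *-line-format and *-group-format options
# are GNU extensions). Run it in an empty scratch directory.
set -eu

# Sample inputs copied from the question.
cat > File1.txt <<'EOF'
This is the first line.
This line is just appeared in File1.txt.
you can see this line in both files.
this line is also appeared in both files.
this line and,
this one are mereged in File2.txt.
EOF

cat > File2.txt <<'EOF'
This is the first line.
you can see this line in both files.
this line is also appeared in both files.
this line and, this one are mereged in File2.txt.
EOF

# Marked1.txt (hypothetical name): wrap lines found only in File1.txt and
# drop lines found only in File2.txt, so no empty placeholders appear.
# diff exits with status 1 when the files differ, which is expected here;
# any other non-zero status still aborts the script.
diff --unchanged-line-format='%L' \
     --old-line-format="<Diff>%l</Diff>%c'\012'" \
     --new-line-format='' \
     File1.txt File2.txt > Marked1.txt || [ $? -eq 1 ]

# Marked2.txt (hypothetical name): wrap lines found only in File2.txt and
# emit one empty placeholder per line found only in File1.txt. In a changed
# group, '%>%<' puts the File2.txt lines first and the placeholders after.
diff --unchanged-line-format='%L' \
     --old-line-format="<Diff></Diff>%c'\012'" \
     --new-line-format="<Diff>%l</Diff>%c'\012'" \
     --changed-group-format='%>%<' \
     File1.txt File2.txt > Marked2.txt || [ $? -eq 1 ]
```

With the sample files, Marked1.txt should match the File1.txt layout asked for in the question. Marked2.txt comes close but ends with two placeholders after the merged line instead of one, because a placeholder is written for every File1.txt line in the group; padding both files to exactly the same length would need extra post-processing.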

answered 2012-06-17T20:16:35.827

You can use Algorithm::Diff. Here is an example that produces output almost like what you want; perhaps you can adapt it to get the exact output you need:

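use strict;
use warnings;
use Algorithm::Diff;

# Read both files into arrays of lines (trailing newlines are kept).
open my $fh1, '<', 'File1.txt' or die "File1.txt: $!";
my @seq1 = <$fh1>;
close $fh1;
open my $fh2, '<', 'File2.txt' or die "File2.txt: $!";
my @seq2 = <$fh2>;
close $fh2;

my $diff = Algorithm::Diff->new( \@seq1, \@seq2 );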

my @out1;
my @out2;

while(  $diff->Next()  ) {
    if ($diff->Same) {
        push @out1, $diff->Items(1);
        push @out2, $diff->Items(2);
    }
    elsif (not $diff->Items(2) ) {
        for ($diff->Items(1)) {
            chomp;
            push @out1, "<Diff>$_</Diff>\n";
        }
        push @out2, "<Diff></Diff>\n";
    }
    elsif (not $diff->Items(1)) {
        for ($diff->Items(2)) {
            chomp;
            push @out2, "<Diff>$_</Diff>\n";
        }
        push @out1, "<Diff></Diff>\n";
    }
    else {
        for ($diff->Items(1)) {
            chomp;
            push @out1, "<Diff>$_</Diff>\n";
        }
        for ($diff->Items(2)) {
            chomp;
            push @out2, "<Diff>$_</Diff>\n";
        }
    }
}

Output:

@out1:
This is the first line.
<Diff>This line is just appeared in File1.txt.</Diff>
you can see this line in both files.
this line is also appeared in both files.
<Diff>this line and,</Diff>
<Diff>this one are mereged in File2.txt.</Diff>


@out2:
This is the first line.
<Diff></Diff>
you can see this line in both files.
this line is also appeared in both files.
<Diff>this line and, this one are mereged in File2.txt.</Diff>
answered 2012-06-17T12:06:42.677