perl - A line matched by a regex is written to both output files, but it should be written to only one

Question

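I have a tab-delimited text file containing many lines. I wrote a script that assigns each line to an array and then searches the array with regular expressions to find the lines that meet certain criteria. When a match is found, I write it to Output1. If a line has gone through all of the listed if statements (the regular expressions) and still does not meet the criteria, it is written to Output2.

Matching the criteria and writing to Output1 works 100%, but here is my problem: the matched lines are also written to Output2, along with the unmatched lines. I am probably making a silly mistake, but I really can't see it. If someone could take a look and help me out, I would really appreciate it.

Thanks a lot! :)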

Inputfile sample:
skool   school
losieshuis  pension
prys    prijs
eeu    eeuw
lys lijs
water   water
outoritêr   outoritaire


#!/usr/bin/perl -w
use strict;
use warnings;
use open ':utf8';
use autodie;

open OSWNM, "<SecondWordsNotMatched.txt";
open ONIC, ">Output1NonIdenticalCognates.txt";
open ONC, ">Output2NonCognates.txt";

while (my $line = <OSWNM>)
{
    chomp $line;        
    my @Row = $line;

    for (my $x = 0; $x <= $#Row; $x++)
    {
        my $RowWord = $Row[$x];

        # Match: anything, followed by 'y' or 'lê' or 'ê', followed by anything, followed by
        # a tab, followed by anything, followed by 'ij' or 'leggen' or 'e', followed by anything

        if ($RowWord =~ /(.*)(y|lê|ê)(.*)(\t)(.*)(ij|leggen|e)(.*)/)
        {
            print ONIC "$RowWord\n";
        }


        # Match: anything, followed by 'eeu', optionally followed by 'e' or 's', followed by
        # a tab, followed by anything, followed by 'eeuw', optionally followed by 'en'

        if ($RowWord =~ /(.*)(eeu)(e|s)?(\t)(.*)(eeuw)(en)?/)
        {
            print ONIC "$RowWord\n";
        }

        else
        {
            print ONC "$RowWord\n";
        }
    }
}

score 2 · Accepted Answer

In your loop, you essentially have:

if (A) {
  output to file1
}

if (B) {
  output to file1
} else {
  output to file2
}

So you output to file2 anything that does not satisfy B (regardless of whether A is satisfied), and anything that satisfies both A and B is output to file1 twice.
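
To see this on the sample input, here is a minimal sketch that mirrors the control flow above. The two patterns are copied from the question; the helper `destinations_original` is hypothetical and returns the destinations as a list instead of printing to file handles.

```perl
use strict;
use warnings;
use utf8;   # the patterns contain the literal character ê

# A and B are the two patterns from the question.
my $A = qr/(.*)(y|lê|ê)(.*)(\t)(.*)(ij|leggen|e)(.*)/;
my $B = qr/(.*)(eeu)(e|s)?(\t)(.*)(eeuw)(en)?/;

# Hypothetical helper: same if / if-else structure as the original loop,
# but it collects the destinations rather than writing to files.
sub destinations_original {
    my ($line) = @_;
    my @dest;
    if ($line =~ $A) { push @dest, 'file1'; }
    if ($line =~ $B) { push @dest, 'file1'; }
    else             { push @dest, 'file2'; }
    return @dest;
}

for my $line ("prys\tprijs", "eeu\teeuw", "water\twater") {
    my ($first) = split /\t/, $line;
    print "$first: ", join(', ', destinations_original($line)), "\n";
}
# prys: file1, file2
# eeu: file1
# water: file2
```

The "prys" line satisfies A but not B, so it reaches the else branch of the second if and lands in both files.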

If the double output is not intended, you should change the logic to:

if (A or B) {
  output to file1
} else {
  output to file2
}
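
In Perl, with the question's patterns, the combined condition might look like the sketch below. The helper name `destination_combined` is hypothetical, and it returns the destination name instead of printing so that the behavior is easy to check.

```perl
use strict;
use warnings;
use utf8;   # the patterns contain the literal character ê

# A and B are the two patterns from the question.
my $A = qr/(.*)(y|lê|ê)(.*)(\t)(.*)(ij|leggen|e)(.*)/;
my $B = qr/(.*)(eeu)(e|s)?(\t)(.*)(eeuw)(en)?/;

# Hypothetical helper: exactly one destination per line.
sub destination_combined {
    my ($line) = @_;
    if ($line =~ $A or $line =~ $B) {
        return 'file1';
    }
    else {
        return 'file2';
    }
}

for my $line ("prys\tprijs", "eeu\teeuw", "water\twater") {
    my ($first) = split /\t/, $line;
    print "$first: ", destination_combined($line), "\n";
}
# prys: file1
# eeu: file1
# water: file2
```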

Or:

if (A) {
  output to file1
} elsif (B) {
  output to file1
} else {
  output to file2
}

(The second version lets you handle the A and B cases differently.)
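
As an illustration of handling the two cases differently, the sketch below returns a label alongside the destination. The helper `classify_elsif` and the labels are hypothetical; the patterns are the ones from the question.

```perl
use strict;
use warnings;
use utf8;   # the patterns contain the literal character ê

# A and B are the two patterns from the question.
my $A = qr/(.*)(y|lê|ê)(.*)(\t)(.*)(ij|leggen|e)(.*)/;
my $B = qr/(.*)(eeu)(e|s)?(\t)(.*)(eeuw)(en)?/;

# Hypothetical helper: returns (destination, label). Only the first
# branch whose condition holds is taken, so a line is never counted twice.
sub classify_elsif {
    my ($line) = @_;
    if ($line =~ $A) {
        return ('file1', 'rule A');
    }
    elsif ($line =~ $B) {
        return ('file1', 'rule B');
    }
    else {
        return ('file2', 'no rule');
    }
}

for my $line ("prys\tprijs", "eeu\teeuw", "water\twater") {
    my ($first) = split /\t/, $line;
    my ($dest, $why) = classify_elsif($line);
    print "$first: $dest ($why)\n";
}
# prys: file1 (rule A)
# eeu: file1 (rule B)
# water: file2 (no rule)
```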

If the double output is intended, you could do something like:

my $output_to_file2 = 1;

if (A) {
  output to file1
  $output_to_file2 = 0;
}

if (B) {
  output to file1
  $output_to_file2 = 0;
}

if ($output_to_file2) {
  output to file2
}
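
A runnable sketch of the flag approach, again with the question's patterns. The helper `destinations_with_flag` is hypothetical, and the line "yeeu\teeuw" is a made-up input (not from the sample file) chosen only because it satisfies both A and B.

```perl
use strict;
use warnings;
use utf8;   # the patterns contain the literal character ê

# A and B are the two patterns from the question.
my $A = qr/(.*)(y|lê|ê)(.*)(\t)(.*)(ij|leggen|e)(.*)/;
my $B = qr/(.*)(eeu)(e|s)?(\t)(.*)(eeuw)(en)?/;

# Hypothetical helper: a line may be sent to file1 once per matching
# rule, and goes to file2 only if no rule matched at all.
sub destinations_with_flag {
    my ($line) = @_;
    my @dest;
    my $output_to_file2 = 1;

    if ($line =~ $A) {
        push @dest, 'file1';
        $output_to_file2 = 0;
    }

    if ($line =~ $B) {
        push @dest, 'file1';
        $output_to_file2 = 0;
    }

    push @dest, 'file2' if $output_to_file2;
    return @dest;
}

# "yeeu\teeuw" is an invented line that matches both patterns.
for my $line ("prys\tprijs", "yeeu\teeuw", "water\twater") {
    my ($first) = split /\t/, $line;
    print "$first: ", join(', ', destinations_with_flag($line)), "\n";
}
# prys: file1
# yeeu: file1, file1
# water: file2
```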
