I have the following URLs:

FILE1.txt

http://www.stackoveflow.com/dog/cat/rabbit/hamster/
192.168.192.168/lion/tiger/elephant/

FILE2.txt

HELLO
GOODBYE

The output I am trying to achieve:

http://www.stackoveflow.com/dogHELLO/cat/rabbit/hamster/
http://www.stackoveflow.com/dog/catHELLO/rabbit/hamster/
http://www.stackoveflow.com/dog/cat/rabbitHELLO/hamster/
http://www.stackoveflow.com/dog/cat/rabbit/hamsterHELLO/
http://www.stackoveflow.com/dog/cat/rabbit/hamster/HELLO

http://www.stackoveflow.com/dogGOODBYE/cat/rabbit/hamster/
http://www.stackoveflow.com/dog/catGOODBYE/rabbit/hamster/
http://www.stackoveflow.com/dog/cat/rabbitGOODBYE/hamster/
http://www.stackoveflow.com/dog/cat/rabbit/hamsterGOODBYE/
http://www.stackoveflow.com/dog/cat/rabbit/hamster/GOODBYE

192.168.192.168/lionHELLO/tiger/elephant/
192.168.192.168/lion/tigerHELLO/elephant/
192.168.192.168/lion/tiger/elephantHELLO/
192.168.192.168/lion/tiger/elephant/HELLO

192.168.192.168/lionGOODBYE/tiger/elephant/
192.168.192.168/lion/tigerGOODBYE/elephant/
192.168.192.168/lion/tiger/elephantGOODBYE/
192.168.192.168/lion/tiger/elephant/GOODBYE

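As you can see, the strings HELLO and GOODBYE are inserted after each slash; if there is already a string after the slash, HELLO and GOODBYE are appended after it (e.g. http://www.stackoveflow.com/dogHELLO/cat/rabbit/hamster/, and so on).

What I have tried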

use strict;
use warnings;

my @f1 = do {
   open my $fh, '<', 'FILE1.txt';
   <$fh>;
};
chomp @f1;

my @f2 = do {
  open my $fh, '<', 'FILE2.txt';
  <$fh>;
};
chomp @f2;

for my $f1 (@f1) {
  my @fields = $f1 =~ m{[^/]+}g;
  for my $f2 (@f2) {
    for my $i (0 .. $#fields) {
      my @new = @fields;
      $new[$i] .= $f2;
      print qq{/$_/\n}, for join '/', @new;
    }
    print "\n\n";
  }
}
#courtesy of Borodin

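However, this code does not work for URLs that contain slashes in the http:// part, because that part gets turned into http:HELLO/ when it should be left alone.

It also does not put HELLO or GOODBYE after a slash that has no string after it, e.g. http://www.stackoveflow.com/dog/cat/rabbit/hamster/ <-- SHOULD PUT HELLO AFTER THIS SLASH AS WELL BUT DOESN'T

It appears that this code removes the slashes and then re-inserts them along with the strings from FILE2.txt, rather than inserting HELLO and GOODBYE in the correct place to begin with.

My question

Is there a better way to achieve the output I need, or is there something I can do to my existing code to solve the problems above?

Your help is much appreciated, many thanks.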

3 Answers

Here is the algorithm in prose:

Open File2.txt. Read in all lines, removing the newline. We call the array @words.

Open File1.txt. We call the file handle $fh.

As long as we can read a $line from $fh:

    Remove the newline, remove starting and ending slashes.
    Split the $line at every slash, call the array @animals.

    Loop through the @words, calling each element $word:

        Loop through the indices of the @animals, calling each index $i:

            Make a @copy of the @animals.
            Append the $word to the $i-th element of @copy.
            Join the @copy with slashes, surround it with slashes, and print with newline.

        Print an empty line.
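
The steps above can be sketched in Perl roughly as follows. This is a minimal sketch, not the answerer's own code: the sub name variants is hypothetical, and the literal lists stand in for the contents of File1.txt and File2.txt. Because it follows the prose literally, it shares its limits: it does not treat http:// specially and it never appends a word after the final slash.

```perl
use strict;
use warnings;

# Build every variant of $line with $word appended to one path segment.
# The sub name is hypothetical; the steps mirror the prose algorithm.
sub variants {
    my ($line, $word) = @_;
    (my $trimmed = $line) =~ s{^/|/$}{}g;    # drop the leading and trailing slash
    my @animals = split m{/}, $trimmed;      # one element per path segment
    my @out;
    for my $i (0 .. $#animals) {
        my @copy = @animals;                 # work on a copy so @animals stays intact
        $copy[$i] .= $word;
        push @out, '/' . join('/', @copy) . '/';
    }
    return @out;
}

my @words = qw(HELLO GOODBYE);               # stands in for the lines of File2.txt
for my $line ('/dog/cat/rabbit/hamster/') {  # stands in for the lines of File1.txt
    for my $word (@words) {
        print "$_\n" for variants($line, $word);
        print "\n";
    }
}
```

Returning the variants as a list instead of printing them inside the sub keeps the string handling separate from the file handling, which makes it easier to check on its own.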
answered 2013-02-26T13:04:32.493

This program will do as you ask.

use strict;
use warnings;
use autodie;

my @f1 = do {
  open my $fh, '<', 'FILE1.txt';
  <$fh>;
};
chomp @f1;

my @f2 = do {
  open my $fh, '<', 'FILE2.txt';
  <$fh>;
};
chomp @f2;

for my $f1 (@f1) {
  my @fields = $f1 =~ m{[^/]+}g;
  for my $f2 (@f2) {
    for my $i (0 .. $#fields) {
      my @new = @fields;
      $new[$i] .= $f2;
      print '/', join('/', @new), "/\n";
    }
    print "\n\n";
  }
}

Output

/dogHELLO/cat/rabbit/hamster/
/dog/catHELLO/rabbit/hamster/
/dog/cat/rabbitHELLO/hamster/
/dog/cat/rabbit/hamsterHELLO/


/dogGOODBYE/cat/rabbit/hamster/
/dog/catGOODBYE/rabbit/hamster/
/dog/cat/rabbitGOODBYE/hamster/
/dog/cat/rabbit/hamsterGOODBYE/


/lionHELLO/tiger/elephant/
/lion/tigerHELLO/elephant/
/lion/tiger/elephantHELLO/


/lionGOODBYE/tiger/elephant/
/lion/tigerGOODBYE/elephant/
/lion/tiger/elephantGOODBYE/
answered 2013-02-26T13:11:49.590

You can do it all with a regular expression, rather than splitting the line on each slash.

Updated version:

#!/usr/bin/perl
use strict;
use warnings;

my @insert_words = qw/HELLO GOODBYE/;

while (<DATA>)
{
    chomp;
    foreach my $word (@insert_words)
    {
        my $repeat = 1;
        while ((my $match=$_) =~ s|(?<!/)(?:/(?!/)[^/]*){$repeat}[^/]*\K|$word|)
        {
            print "$match\n";
            $repeat++;
        }
        print "\n";
    }
}

__DATA__
/dog/cat/rabbit/hamster/
http://www.stackoverflow.com/dog/cat/rabbit/hamster/

The key is the substitution operator: s|(?<!/)(?:/(?!/)[^/]*){$repeat}[^/]*\K|$word|.

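(?<!/) and (?!/) are a negative lookbehind and a negative lookahead, respectively. They make sure that we only match a single /, and thus ignore the // in http://.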
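
To see which slashes the two lookarounds let through, here is a small sketch that lists the offsets of every "single" slash in a string. The sub name and the sample string are illustrative assumptions, not part of the answer's program.

```perl
use strict;
use warnings;

# Return the offsets of every slash that has no slash directly before or after it.
# The sub name is hypothetical.
sub single_slash_offsets {
    my ($str) = @_;
    my @offsets;
    while ($str =~ m{(?<!/)/(?!/)}g) {
        push @offsets, $-[0];    # $-[0] holds the start offset of the last match
    }
    return @offsets;
}

# The two slashes of "//" sit at offsets 5 and 6 and are both rejected.
print join(',', single_slash_offsets('http://a/b/')), "\n";    # prints 8,10
```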

(?:/(?!/)[^/]*){$repeat} is a non-capturing group that must match a specified number of times; we increase that number until it no longer matches.

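I had to use [^/]* rather than [^/]+ to meet your requirement of matching at the end of the string. That is why the lookbehind and lookahead are needed.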

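\K means "match everything up to this point, but don't include it in the match itself." So we don't have to worry about putting the whole matched beginning of the string back in the replacement.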
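
As a small illustration of \K on its own, the sketch below appends a word after the first path segment only. The sub name and the example.com URL are assumptions made for the example, and \K needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Append $word right after the first path segment, leaving the prefix untouched.
# Everything matched before \K is kept, so the replacement is just $word.
sub append_after_first_segment {
    my ($url, $word) = @_;
    (my $copy = $url) =~ s{(?<!/)/(?!/)[^/]*\K}{$word};
    return $copy;
}

print append_after_first_segment('http://example.com/dog/cat/', 'HELLO'), "\n";
# prints http://example.com/dogHELLO/cat/
```

Without \K, the same effect would need a capture group around the prefix and a $1 in the replacement.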

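Note: the r option is another way to perform the substitution without modifying the original string. However, it requires Perl 5.14 or later (thanks amon), so I removed it from the example.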
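
For readers whose Perl does support the r option, a variant might look like the sketch below. It is an assumption-laden rewrite rather than the answer's original code: the sub name insertions is hypothetical, and it detects "no more matches" by comparing the returned string with the input, since s///r returns the unchanged string when the pattern fails.

```perl
use strict;
use warnings;

# Collect every insertion of $word into $line, one per single slash.
# s///r returns the modified copy, so $line itself is never changed.
sub insertions {
    my ($line, $word) = @_;
    my @out;
    my $repeat = 1;
    while (1) {
        my $new = $line =~ s|(?<!/)(?:/(?!/)[^/]*){$repeat}[^/]*\K|$word|r;
        last if $new eq $line;    # unchanged means the pattern no longer matched
        push @out, $new;
        $repeat++;
    }
    return @out;
}

print "$_\n" for insertions('192.168.192.168/lion/tiger/elephant/', 'HELLO');
```

For this input the loop yields four lines, ending with 192.168.192.168/lion/tiger/elephant/HELLO, which matches the output requested in the question.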

answered 2013-02-26T13:10:41.873