I saw this as an opportunity to try Parse::RecDescent. I am not very experienced with writing grammars, so there may well be a better way to write this one.
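The parser builds a list of sets of phrases: an optional phrase in square brackets becomes a two-element set (the phrase and the empty string), a parenthesized group of alternatives separated by "|" becomes the set of those alternatives, and any other text becomes a one-element set. I then feed that list of sets to Set::CrossProduct to generate their Cartesian product.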
    #!/usr/bin/env perl
    use strict;
    use warnings;

    use Parse::RecDescent;
    use Set::CrossProduct;

    # Filled in by the grammar actions below: one array ref (a set of
    # alternatives) per phrase found in the input.
    our @list;

    # The grammar sits in a single-quoted string, so each backslash
    # meant for a regex is doubled.
    my $parser = Parse::RecDescent->new(q{
        List: OptionalPhrase |
              AlternatingMandatoryPhrases |
              FixedPhrase

        OptionalPhrase:
            OptionalPhraseStart
            OptionalPhraseContent
            OptionalPhraseEnd

        OptionalPhraseStart: /\\[/

        OptionalPhraseContent: /[^\\]]+/
            {
                push @::list, [ $item[-1], '' ];
            }

        OptionalPhraseEnd: /\\]/

        AlternatingMandatoryPhrases:
            AlternatingMandatoryPhrasesStart
            AlternatingMandatoryPhrasesContent
            AlternatingMandatoryPhraseEnd

        AlternatingMandatoryPhrasesStart: /\\(/

        AlternatingMandatoryPhrasesContent: /[^|)]+(?:[|][^|)]+)*/
            {
                push @::list, [ split /[|]/, $item[-1] ];
            }

        AlternatingMandatoryPhraseEnd: /\\)/

        FixedPhrase: /[^\\[\\]()]+/
            {
                $item[-1] =~ s/\\A\\s+//;
                $item[-1] =~ s/\\s+\\z//;
                push @::list, [ $item[-1] ];
            }
    });

    my $words = "[a] (good|bad) word [for fun]";

    # Passing a reference lets the parser consume the matched text, so
    # each call to List picks up where the previous one stopped.
    1 while defined $parser->List(\$words);

    my $iterator = Set::CrossProduct->new(\@list);

    # Empty strings stand for omitted optional phrases; drop them before joining.
    while (my $next = $iterator->get) {
        print join(' ', grep length, @$next), "\n";
    }
Output:
    a good word for fun
    a good word
    a bad word for fun
    a bad word
    good word for fun
    good word
    bad word for fun
    bad word
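
To make the cross-product step easier to follow on its own, here is a minimal sketch that skips the parser and hard-codes the list of sets I would expect the grammar to build for the sample input. The cross_product helper is my own hypothetical stand-in written in plain Perl; it is not part of Set::CrossProduct and is only meant to illustrate the idea, not to replace the module.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of Set::CrossProduct): builds every
# combination by extending each partial tuple with each element of
# the next set. The first set varies slowest.
sub cross_product {
    my @sets   = @_;
    my @tuples = ( [] );    # start with a single empty tuple
    for my $set (@sets) {
        @tuples = map {
            my $prefix = $_;
            map { [ @$prefix, $_ ] } @$set;
        } @tuples;
    }
    return @tuples;
}

# Assumed intermediate result for "[a] (good|bad) word [for fun]":
# optional phrases carry an empty string as their "omitted" choice.
my @list = (
    [ 'a',       '' ],
    [ 'good',    'bad' ],
    [ 'word' ],
    [ 'for fun', '' ],
);

# Drop the empty strings before joining, as in the script above.
my @phrases = map { join ' ', grep { length } @$_ } cross_product(@list);
print "$_\n" for @phrases;
```

With two optional phrases and one two-way alternative, this yields 2 x 2 x 1 x 2 = 8 phrases, the same count as the output listed above.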