
I have a list of sentences.

I want to treat lines that contain the same words in a different order as duplicates, like these:

  • White shoes women
  • Shoes women white
  • Women white shoes

I want to reduce them to a single line:

  • White shoes women

Can I do this in Notepad++?

Or maybe some other software?


2 Answers


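I don't think you can do this kind of job in Notepad++ (Npp).

Here is a way to do the job using Perl. It keeps the case and word order of the first line of each group of duplicates.
(Thanks to @jwpfox for the input example.)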

use Modern::Perl;

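my %seen;
while (<DATA>) {
    chomp;
    # Key: lowercased words, sorted, so case and word order are ignored.
    my $key = join ' ', sort split ' ', lc $_;
    # Print a line only the first time its key is seen,
    # so duplicates are caught even when they are not adjacent.
    say $_ unless $seen{$key}++;
}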

__DATA__
White shoes women
Shoes women white
Women white shoes
White shoes women
Shoes women white
Women white shoes
Men black boots
Black boots men
Boots men black
girl yellow shirt
yellow girl shirt
pants blue boy

Output:

White shoes women
Men black boots
girl yellow shirt
pants blue boy

A version in PHP:

$s = array(
'White shoes women',
'Shoes women white',
'Women white shoes',
'White shoes women',
'Shoes women white',
'Women white shoes',
'Men black boots',
'Black boots men',
'Boots men black',
'girl yellow shirt',
'yellow girl shirt',
'pants blue boy');

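$seen = array();
foreach ($s as $line) {
    // Key: lowercased words, sorted, so case and word order are ignored.
    $list = explode(' ', strtolower($line));
    sort($list);
    $key = implode(' ', $list);
    // Print a line only the first time its key is seen,
    // so duplicates are caught even when they are not adjacent.
    if (!isset($seen[$key])) echo $line, "\n";
    $seen[$key] = true;
}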

Output:

White shoes women
Men black boots
girl yellow shirt
pants blue boy
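
The approach above rests on one idea: reduce each line to a key that ignores case and word order, then compare keys instead of lines. As a rough illustration in Python, here is a minimal sketch of such a key. The function name `canonical_key` is hypothetical, and returning a tuple (rather than a concatenated string) is an assumption made here so that lines such as "ab c" and "a bc" cannot collide.

```python
def canonical_key(line):
    """Return a key that ignores case and word order (hypothetical helper)."""
    # split() with no argument splits on any run of whitespace
    # and drops empty strings, so extra spaces do not matter.
    return tuple(sorted(line.lower().split()))


lines = ["White shoes women", "Shoes women white", "Men black boots"]
for line in lines:
    print(canonical_key(line), "<-", line)
```

Two lines are duplicates in the sense of the question exactly when their keys are equal.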
answered 2016-12-14T11:32:36.600

Taking the "some other software" option.

Contents of input.txt:

White shoes women
Shoes women white
Women white shoes
Men black boots
Black boots men
Boots men black
girl yellow shirt
yellow girl shirt
pants blue boy

Python 3:

sentences = []

with open('input.txt', mode='r') as infile:
    for line in infile:
        # Normalize: lowercase each word, then sort so word order is ignored.
        words = sorted(word.lower() for word in line.split())

        # Skip blank lines and keep only the first occurrence of each word list.
        if words and words not in sentences:
            sentences.append(words)

with open('output.txt', mode='w') as outfile:
    for sentence in sentences:
        # Write the normalized words separated by single spaces.
        outfile.write(' '.join(sentence) + '\n')

Contents of output.txt:

shoes white women
black boots men
girl shirt yellow
blue boy pants
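
Note that the script above writes the normalized words (lowercased and sorted), not the original wording. If the first occurrence should be kept as written, as in the question's example, a variant along the following lines might work. This is only a sketch: the helper name `unique_sentences` is hypothetical, and it assumes that words are separated by whitespace and that blank lines can be dropped.

```python
def unique_sentences(lines):
    """Yield the first original line for each distinct set of words.

    Hypothetical helper, not part of the answer above.
    """
    seen = set()
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        # Same normalization as above: lowercase, split, sort.
        key = tuple(sorted(text.lower().split()))
        if key not in seen:
            seen.add(key)
            yield text


sample = [
    "White shoes women",
    "Shoes women white",
    "Women white shoes",
    "Men black boots",
    "Black boots men",
    "Boots men black",
    "girl yellow shirt",
    "yellow girl shirt",
    "pants blue boy",
]
for sentence in unique_sentences(sample):
    print(sentence)
```

With the sample list this prints "White shoes women", "Men black boots", "girl yellow shirt" and "pants blue boy", one per line. Because the helper accepts any iterable of strings, an open file object can be passed in place of `sample` to process input.txt directly.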
answered 2016-12-14T10:30:18.447