This regex should handle the cases you are looking for, including matching the longest possible pattern at the start of the first column:

^(\S+)(\S*?)\s+?(\S*?(\1)\S*?)$

Regex demo here.
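To make the groups concrete, here is a minimal sketch that prints what the pattern captures. The sample lines and the helper name `split_groups` are made up for illustration and are not taken from the question.

```python
import re

# Same pattern as above, compiled once.
PATTERN = re.compile(r'^(\S+)(\S*?)\s+?(\S*?(\1)\S*?)$')

def split_groups(line):
    """Return (shared_text, rest_of_first_column, second_column), or None.

    Hypothetical helper, for illustration only.
    """
    m = PATTERN.match(line)
    if m is None:
        return None
    return m.group(1), m.group(2), m.group(3)

# Hypothetical sample lines: two whitespace-separated columns.
print(split_groups("running\trun"))    # ('run', 'ning', 'run')
print(split_groups("play\treplayed"))  # ('play', '', 'replayed')
print(split_groups("abc\txyz"))        # None: nothing shared
```

Group 1 is greedy, so the engine tries the whole first column first and gives back one character at a time until the remaining text also appears in the second column. Because group 1 is anchored at `^`, only text that starts the first column can be matched this way.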

You can then use the match groups to make the specific replacement you are looking for. Here is an example solution in Python:

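import re

# Group 1: longest leading part of the first column that also occurs in the second column.
# Group 2: the rest of the first column. Group 3: the whole second column.
regex = re.compile(r'^(\S+)(\S*?)\s+?(\S*?(\1)\S*?)$')

with open('output.txt', 'w', encoding='utf-8') as fout:
    with open('file.txt', 'r', encoding='utf-8') as fin:
        for line in fin:
            match = regex.match(line)
            # Lines that do not match are skipped.
            if match:
                # Replace every occurrence of the shared text in the second column.
                hint = match.group(3).replace(match.group(1), '{...}')
                # Rebuild the first column from groups 1 and 2.
                output = '{0}\t{1}\n'.format(match.group(1) + match.group(2), hint)
                fout.write(output)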

Python demo here.
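If you want to try the replacement without touching any files, the same logic can be wrapped in a small function that works on a single string. This is only a sketch: the function name `make_hint_line` and the sample inputs are assumptions for illustration.

```python
import re

PATTERN = re.compile(r'^(\S+)(\S*?)\s+?(\S*?(\1)\S*?)$')

def make_hint_line(line):
    """Return 'first_column<TAB>hint', or None if the line does not match.

    Hypothetical helper mirroring the loop body of the file-based example.
    """
    m = PATTERN.match(line)
    if m is None:
        return None
    # Mask the shared text wherever it occurs in the second column.
    hint = m.group(3).replace(m.group(1), '{...}')
    return '{0}\t{1}'.format(m.group(1) + m.group(2), hint)

# Hypothetical inputs; a trailing newline is tolerated because `$`
# also matches just before a final newline.
print(repr(make_hint_line("running\trun")))
print(repr(make_hint_line("play\treplayed\n")))
```

Returning `None` for lines that do not match leaves it to the caller to decide whether to skip them, as the file-based version does, or to write them through unchanged.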

Answered 2015-07-23T12:07:14.527