I want a function like the following:
def get_pattern_and_replacement(the_input, output):
    r"""
    Given the_input and output, return a pattern that matches a more general case of the_input and a template string for generating the desired output.
    >>> get_pattern_and_replacement("You're not being nice to me.", "I want to be treated nicely.")
    ("You're not being (?P<word>\w+) to me.", "I want to be treated {{ word }}ly.")
    >>> get_pattern_and_replacement("You're not meeting my needs.", "I want my needs met.")
    ("You're not meeting my (?P<word>\w+).", "I want my {{ word }} met.")
    """
The goal is a program that turns undesired text into the desired text.
With help from Stack Overflow users, my function now looks like this:
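To make the intended use concrete, here is a minimal sketch of how a returned pattern and template might be applied to new text. The helper name `apply_rule` is hypothetical, and it assumes the template marks the captured group with the literal placeholder `{{ word }}`, as in the docstring examples.

```python
import re


def apply_rule(pattern, template, text):
    """Rewrite text with the template, or return None if the pattern does not match.

    Hypothetical helper: assumes the pattern has a named group 'word' and
    the template contains the literal placeholder '{{ word }}'.
    """
    match = re.fullmatch(pattern, text)
    if match is None:
        return None
    return template.replace("{{ word }}", match.group("word"))


pattern = r"You're not being (?P<word>\w+) to me."
template = "I want to be treated {{ word }}ly."
print(apply_rule(pattern, template, "You're not being kind to me."))
# prints: I want to be treated kindly.
```

Note that the unescaped `.` at the end of the pattern matches any character, not only a full stop; `re.escape` could be applied to the literal parts if that matters.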
def flatten(nested_list):
    return [item for sublist in nested_list for item in sublist]

def get_pattern_and_replacement(the_input, output):
    r"""
    Given the_input and output, return a pattern that matches a more general case of the_input and a template string for generating the desired output.
    >>> get_pattern_and_replacement("You're not being nice to me.", "I want to be treated nicely.")
    ("You're not being (?P<word>\w+) to me.", "I want to be treated {{ word }}ly.")
    >>> get_pattern_and_replacement("You're not meeting my needs.", "I want my needs met.")
    ("You're not meeting my (?P<word>\w+).", "I want my {{ word }} met.")
    """
    # Collect the space-free substrings of length 3 to 11 from each string.
    input_set = set(flatten([[the_input[i: i + j] for i in range(len(the_input) - j) if ' ' not in the_input[i: i + j]] for j in range(3, 12)]))
    output_set = set(flatten([[output[i: i + j] for i in range(len(output) - j) if ' ' not in output[i: i + j]] for j in range(3, 12)]))
    # Shared substrings, longest first.
    intersection = sorted(input_set & output_set, key=len, reverse=True)
    print(intersection)
    # Generalize the longest shared substring into a named group and a placeholder.
    pattern = the_input.replace(intersection[0], r'(?P<word>\w+)')
    replacement = output.replace(intersection[0], '{{ word }}')
    return (pattern, replacement)
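The search for the longest shared fragment could also be sketched with `difflib` from the standard library instead of enumerating substrings. This is a swapped-in technique, not the code above: it compares the two strings word by word (so fragments containing punctuation such as the apostrophe in "You're" are never considered), and the function name `longest_shared_fragment` and the `min_length` parameter are assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher


def longest_shared_fragment(the_input, output, min_length=3):
    """Return the longest substring shared by a word of each string, or ''.

    Hypothetical helper: words are runs of \\w characters, so the result
    never contains spaces or punctuation.
    """
    best = ""
    for a in re.findall(r"\w+", the_input):
        for b in re.findall(r"\w+", output):
            matcher = SequenceMatcher(None, a, b, autojunk=False)
            m = matcher.find_longest_match(0, len(a), 0, len(b))
            # Keep only fragments that are long enough and beat the current best.
            if m.size >= min_length and m.size > len(best):
                best = a[m.a:m.a + m.size]
    return best


print(longest_shared_fragment("You're not being nice to me.", "I want to be treated nicely."))
# prints: nice
```

Returning an empty string when nothing is shared lets the caller decide what to do, rather than failing on an empty list lookup.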