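I'm a Python/programming newbie and have been at it for a few months. I hope this chunk of code isn't too big or excessive for SO, but I couldn't see how to ask the question without the full context. So here goes: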

import re
import itertools

nouns = ['bacon', 'cheese', 'eggs', 'milk', 'fish', 'houses', 'dog']
CC = ['and', 'or']


def replacer_factory():
    def create_permutations(match):
        group1_string = (match.group(1)[:-1])  # strips trailing whitespace
        # creates list of matched.group() with word 'and' or 'or' removed
        # (list() keeps the concatenation working on Python 3, where filter() is lazy)
        nouns2 = list(filter(None, re.split(r',\s*', group1_string))) + [match.group(3)]
        perm_nouns2 = list(itertools.permutations(nouns2))
        CC_match = match.group(2)  # this either matches word 'and' or 'or'
        # create list that holds the permutations created in for loop below
        perm_list = []
        for comb in itertools.permutations(nouns2):
            comb_len = len(comb)
            if comb_len == 2:
                perm_list.append(' '.join((comb[0], CC_match, comb[-1])))
            elif comb_len == 3:
                perm_list.append(', '.join((comb[0], comb[1], CC_match, comb[-1])))
            elif comb_len == 4:
                perm_list.append(', '.join((comb[0], comb[1], comb[2], CC_match, comb[-1])))
        # does the match.group contain word 'and' or 'or'
        if (match.group(2)) == "and":
            joined = '*'.join(perm_list)
            strip_comma = joined.replace("and,", "and")
            completed = '|'+strip_comma+'|'
            return completed
        elif (match.group(2)) == "or":
            joined = '*'.join(perm_list)
            strip_comma = joined.replace("or,", "or")
            completed = '|'+strip_comma+'|'
            return completed
    return create_permutations


def search_and_replace(text):
    # use 'nouns' and 'CC' lists to find a noun list phrase
    # e.g. 'bacon, eggs, and milk' is 1 example of a match
    noun_patt = r'\b(?:' + '|'.join(nouns) + r')\b'
    CC_patt = r'\b(' + '|'.join(CC) + r')\b'
    patt = r'((?:{0},? )+){1} ({0})'.format(noun_patt, CC_patt)
    replacer = replacer_factory()
    return re.sub(patt, replacer, text)


def main():
    with open('test_sentence.txt') as input_f:
        read_f = input_f.read()
    with open('output.txt', 'w') as output_f:
        output_f.write(search_and_replace(read_f))


if __name__ == '__main__':
    main()

Contents of 'test_sentence.txt':
I am 2 list with 'or': eggs or cheese.
I am 2 list with 'and': milk and eggs.
I am 3 list with 'or': cheese, bacon, and eggs.
I am 3 list with 'and': bacon, milk and cheese.
I am 4 list: milk, bacon, eggs, and cheese.
I am 5 list, I don't match.
I am 3 list with non match noun: cheese, bacon and pie.
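
To make the three match groups concrete, here is a small sketch that rebuilds the same pattern and returns the groups for a single line of the test file. `show_groups` is a hypothetical helper used only for illustration; it is not part of my script.

```python
import re

nouns = ['bacon', 'cheese', 'eggs', 'milk', 'fish', 'houses', 'dog']
CC = ['and', 'or']

# Same pattern construction as in search_and_replace()
noun_patt = r'\b(?:' + '|'.join(nouns) + r')\b'
CC_patt = r'\b(' + '|'.join(CC) + r')\b'
patt = r'((?:{0},? )+){1} ({0})'.format(noun_patt, CC_patt)


def show_groups(sentence):
    """Hypothetical helper: return (group 1, group 2, group 3) or None."""
    m = re.search(patt, sentence)
    return m.groups() if m else None


print(show_groups("I am 4 list: milk, bacon, eggs, and cheese."))
# ('milk, bacon, eggs, ', 'and', 'cheese')
print(show_groups("I am 3 list with non match noun: cheese, bacon and pie."))
# None
```

Group 1 holds every noun before the conjunction (with a trailing space), group 2 is the conjunction, and group 3 is the final noun.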
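So, the code all works fine, but I've hit a limitation that I don't know how to get around. The limitation is in the for loop. As it stands, I've only written 'if' and 'elif' statements that go as far as elif comb_len == 4:. I'd actually like this to be unlimited, carrying on to elif comb_len == 5:, elif comb_len == 6:, elif comb_len == 7:, and so on. (Well, in reality I don't need to go beyond elif comb_len == 20, but the point is the same: I want to allow for the possibility.) However, writing that many 'elif' statements is impractical.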
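Any ideas on how to tackle this?
Please note that 'test_sentence.txt' and the 'nouns' list shown here are only samples. My actual 'nouns' list has 1,000 entries, and the text I'll be processing is a much larger document than what is in 'test_sentence.txt'.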
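
For what it's worth, the three branches all seem to follow one pattern: every item except the last is joined with ', ', then the conjunction and the last item are appended. Below is a minimal sketch of that idea, assuming a hypothetical helper `join_with_cc` (my own name, not something in the code above). It is meant to reproduce the strings the existing branches produce after the "and," / "or," cleanup, but for any length.

```python
import itertools


def join_with_cc(items, cc):
    """Hypothetical helper: join items as 'a cc b' or 'a, b, cc c' for any length."""
    items = list(items)
    if len(items) < 2:
        # Nothing to conjoin; return the single item (or an empty string)
        return ' '.join(items)
    if len(items) == 2:
        # Two items take no comma: 'eggs or cheese'
        return ' '.join((items[0], cc, items[1]))
    # Three or more: 'cheese, bacon, and eggs'
    return ', '.join(items[:-1]) + ', ' + cc + ' ' + items[-1]


perm_list = [join_with_cc(comb, 'and')
             for comb in itertools.permutations(['milk', 'eggs'])]
print(perm_list)  # ['milk and eggs', 'eggs and milk']
print(join_with_cc(['milk', 'bacon', 'eggs', 'fish', 'cheese'], 'or'))
# milk, bacon, eggs, fish, or cheese
```

One caveat with this sketch: the number of permutations grows factorially (5 nouns already give 120 orderings), so enumerating every ordering of a list anywhere near 20 items would probably not be feasible, whatever the joining code looks like.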
Cheers, Darren
PS - I struggled to come up with a suitable title!