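It's probably not perfect, but this seems to work for your use cases. I feel this is a rather complex process, and with more rules it will start getting into the kind of problem that regular expressions really aren't good at.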
>>> import re
>>> match_dict = {'hello(here)': 'here',
... 'Hello (Hi)': 'Hi',
... "'dfsfds Hello (Hi) fdfd' Hello (Yes)": 'Yes',
... "Hello ('hi)xx')": "hi)xx",
... "Hello ('Hi')": 'Hi'}
>>> for s, goal in match_dict.iteritems():
...     print "INPUT: %s" % s
...     print "GOAL: %s" % goal
...     m = re.sub(r"(?<!\()'[^']+'", '', s, flags=re.I|re.M)
...     paren_quotes = re.findall(r"hello\s*\('([^']+)'\)", m, flags=re.I|re.M)
...     output = paren_quotes if paren_quotes else []
...     m = re.sub(r"hello\s*\('[^']+'\)", '', m, flags=re.I|re.M)
...     paren_matches = re.findall(r"hello\s*\(([^)]+)\)", m, flags=re.I|re.M)
...     if paren_matches:
...         output.extend(paren_matches)
...     print 'OUTPUT: %s\n' % output
...
INPUT: 'dfsfds Hello (Hi) fdfd' Hello (Yes)
GOAL: Yes
OUTPUT: ['Yes']

INPUT: Hello ('Hi')
GOAL: Hi
OUTPUT: ['Hi']

INPUT: hello(here)
GOAL: here
OUTPUT: ['here']

INPUT: Hello (Hi)
GOAL: Hi
OUTPUT: ['Hi']

INPUT: Hello ('hi)xx')
GOAL: hi)xx
OUTPUT: ['hi)xx']
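
The session above uses Python 2 syntax (`print` statements, `dict.iteritems()`). If you are on Python 3, the same three passes could be wrapped in a function roughly like the sketch below. The function name `extract_hello_args` and the constant names are my own, purely for illustration, and I dropped `re.M` because none of the patterns use `^` or `$`, so it has no effect here.

```python
import re

# Pass 1: a single-quoted run that does NOT directly follow "(".
# These are treated as string literals and discarded.
QUOTED = re.compile(r"(?<!\()'[^']+'", re.I)
# Pass 2: hello('...') where the argument is quoted, so it may contain ")".
HELLO_QUOTED = re.compile(r"hello\s*\('([^']+)'\)", re.I)
# Pass 3: hello(...) with an unquoted argument, ending at the first ")".
HELLO_PLAIN = re.compile(r"hello\s*\(([^)]+)\)", re.I)


def extract_hello_args(text):
    """Return the arguments of hello(...) calls found outside quoted strings."""
    stripped = QUOTED.sub('', text)
    results = HELLO_QUOTED.findall(stripped)
    # Remove the quoted calls so pass 3 cannot match them a second time.
    remainder = HELLO_QUOTED.sub('', stripped)
    results.extend(HELLO_PLAIN.findall(remainder))
    return results


cases = {
    'hello(here)': 'here',
    'Hello (Hi)': 'Hi',
    "'dfsfds Hello (Hi) fdfd' Hello (Yes)": 'Yes',
    "Hello ('hi)xx')": 'hi)xx',
    "Hello ('Hi')": 'Hi',
}

for text, goal in cases.items():
    print("INPUT: %s" % text)
    print("GOAL: %s" % goal)
    print("OUTPUT: %s\n" % extract_hello_args(text))
```

Note that quoted matches are collected before unquoted ones, so when an input mixes both forms the results are not necessarily in left-to-right order.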