python - Regex Python: Get Everything In Between, If the Word Is Not Wrapped in Single Quotes

Question

Possible duplicate:
Replace only unquoted words in Python

This is what I have right now:

found = re.match( r"hello[(](.*)[)]", word, re.M|re.I)

It will find:

Hello(here)  and give you "here"

I would like it to handle the following as well:

Hello  (Hi)

Return the value even if there are spaces on either side (only spaces, no other characters), so this would return "Hi".

'dfsfds Hello (Hi) fdfd' Hello (Yes)

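This would return "Yes", because the first part is wrapped in single quotes, so we do not use it (the whitespace rule still applies here, if possible).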

Edit:

 Hello  ('Hi')  would return 'Hi'

score 1 · Accepted Answer

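It may not be perfect, but this seems to cover your use cases. I feel this is a fairly convoluted process, and with more rules it will start to become the kind of problem that regular expressions are really not good at.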

>>> import re
>>> match_dict = {'hello(here)': 'here',
...                 'Hello   (Hi)': 'Hi',
...                 "'dfsfds Hello (Hi) fdfd' Hello (Yes)": 'Yes',
...                 "Hello ('hi)xx')": "hi)xx",
...                 "Hello  ('Hi')": 'Hi'}
>>> for s, goal in match_dict.items():
...     print("INPUT: %s" % s)
...     print("GOAL: %s" % goal)
...     # Drop quoted spans, except a quote that directly follows "(".
...     m = re.sub(r"(?<!\()'[^']+'", '', s, flags=re.I|re.M)
...     # Collect quoted arguments such as Hello ('Hi').
...     paren_quotes = re.findall(r"hello\s*\('([^']+)'\)", m, flags=re.I|re.M)
...     output = paren_quotes if paren_quotes else []
...     # Remove those quoted calls, then collect the unquoted arguments.
...     m = re.sub(r"hello\s*\('[^']+'\)", '', m, flags=re.I|re.M)
...     paren_matches = re.findall(r"hello\s*\(([^)]+)\)", m, flags=re.I|re.M)
...     if paren_matches:
...         output.extend(paren_matches)
...     print('OUTPUT: %s\n' % output)
... 
INPUT: hello(here)
GOAL: here
OUTPUT: ['here']

INPUT: Hello   (Hi)
GOAL: Hi
OUTPUT: ['Hi']

INPUT: 'dfsfds Hello (Hi) fdfd' Hello (Yes)
GOAL: Yes
OUTPUT: ['Yes']

INPUT: Hello ('hi)xx')
GOAL: hi)xx
OUTPUT: ['hi)xx']

INPUT: Hello  ('Hi')
GOAL: Hi
OUTPUT: ['Hi']
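
For comparison, the same five inputs can probably be handled in one pass. This is a different technique from the multi-step substitution above: one pattern matches the things to skip (top-level quoted spans) without capturing them and captures only the wanted arguments. It is a minimal sketch for Python 3. The names PATTERN and hello_args are made up for illustration, and it assumes quotes are never nested or escaped.

```python
import re

# Hypothetical names; this is an illustrative sketch, not the answer's code.
# Alternative 1 consumes a top-level single-quoted span and captures nothing.
# Alternative 2 captures a quoted argument:  Hello ('...')
# Alternative 3 captures a bare argument:    Hello (...)
PATTERN = re.compile(
    r"'[^']*'"
    r"|hello\s*\(\s*'([^']*)'\s*\)"
    r"|hello\s*\(\s*([^)']*?)\s*\)",
    re.IGNORECASE,
)

def hello_args(text):
    """Return every Hello(...) argument that is not inside a quoted span."""
    results = []
    for m in PATTERN.finditer(text):
        # Exactly one of the two groups is set for a Hello(...) match;
        # both are None when only a quoted span was consumed.
        value = m.group(1) if m.group(1) is not None else m.group(2)
        if value is not None:
            results.append(value)
    return results

print(hello_args("'dfsfds Hello (Hi) fdfd' Hello (Yes)"))  # ['Yes']
print(hello_args("Hello ('hi)xx')"))                       # ['hi)xx']
```

Because the quoted span is consumed by the first alternative, the scanner never gets a chance to start a Hello match inside it.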

score 0 · Accepted Answer

Just remove everything inside single quotes first:

>>> import re
>>> s = "'dfsfds Hello (Hi) fdfd' Hello (Yes)"
>>> s2 = re.sub(r"'[^']+'", '', s)
>>> re.search(r'hello\s*\(([^)]+)\)', s2, re.I|re.M).group(1)
'Yes'
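
One caveat with this two-step approach: re.search returns None when nothing matches, so calling .group(1) directly raises AttributeError. That happens for the question's edit case, Hello  ('Hi'), because stripping the quoted text leaves Hello  () with nothing between the parentheses. A minimal sketch with a guard follows; the function name first_hello_arg is hypothetical.

```python
import re

def first_hello_arg(text):
    """Hypothetical helper: first Hello(...) argument outside quotes, or None."""
    # Same two steps as the snippet above: drop quoted spans, then search.
    stripped = re.sub(r"'[^']+'", '', text)
    m = re.search(r'hello\s*\(([^)]+)\)', stripped, re.I)
    # Guard against "no match" instead of calling .group(1) unconditionally.
    return m.group(1) if m else None

print(first_hello_arg("'dfsfds Hello (Hi) fdfd' Hello (Yes)"))  # Yes
print(first_hello_arg("Hello  ('Hi')"))                         # None
```

If the quoted-argument case matters, the negative lookbehind `(?<!\()` used in the other answer is one way to keep a quote that directly follows an opening parenthesis.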
