I'm trying to remove the single quotes around regular text. For example, given the list:

alist = ["'ABC'", '(-inf-0.5]', '(4800-20800]', "'\\'(4.5-inf)\\''", "'\\'(2.75-3.25]\\''"]

I want to turn "'ABC'" into "ABC" but keep the other quotes, i.e.:

alist = ["ABC", '(-inf-0.5]', '(4800-20800]', "'\\'(4.5-inf)\\''", "'\\'(2.75-3.25]\\''"]

I tried using look-arounds, as follows:

fixRepeatedQuotes = lambda text: re.sub(r'(?<!\\\'?)\'(?!\\)', r'', text)
print [fixRepeatedQuotes(str) for str in alist]

But I get this error message:

sre_constants.error: look-behind requires fixed-width pattern. 

Is there another way to do this? Thanks very much in advance!

2 Answers

This should work:

result = re.sub("""(?s)(?:')([^'"]+)(?:')""", r"\1", subject)

Explanation

"""
(?:         # Match the regular expression below
   '           # Match the character “'” literally (but the ? makes it a non-capturing group)
)
(           # Match the regular expression below and capture its match into backreference number 1
   [^'"]       # Match a single character NOT present in the list “'"” from this character class (aka any character matches except a single and double quote)
      +           # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
(?:         # Match the regular expression below
   '           # Match the character “'” literally (but the ? makes it a non-capturing group)
)
"""
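
When the pattern above is run unanchored over every element of the question's list, it can also match the leading '\' of the escaped entries, since a backslash is neither a single nor a double quote. The following is a minimal sketch of one way to avoid that, assuming Python 3.4+ (for re.fullmatch) and assuming that only strings consisting entirely of one quoted run should be changed. The name strip_outer_quotes and the extra backslash exclusion in the character class are illustrative choices, not part of the answer.

```python
import re

alist = ["'ABC'", '(-inf-0.5]', '(4800-20800]',
         "'\\'(4.5-inf)\\''", "'\\'(2.75-3.25]\\''"]

# Assumption: a string qualifies only if it is exactly '...' with no
# single quote, double quote, or backslash between the outer quotes.
whole_quoted = re.compile(r"'([^'\"\\]+)'")

def strip_outer_quotes(text):
    # fullmatch requires the pattern to cover the entire string,
    # so the escaped entries are left untouched.
    match = whole_quoted.fullmatch(text)
    return match.group(1) if match else text

cleaned = [strip_outer_quotes(s) for s in alist]
print(cleaned)
```

Matching the whole string rather than substituting inside it keeps the rule simple: an element is either unwrapped once or returned unchanged.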
answered 2012-08-09T04:21:55.840

re.sub accepts a function as the replacement. So,

re.sub(r"'([A-Za-z]+)'", lambda match: match.group(1), "'ABC'")

yields

"ABC"
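
To check this against the full list from the question, here is a short sketch that applies a replacement function to each element. The helper name unquote is hypothetical, and the code assumes Python 3. It relies on group(1), which holds only the captured text between the quotes, whereas group() with no argument returns the whole match, quotes included.

```python
import re

alist = ["'ABC'", '(-inf-0.5]', '(4800-20800]',
         "'\\'(4.5-inf)\\''", "'\\'(2.75-3.25]\\''"]

def unquote(match):
    # Return only the letters captured between the single quotes.
    return match.group(1)

# Only runs of ASCII letters wrapped in single quotes are rewritten;
# the escaped entries contain no such run, so they pass through as-is.
cleaned = [re.sub(r"'([A-Za-z]+)'", unquote, s) for s in alist]
print(cleaned)
```

Because the character class admits letters only, quoted text containing digits, spaces, or punctuation would be left quoted; widening the class is a choice that depends on the data.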
answered 2012-08-09T04:25:57.833