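I have some code that I believe returns all the parts of a Python statement that are not inside a string. However, I am not sure it is as rigorous as I would like. Basically, it finds the next string delimiter and stays in the "string" state until the string is closed by the same delimiter. Is there anything wrong with this approach in some odd case I have not thought of? Would it be inconsistent with what Python does in any way?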
# String delimiters in order of precedence (triple quotes before single quotes)
string_delims = ["'''", '"""', "'", '"']

# Get the non-string parts of a statement
def get_non_string(text):
    out = ""
    state = None
    while True:
        if state is None:
            # Not in a string: find the earliest opening delimiter.
            # Ties go to the delimiter listed first, so ''' beats '.
            hits = [(text.find(d), n, d) for n, d in enumerate(string_delims)]
            hits = [h for h in hits if h[0] >= 0]
            if not hits:
                # No delimiter left: the rest is non-string text
                return out + text
            i, _, state = min(hits)
            out += text[:i]
            text = text[i + len(state):]
        else:
            # In a string: skip ahead to the matching closing delimiter
            i = text.find(state)
            if i < 0:
                raise SyntaxError("Symbolic Subsystem: EOL while scanning string literal")
            text = text[i + len(state):]
            state = None
Sample input:
get_non_string("hello'''everyone'''!' :)'''")
Sample output:
hello!
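
One way to check the behaviour against Python itself might be to compare with the standard library's `tokenize` module, which applies Python's own lexical rules. The sketch below is only an illustration: `non_string_tokens` is a hypothetical helper name, it returns a list of token strings rather than the original text, and it assumes the input is complete, tokenizable source code.

```python
import io
import tokenize

def non_string_tokens(source):
    """Return the text of every token that is not part of a string literal."""
    # Token types to drop: string literals plus layout-only tokens.
    skip = {tokenize.STRING, tokenize.NEWLINE, tokenize.NL,
            tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}
    # Python 3.12+ splits f-strings into extra token types; drop those too
    # when the running interpreter defines them.
    for name in ("FSTRING_START", "FSTRING_MIDDLE", "FSTRING_END"):
        if hasattr(tokenize, name):
            skip.add(getattr(tokenize, name))
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type not in skip]

print(non_string_tokens("x = 'a' + \"b\""))    # ['x', '=', '+']
print(non_string_tokens("s = 'it\\'s'"))       # ['s', '=']
print(non_string_tokens("n = 1  # don't"))     # ['n', '=', '1', "# don't"]
```

A scan that only looks for the next matching delimiter has no notion of backslash escapes (`'it\'s'`) or of quotes inside comments (`# don't`), so those are probably the cases where it would disagree with Python.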