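I needed to do something similar once, except that I wanted it to always "prefer" double quotes, meaning it uses them unless the string contains more double quotes than single quotes (to minimize the number that need escaping).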
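The way I did this was to subclass the built-in str class and override its __repr__() method. You could probably reverse the logic in it quite easily to do the opposite (or force the quote character used to always be one or the other).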
FWIW, here's the code:
# -*- coding: iso-8859-1 -*-
# Special string subclass to override the default
# representation method. Main purpose is to
# prefer using double quotes and avoid hex
# representation on chars with an ord() > 127
class MsgStr(str):
    def __repr__(self):
        # use double quotes unless there are more of them in the string than
        # single quotes
        quotechar = '"' if self.count("'") >= self.count('"') else "'"
        rep = [quotechar]
        for ch in self:
            # control char, or a backslash (which must be doubled so that the
            # result still round-trips through eval())?
            if ord(ch) < ord(' ') or ch == '\\':
                # remove the single quotes around the escaped representation
                rep += repr(str(ch)).strip("'")
            # does embedded quote match quotechar being used?
            elif ch == quotechar:
                rep += "\\"
                rep += ch
            # else just use others as they are
            else:
                rep += ch
        rep += quotechar
        return "".join(rep)

if __name__ == "__main__":
    s1 = '\tWürttemberg'
    s2 = MsgStr(s1)
    print "str s1:", s1
    print "MsgStr s2:", s2
    print "--only the next two should differ--"
    print "repr(s1):", repr(s1), "# uses built-in string 'repr'"
    print "repr(s2):", repr(s2), "# uses custom MsgStr 'repr'"
    print "str(s1):", str(s1)
    print "str(s2):", str(s2)
    print "repr(str(s1)):", repr(str(s1))
    print "repr(str(s2)):", repr(str(s2))
    print "MsgStr(repr(MsgStr('\tWürttemberg'))):", MsgStr(repr(MsgStr('\tWürttemberg')))
    assert eval(MsgStr(repr(MsgStr('\tWürttemberg')))) == MsgStr('\tWürttemberg')
Output:
str s1: Württemberg
MsgStr s2: Württemberg
--only the next two should differ--
repr(s1): '\tW\xfcrttemberg' # uses built-in string 'repr'
repr(s2): "\tWürttemberg" # uses custom MsgStr 'repr'
str(s1): Württemberg
str(s2): Württemberg
repr(str(s1)): '\tW\xfcrttemberg'
repr(str(s2)): '\tW\xfcrttemberg'
MsgStr(repr(MsgStr(' Württemberg'))): "\tWürttemberg"
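As for reversing the logic or forcing one quote character, here is a rough sketch of how that might look. Note that it is written for Python 3, not the Python 2 used above, and that the names PreferQuoteStr, SingleQuoteStr, AlwaysDoubleStr, preferred and force are all made up for illustration. It also leans on str.isprintable() rather than the ord() test, which is my assumption about what "leave it as it is" should mean in Python 3.

```python
# Python 3 sketch; every class and attribute name here is hypothetical.
class PreferQuoteStr(str):
    """str subclass whose repr() prefers a configurable quote character."""

    preferred = '"'   # quote character to use when there is a choice
    force = False     # if True, always use `preferred`, whatever the content

    def __repr__(self):
        other = "'" if self.preferred == '"' else '"'
        # Fall back to the other quote only when that needs fewer escapes.
        if self.force or self.count(other) >= self.count(self.preferred):
            quotechar = self.preferred
        else:
            quotechar = other
        rep = [quotechar]
        for ch in self:
            if ch == quotechar:
                # embedded quote matches the delimiter, so escape it
                rep.append("\\" + ch)
            elif ch == "\\" or not ch.isprintable():
                # borrow the built-in escape (\t, \\, \x7f, ...) minus its quotes
                rep.append(repr(ch)[1:-1])
            else:
                # everything else, including non-ASCII letters, is kept as is
                rep.append(ch)
        rep.append(quotechar)
        return "".join(rep)


class SingleQuoteStr(PreferQuoteStr):
    """The reversed logic: prefer single quotes."""
    preferred = "'"


class AlwaysDoubleStr(PreferQuoteStr):
    """Always use double quotes, escaping embedded ones as needed."""
    force = True


if __name__ == "__main__":
    for cls in (PreferQuoteStr, SingleQuoteStr, AlwaysDoubleStr):
        print(cls.__name__, repr(cls('say "hi"')), repr(cls("it's")))
```

Keeping the choice in class attributes means each variant is a two-line subclass and __repr__() itself stays in one place. Whether that beats simply editing the comparison in MsgStr is a matter of taste.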