I know this question has already been answered, but only timeit can say anything about how efficient the solutions are. Using these parameters:
import random
import re
import timeit

size = 30
s = [str(random.randint(0, 9)) for i in range(size)] + (size // 3) * ['-']
random.shuffle(s)
s = ''.join(['+'] + s)
timec = 1000  # number of timeit repetitions
That is, the "phone number" has 30 digits, 1 plus sign and 10 '-' characters. I tested these approaches:
def justdigits(s):
    justdigitsres = ""
    for char in s:
        if char.isdigit():
            justdigitsres += str(char)
    return justdigitsres
re_compiled = re.compile(r'\D')
print('Filter: %ss' % timeit.Timer(lambda : ''.join(filter(str.isdigit, s))).timeit(timec))
print('GE: %ss' % timeit.Timer(lambda : ''.join(n for n in s if n.isdigit())).timeit(timec))
print('LC: %ss' % timeit.Timer(lambda : ''.join([n for n in s if n.isdigit()])).timeit(timec))
print('For loop: %ss' % timeit.Timer(lambda : justdigits(s)).timeit(timec))
print('RE: %ss' % timeit.Timer(lambda : re.sub(r'\D', '', s)).timeit(timec))
print('REC: %ss' % timeit.Timer(lambda : re_compiled.sub('', s)).timeit(timec))
# Python 2 only: str.translate(None, deletechars) deletes the listed characters
print('Translate: %ss' % timeit.Timer(lambda : s.translate(None, '+-')).timeit(timec))
and got the following results:
Filter: 0.0145790576935s
GE: 0.0185861587524s
LC: 0.0151798725128s
For loop: 0.0242128372192s
RE: 0.0120108127594s
REC: 0.00868797302246s
Translate: 0.00118899345398s
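Evidently GE and LC are still slower than the regular expression, compiled or not. It also looks like my CPython 2.6.6 does not optimize string concatenation much. translate appears to be the fastest, which is expected: the problem is stated as "ignore these two symbols" rather than "get these digits", and I believe translate is implemented at a fairly low level.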
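A note for anyone rerunning this on Python 3: the two-argument call s.translate(None, '+-') is the Python 2 form. On Python 3, str.translate takes a single table, and the characters to delete go in the third argument of str.maketrans. Here is a minimal sketch of the equivalent; the names DROP_TABLE and strip_symbols are just illustrative, and I have not timed this variant.

```python
# Python 3 sketch: build the deletion table once, then reuse it.
# The third argument of str.maketrans lists characters to delete.
DROP_TABLE = str.maketrans('', '', '+-')  # illustrative name


def strip_symbols(s):
    """Delete every '+' and '-' from s; all other characters are kept."""
    return s.translate(DROP_TABLE)


sample = '+1-800-555-0199'
print(strip_symbols(sample))
```

As with the Python 2 call, this only removes the two listed symbols. It does not filter out other non-digit characters, so it is not a drop-in replacement for the isdigit or \D based versions on arbitrary input.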
For size = 100:
Filter: 0.0357120037079s
GE: 0.0465779304504s
LC: 0.0428011417389s
For loop: 0.0733139514923s
RE: 0.0213229656219s
REC: 0.0103371143341s
Translate: 0.000978946685791s
For size = 1000:
Filter: 0.212141036987s
GE: 0.198996067047s
LC: 0.196880102158s
For loop: 0.365696907043s
RE: 0.0880808830261s
REC: 0.086804151535s
Translate: 0.00587010383606s
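To put the gap in perspective, here is a small sketch that computes how many times faster translate was than the compiled regex (REC) at each size. The numbers are copied from the results above; the timings dictionary and the speedup helper are names I made up for this illustration.

```python
# Total seconds for 1000 calls, copied from the results above.
timings = {
    30:   {'REC': 0.00868797302246, 'Translate': 0.00118899345398},
    100:  {'REC': 0.0103371143341,  'Translate': 0.000978946685791},
    1000: {'REC': 0.086804151535,   'Translate': 0.00587010383606},
}


def speedup(size):
    """How many times faster Translate was than REC at this size."""
    row = timings[size]
    return row['REC'] / row['Translate']


for size in sorted(timings):
    print('size = %d: translate is about %.1fx faster than REC'
          % (size, speedup(size)))
```

On these measurements the ratio comes out at roughly 7x, 11x and 15x for sizes 30, 100 and 1000, so the advantage of translate grows with the length of the string. These are single runs on one machine, so treat the exact ratios as indicative only.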