python - Two speed/efficiency questions: 'for x in list' vs 'for x in xrange(len(list))', and string matching with re vs ==

Question

Two questions about speed, efficiency, and best practice in Python. Which of the following is "better" (faster, less memory-intensive, etc.):

for x in list:
    #do something to x

or

for x in xrange(len(list)):
    #do something to list[x]

and

for string in list_of_strings:
    for string2 in other_string_list:
        if string == string2:
            #do something

or

import re
for string in list_of_strings:
    if re.match('%s'%(string),other_strings): #or re.search(etc)
         #do something

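This isn't urgent; I'm mostly curious. I figure I could get some raw numbers with timeit() or something similar, but I'd appreciate more depth than just "this one is faster than that one on your machine".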

score 2 · Accepted Answer

You can't really compare these.

for x in mylist:
    # do something to x

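is the usual idiom, but nothing you do to the name x (rebinding it, for example) will affect mylist, unless x is mutable and you modify it in place. If your goal is to modify mylist while iterating over it, then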

for x in xrange(len(list)):
    #do something to list[x]

is almost always bad form. A better approach is to use

for i, x in enumerate(mylist):
    # now you can work with x and/or change mylist[i] if you need to
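
To make the difference concrete, here is a minimal sketch. The list contents are made up for illustration, and it is written so that it also runs on Python 3 (where range replaces xrange). It shows that rebinding the loop variable leaves the list untouched, while assigning through the index from enumerate changes it.

```python
mylist = [1, 2, 3]

# Rebinding the loop variable only changes the local name x,
# not the element stored in the list.
for x in mylist:
    x = x * 10
unchanged = list(mylist)  # snapshot taken before the list is modified

# enumerate yields (index, element) pairs, so the list slot itself
# can be reassigned.
for i, x in enumerate(mylist):
    mylist[i] = x * 10

# Mutable elements are the exception: in-place changes made through x
# are visible in the list, because x and the list slot refer to the
# same object.
nested = [[1], [2]]
for x in nested:
    x.append(0)
```

After this runs, unchanged still holds the original values, mylist holds the multiplied values, and each inner list in nested has gained a trailing 0.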

But usually, a list comprehension or generator expression is better still:

newlist = [foo(item) for item in mylist if bar(item)]

It all depends on your use case.
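
As a rough illustration of the two forms, the sketch below fills in foo and bar with hypothetical stand-ins (a squaring function and an even-number filter); they are assumptions for the example, not anything from the question.

```python
def foo(item):
    # Hypothetical transformation: square the item.
    return item * item


def bar(item):
    # Hypothetical filter: keep even numbers only.
    return item % 2 == 0


mylist = [1, 2, 3, 4, 5, 6]

# List comprehension: builds the whole result list in memory at once.
newlist = [foo(item) for item in mylist if bar(item)]

# Generator expression: produces values lazily, one at a time, which
# avoids the intermediate list when the results are consumed only once.
total = sum(foo(item) for item in mylist if bar(item))
```

The list comprehension is the natural choice when you need the results more than once; the generator expression tends to suit a single pass such as the sum here.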

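As for your second question, using a regular expression for a plain string equality comparison is overkill. Nesting two for loops is also poor form: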

for string in one_list:
    if string in other_list:
        # do something

would be a bit better, but I'm fairly sure that could be improved as well if I knew your actual use case.
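
One possible improvement, assuming only exact matches matter and the strings are ordinary hashable str objects, is to build a set from the second list once so that each membership test no longer scans the whole list. The sketch below uses made-up sample data, and also shows why a regex is a poor substitute for ==: an unescaped pattern treats characters such as "." as wildcards, and re.match only anchors at the start of the string.

```python
import re

one_list = ["a.c", "abc", "xyz"]
other_list = ["abc", "xyz", "abcdef"]

# Building a set once makes each membership test O(1) on average,
# instead of scanning other_list for every string in one_list.
other_set = set(other_list)
common = [s for s in one_list if s in other_set]

# Regex pitfalls when used as an equality test:
# "." is a wildcard, so the unescaped pattern "a.c" matches "abc".
wildcard_hit = re.match("a.c", "abc") is not None
# re.match only anchors at the start, so "abc" matches "abcdef".
prefix_hit = re.match("abc", "abcdef") is not None
# An exact comparison via regex needs re.escape plus re.fullmatch,
# which is just a slower way of writing ==.
exact = re.fullmatch(re.escape("a.c"), "abc") is not None
```

Here common keeps the order of one_list, and both wildcard_hit and prefix_hit come out true even though the strings are not equal, which is exactly the kind of false positive == avoids.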
