3

我正在阅读一个文件并建立一个列表a2。我想在前两项之后a2从列表中插入 3 行。b

b = ["This is a line", "another line", "and another one"]
a2 = ['a1', 'a2', 'a3']

i = 0
for x, y in map(None, a2[0:2], a2):
    i = i + 1
    if x == y:
        continue
    else:
        for newLine in b:
            a2.insert(i-1, newLine)
            i = i+1
print a2

以上确实给了我预期的结果,['a1', 'a2', 'This is a line', 'another line', 'and another one', 'a3']但是由于我要从巨大的文本文件中构建列表并在其间插入几行,所以我认为我必须使其性能更加直观!

4

3 回答 3

2

怎么样 -

a2[2:2] = b

演示 -

>>> b = ["This is a line", "another line", "and another one"]
>>> a2 = ['a1', 'a2', 'a3']
>>> a2[2:2] = b
>>> a2
['a1', 'a2', 'This is a line', 'another line', 'and another one', 'a3']

我知道的一些方法的时间信息(包括 OP 发布的方法)-

def func1():
    b = ["This is a line", "another line", "and another one"]
    a2 = ['a1', 'a2', 'a3']
    i = 0
    for x, y in map(None, a2[0:2], a2):
        i = i + 1
        if x == y:
            continue
        else:
            for newLine in b:
                a2.insert(i-1, newLine)
                i = i+1
    return a2


def func2():
    b = ["This is a line", "another line", "and another one"]
    a2 = ['a1', 'a2', 'a3']
    a2 = a2[:2] + b + a2[2:]
    return a2

def func3():
    b = ["This is a line", "another line", "and another one"]
    a2 = ['a1', 'a2', 'a3']
    a2[2:2] = b
    return a2


import timeit

print timeit.timeit(func1,number=500000)
print timeit.timeit(func2,number=500000)
print timeit.timeit(func3,number=500000)

结果 -

1.81288409233
0.621006011963
0.341125011444

a具有 100000 个元素和b具有 1000 个元素的时序结果-

def func1():
    global a2
    global b
    i = 0
    for x, y in map(None, a2[0:2], a2):
        i = i + 1
        if x == y:
            continue
        else:
            for newLine in b:
                a2.insert(i-1, newLine)
                i = i+1
            break
    return a2


def func2():
    global a2
    global b
    a2 = a2[:2] + b + a2[2:]
    return a2

def func3():
    global a2
    global b
    a2[2:2] = b
    return a2

def func4():
    global a2
    global b
    a2.reverse()
    b.reverse()
    for i in b:
        a2.insert(-2, i)
    return a2

import timeit

a2 = ['a1' for _ in range(100000)]
b = ['a2' for i in range(1000)]

print timeit.timeit(func1,number=10,setup = 'from __main__ import a2,b')
print timeit.timeit(func2,number=10,setup = 'from __main__ import a2,b')
print timeit.timeit(func3,number=10,setup = 'from __main__ import a2,b')
print timeit.timeit(func4,number=10,setup = 'from __main__ import a2,b')

结果 -

1.00535297394
0.0210499763489
0.001296043396
0.0044310092926

参考时序测试 - https://ideone.com/k4DANI

于 2015-07-25T07:56:13.940 回答
0

I'm giving this answer in a belief that your list a2 is not a fixed size at the beginning and you've to insert all the values of list b into list a2 after index 1.

Generally list.insert() works in this way, if list l1 size is n (imagine that this n is hege) and if you try to add another huge list l2 of values from the beginning lets say from position 2 l1.insert(2, val) this should move all the other elements of list l1 from 2 to n - 1 to next positions for every insert.

We can avoid this by inserting from the ending by reversing both the l1 and l2.

Lets consider your lists l1 , l2 and we've to insert all the l2 values into l1 from the index 2.

>>> l1 = range(1, 10)
>>> l2 = range(10, 20)
>>> l1
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> l2
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

After inserting the l2 into l1 with the below way.......

>>> i = 2
>>> for j in l2:
...     l1.insert(i, j)
...     i += 1
>>> l1
[1, 2, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 3, 4, 5, 6, 7, 8, 9]

In the above way of inserting for every insert the values 3, 4, 5, 6, 7, 8, 9 in l1 has been moved to next positions after proper list.resize. Assume that what will heppen if your 'l1' size is of ten million this movement of values becomes a overhead.

To avoid this data movement within the list for every insert, you can insert values from the end, in your case you've to reverse the list l1 and l2 and do l1.insert(-2, l2.val)

>>> l1
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> l2
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> l1.reverse()
>>> l2.reverse()
>>> l1
[9, 8, 7, 6, 5, 4, 3, 2, 1]
>>> l2
[19, 18, 17, 16, 15, 14, 13, 12, 11, 10]
>>> for i in l2:
...     l1.insert(-2, i)
... 

After inserting you'll get this ..

>>> l1
[9, 8, 7, 6, 5, 4, 3, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 2, 1]

If you observe the data movement happened in this way of inserting, only values 2, 1 has been moved all the time while inserting the values of l2.

You can simply reverse the l1 to get the desired list of values.

>>> l1.reverse()
>>> l1
[1, 2, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 3, 4, 5, 6, 7, 8, 9]

in this way we can avoid the most happening data movement in list.insert().

Time Stats: https://ideone.com/owzWza

Note: This whole solution works well in your case but in case if you've insert some value at the middle of the list you've to think again for another best solution.

于 2015-07-25T09:21:06.943 回答
0

如果你真的想做你在问题中所说的。最快的解决方案(如果您要插入的数组变大)将改为使用自定义容器类。已经指出反转列表会更快,但是每次插入元素时反转列表(并在之后再次反转它)也很昂贵。像这样的东西:

class ReverseList:
    def __init__(self, *args, **kwds):
        self.revlist = list(*args, **kwds)
        self.revlist.reverse()

    def __getitem__(self, key):
        # if you need slicing you need to improve this:
        return self.revlist[-key] 

    def __setitem__(self, key, val):
        # if you need slicing you need to improve this:
        return self.revlist[-key] = val

    def insert(self, pos, val):
        self.revlist.insert(-pos, val)

    # etc
于 2015-07-27T06:41:19.610 回答