python - Inserting values into a Python list

Question

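I'm working on a script that parses a text file, trying to normalize it so it can be inserted into a database. The data represents articles written by one or more authors. The problem I'm having is that, because there is no fixed number of authors, I get a variable number of columns in my output text file. For example: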

author1, author2, author3, this is the title of the article
author1, author2, this is the title of the article
author1, author2, author3, author4, this is the title of the article

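These results give me a maximum of 5 columns. So for the first 2 articles I need to add blank columns so that every row of the output has the same number of columns. What is the best way to do this? My input text is tab-delimited, and I can iterate through the fields easily by splitting on the tab.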

score 2 · Accepted Answer

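Assuming you already have the maximum number of columns and have already split each line into a list (I'll assume you keep those in a list of their own), you should be able to just use list.insert(-1, item) to add the empty columns: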

def columnize(mylists, maxcolumns):
    for i in mylists:
        while len(i) < maxcolumns:
            i.insert(-1,None)

mylists = [["author1","author2","author3","this is the title of the article"],
           ["author1","author2","this is the title of the article"],
           ["author1","author2","author3","author4","this is the title of the article"]]

columnize(mylists,5)
print(mylists)

[['author1', 'author2', 'author3', None, 'this is the title of the article'], ['author1', 'author2', None, None, 'this is the title of the article'], ['author1', 'author2', 'author3', 'author4', 'this is the title of the article']]
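
This works because list.insert(i, x) places x before the element currently at index i, so an index of -1 puts the new item just ahead of the last element (the title) rather than at the very end. A minimal illustration, using a made-up row:

```python
row = ["author1", "author2", "title"]

# insert(-1, x) inserts before the current last element, not after it
row.insert(-1, None)
print(row)  # ['author1', 'author2', None, 'title']

# append() is what adds to the very end
row.append("extra")
print(row)  # ['author1', 'author2', None, 'title', 'extra']
```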

An alternative version using a list comprehension, which does not modify the original lists:

def columnize(mylists, maxcolumns):
    return [j[:-1]+([None]*(maxcolumns-len(j)))+j[-1:] for j in mylists]

print(columnize(mylists, 5))

[['author1', 'author2', 'author3', None, 'this is the title of the article'], ['author1', 'author2', None, None, 'this is the title of the article'], ['author1', 'author2', 'author3', 'author4', 'this is the title of the article']]
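
Both versions assume the maximum column count is already known. If it is not, it can probably be derived from the rows themselves. The sketch below is one way to do that for tab-delimited input; the helper names parse_rows and pad_rows and the sample text are made up for illustration and are not part of the question.

```python
def parse_rows(text):
    """Split tab-delimited text into one list of fields per non-empty line."""
    return [line.split("\t") for line in text.splitlines() if line.strip()]


def pad_rows(rows):
    """Return new rows padded with None just before the last field (the title)."""
    maxcolumns = max(len(row) for row in rows)  # widest row sets the column count
    return [row[:-1] + [None] * (maxcolumns - len(row)) + row[-1:] for row in rows]


# Hypothetical sample input: authors first, title last, tab-separated
sample = "a1\ta2\ta3\ttitle one\na1\ttitle two\n"
padded = pad_rows(parse_rows(sample))
print(padded)
```

Note that max() raises ValueError on an empty sequence, so an empty input would need its own check.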

score 1 · Accepted Answer

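Forgive me if I've misunderstood, but it sounds like you're approaching this the hard way. It's very easy to convert your text file into a dictionary that maps each title to its list of authors: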

>>> lines = ["auth1, auth2, auth3, article1", "auth1, auth2, article2", "auth1, article3"]
>>> d = dict((x[-1], x[:-1]) for x in [line.split(', ') for line in lines])
>>> d
{'article1': ['auth1', 'auth2', 'auth3'], 'article2': ['auth1', 'auth2'], 'article3': ['auth1']}
>>> total_articles = len(d)
>>> total_articles
3
>>> max_authors = max(len(val) for val in d.values())
>>> max_authors
3
>>> for k, v in d.items():
...     print(k)
...     print(v + [None] * (max_authors - len(v)))
... 
article1
['auth1', 'auth2', 'auth3']
article2
['auth1', 'auth2', None]
article3
['auth1', None, None]

Then, if you really want to, you can output this data using Python's built-in csv module. Or you can output the SQL you need directly.
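
As a rough sketch of the csv route, assuming a tab delimiter and a column order of authors first, then title (both are assumptions, as is the articles dictionary, which just mirrors the example data above):

```python
import csv
import io

articles = {
    "article1": ["auth1", "auth2", "auth3"],
    "article2": ["auth1", "auth2"],
    "article3": ["auth1"],
}
max_authors = max(len(authors) for authors in articles.values())

buffer = io.StringIO()  # stands in for a real file opened with newline=""
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
for title, authors in articles.items():
    padding = [None] * (max_authors - len(authors))
    # csv.writer writes None as an empty field
    writer.writerow(authors + padding + [title])

output = buffer.getvalue()
print(output)
```

Every row comes out with the same number of fields, with empty fields where an article has fewer authors.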

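You are opening the same file several times and reading it several times just to get counts that can be derived from data already in memory. Please don't read the file multiple times for that purpose.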
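
A minimal sketch of the single-pass idea, assuming tab-delimited lines with the title in the last field; the io.StringIO object is only a stand-in for the real file, and the sample contents are made up:

```python
import io

# Stand-in for the real input; in practice this would be an opened file object
source = io.StringIO("a1\ta2\ttitle one\na1\ttitle two\n")

# Single pass: read everything into memory once
rows = [line.rstrip("\n").split("\t") for line in source if line.strip()]

# Every count below comes from the in-memory list, not from re-reading the file
total_articles = len(rows)
max_columns = max(len(row) for row in rows)
max_authors = max_columns - 1  # the last field is the title

print(total_articles, max_columns, max_authors)
```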
