python - Reading lines from a file with a generator expression vs. a list comprehension

Question

The following code is from Chapter 3 of Jake VanderPlas's Python Data Science Handbook. Each line of the file is valid JSON. I don't think the details of the file matter for this question, but its URL is https://github.com/fictivekin/openrecipes.

import pandas as pd

# read the entire file into a Python array
with open('recipeitems-latest.json', 'r') as f:
    # Extract each line
    data = (line.strip() for line in f)
    # Reformat so each line is the element of a list
    data_json = "[{0}]".format(','.join(data))
# read the result as a JSON
recipes = pd.read_json(data_json)

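Two questions:

Why does the line `data = (line.strip() for line in f)` use a generator expression rather than a list comprehension? Since the final data structure needed is a list, why not use a list from the start instead of a generator first and a list afterwards?
Could a list comprehension be used instead?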

score 0 · Accepted Answer

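You have two questions here:

Why a generator expression? Because you don't know the size of the JSON file in advance, so it is better to be safe and not load the whole file into memory.
Yes, a list comprehension can be used. Just replace the parentheses with square brackets.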

>>> f = open('things_which_i_should_know')
>>> data = (line.strip() for line in f)
>>> type(data)
<class 'generator'>
>>> data = [line.strip() for line in f]
>>> type(data)
<class 'list'>
>>>
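To make the "just swap the brackets" point concrete, here is a minimal sketch showing that both forms produce the same JSON array string. The `sample` data, the helper names `join_with_generator` and `join_with_list`, and the use of `io.StringIO` as a stand-in for the real file are assumptions made for illustration. Note that `','.join(...)` has to consume every line either way and the final string holds the whole file's content, so in this particular snippet the generator at most avoids a separate list of lines.

```python
import io
import json

# Hypothetical stand-in for the recipe file: one JSON object per line.
sample = '{"name": "soup"}\n{"name": "bread"}\n'

def join_with_generator(f):
    # Generator expression: our code creates no list; join() consumes the lines itself.
    return "[{0}]".format(",".join(line.strip() for line in f))

def join_with_list(f):
    # List comprehension: every stripped line is stored in a list before joining.
    return "[{0}]".format(",".join([line.strip() for line in f]))

gen_result = join_with_generator(io.StringIO(sample))
list_result = join_with_list(io.StringIO(sample))

# Both strings are identical and parse as one JSON array of records.
records = json.loads(gen_result)
```

Either string can then be handed to a JSON parser; the choice between the two forms only affects the intermediate storage, not the result.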

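See the official documentation for more information.

With a list comprehension, you get back a Python list; stripped_list is a list containing the resulting lines, not an iterator. Generator expressions return an iterator that computes the values as necessary, not needing to materialize all the values at once. This means that list comprehensions aren't useful if you're working with iterators that return an infinite stream or a very large amount of data. Generator expressions are preferable in these situations.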
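The infinite-stream case from the quoted passage can be sketched as follows. This is an illustrative example only; the names `numbers`, `squares`, and `first_five` are made up here and do not come from the documentation.

```python
import itertools

# Infinite stream of integers: 0, 1, 2, ...
numbers = itertools.count()

# Generator expression: nothing is computed until a value is requested.
squares = (n * n for n in numbers)

# Take only the first five values; the rest of the stream is never evaluated.
first_five = list(itertools.islice(squares, 5))

# The equivalent list comprehension, [n * n for n in itertools.count()],
# would never finish, because it tries to build the whole list up front.
```

After this runs, `first_five` is `[0, 1, 4, 9, 16]`, and `squares` can still be advanced for further values on demand.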
