python - Best way to repeatedly search a large list of dictionaries

Question

Suppose I have a function that returns 1000 records from a postgres database as a list of dicts, which looks like this (but much bigger):

[ {"thing_id" : 245, "thing_title" : "Thing title", "thing_url": "thing-url"},
  {"thing_id" : 459, "thing_title" : "Thing title II", "thing_url": "thing-url/2"}]

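I have a process that needs to do around 600 individual searches on this list for the correct dict, based on a given unique thing_id. Rather than iterating through the whole list each time, wouldn't it be more efficient to create a dict of dicts, making the thing_id of each dict a key, like this: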

{245 : {"thing_id" : 245, "thing_title" : "Thing title", "thing_url": "thing-url"},
 459 : {"thing_id" : 459, "thing_title" : "Thing title II", "thing_url": "thing-url/2"}}

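If so, is there a preferred way of doing this? Obviously I could build the dict by iterating through the list, but I'm wondering whether there are any built-in methods. If not, what is the preferred approach? Also, if there is a better way of repeatedly retrieving data from the same large set of records than what I'm proposing here, please let me know.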

Update: ended up using a dict comprehension:

data = {row["thing_id"]: row for row in rows}

where rows is the result of my database query using psycopg2.extras.DictCursor. Building the dict is fast enough, and the lookups are very fast.
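To make the trade-off concrete, here is a minimal sketch contrasting a per-lookup linear scan with the index built once by the dict comprehension. The sample rows, the find_linear helper, and the wanted_ids list are made up for illustration; in the question, rows would come from the database query instead.

```python
# Hypothetical sample rows standing in for the query result.
rows = [
    {"thing_id": 245, "thing_title": "Thing title", "thing_url": "thing-url"},
    {"thing_id": 459, "thing_title": "Thing title II", "thing_url": "thing-url/2"},
]


def find_linear(rows, thing_id):
    """Scan the list until a matching row is found: O(n) per lookup."""
    for row in rows:
        if row["thing_id"] == thing_id:
            return row
    return None


# Build the index once (O(n)); each later lookup is O(1) on average.
data = {row["thing_id"]: row for row in rows}

# Illustrative ids to look up; 999 is deliberately absent.
wanted_ids = [459, 245, 999]

# dict.get returns None for a missing key instead of raising KeyError.
found = [data.get(thing_id) for thing_id in wanted_ids]
```

The index stores references to the original row dicts, not copies, so it adds little memory on top of the list. If thing_id were not unique, later rows would silently overwrite earlier ones in the comprehension.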

score 1 · Accepted Answer

You could use a pandas DataFrame structure for multi-column indexing:

>>> import pandas as pd
>>> result = [
...     {"thing_id" : 245, "thing_title" : "Thing title", "thing_url": "thing-url"},
...     {"thing_id" : 459, "thing_title" : "Thing title II", "thing_url": "thing-url/2"}
... ]
>>> df = pd.DataFrame(result)
>>> df.set_index('thing_id', inplace=True)
>>> df.sort_index(inplace=True)
>>> df
             thing_title    thing_url
thing_id                             
245          Thing title    thing-url
459       Thing title II  thing-url/2
>>> df.loc[459, 'thing_title']
'Thing title II'

score 0 · Accepted Answer

a = [ {"thing_id" : 245, "thing_title" : "Thing title", "thing_url": "thing-url"}, {"thing_id" : 459, "thing_title" : "Thing title II", "thing_url": "thing-url/2"}]
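# In Python 3, dict.values() returns a view that cannot be indexed, so wrap it in list().
# This takes the second value of each dict in insertion order (thing_title here).
c = [list(b.values())[1] for b in a]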
