python - Convert a 2D dictionary to a numpy matrix

Question

I have a large dictionary like this:

d[id1][id2] = value

Example:

books["auth1"]["humor"] = 20
books["auth1"]["action"] = 30
books["auth2"]["comedy"] = 20

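and so on.

Each "auth" key can have any set of "genres" associated with it. The value stored under each genre key is the number of books that author wrote in that genre.

Now I want to convert it into matrix form, like this: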

               "humor"    "action"    "comedy"
    "auth1"      20          30           0
    "auth2"       0           0          20

How can I do this? Thanks.

score 27 · Accepted Answer

pandas handles this well:

books = {}
books["auth1"] = {}
books["auth2"] = {}
books["auth1"]["humor"] = 20
books["auth1"]["action"] = 30
books["auth2"]["comedy"] = 20

from pandas import DataFrame

df = DataFrame(books).T.fillna(0)

The output is:

       action  comedy  humor
auth1      30       0     20
auth2       0      20      0
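
Since the question asks for a NumPy matrix, the DataFrame can be converted in one more step. This is a minimal sketch, assuming pandas 0.24 or later (where `DataFrame.to_numpy` is available). The `astype(int)` and `sort_index` calls are additions not in the original answer: `fillna` leaves float values, and the column order can otherwise differ between pandas versions.

```python
import pandas as pd

books = {
    "auth1": {"humor": 20, "action": 30},
    "auth2": {"comedy": 20},
}

# Rows are authors, columns are genres; missing pairs become 0.
df = pd.DataFrame(books).T.fillna(0).astype(int)

# Sort the genre columns so the layout does not depend on the pandas version.
df = df.sort_index(axis=1)

# Plain 2D NumPy array, plus the labels needed to interpret it.
matrix = df.to_numpy()
authors = list(df.index)
genres = list(df.columns)
```

The `authors` and `genres` lists keep track of which row and column each value in `matrix` belongs to, since the bare array carries no labels.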

score 10 · Accepted Answer

Use a list comprehension to convert the dict into a list of lists and/or a numpy array:

import numpy as np

np.array([[books[author][genre] for genre in sorted(books[author])] for author in sorted(books)])

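Edit

Apparently the sub-dictionaries have differing numbers of keys, so the rows do not all have the same length. List all the genres: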

genres = ['humor', 'action', 'comedy']

Then iterate over the dictionary in the usual way:

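import numpy

list_of_lists = []
# Visit authors in sorted order so the row order is deterministic.
for author_name, author in sorted(books.items()):
    titles = []
    for genre in genres:
        try:
            titles.append(author[genre])
        except KeyError:
            # This author has no books in this genre.
            titles.append(0)
    list_of_lists.append(titles)

books_array = numpy.array(list_of_lists)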

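Basically, I try to append the value for each key in genres to the list. If the key does not exist, a KeyError is raised. I catch that error and append 0 to the list instead.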
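
The same idea can be written more compactly. This is a sketch of a variant, not the answer's own code: it assumes the genre list should be derived from the data rather than typed by hand, and it uses `dict.get` with a default of 0 in place of the try/except.

```python
import numpy as np

books = {
    "auth1": {"humor": 20, "action": 30},
    "auth2": {"comedy": 20},
}

authors = sorted(books)
# Collect every genre that appears under any author.
genres = sorted({genre for counts in books.values() for genre in counts})

# dict.get returns 0 when an author has no entry for a genre.
books_array = np.array(
    [[books[author].get(genre, 0) for genre in genres] for author in authors]
)
```

Sorting both `authors` and `genres` fixes the row and column order, so the array can be mapped back to its labels.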

score 0 · Accepted Answer

As of 2018, I think pandas 0.22 handles this out of the box. Specifically, see the from_dict class method of DataFrame.

books = {}
books["auth1"] = {}
books["auth2"] = {}
books["auth1"]["humor"] = 20
books["auth1"]["action"] = 30
books["auth2"]["comedy"] = 20

import pandas as pd

pd.DataFrame.from_dict(books, orient='columns', dtype=None)
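
Note that `orient='columns'` puts the authors in the columns. To get the layout from the question (authors as rows, genres as columns), `orient='index'` may be the better fit. This is a minimal sketch; the `fillna(0)`, `astype(int)`, and `to_numpy` steps are additions not in the original answer, and `to_numpy` assumes pandas 0.24 or later.

```python
import pandas as pd

books = {
    "auth1": {"humor": 20, "action": 30},
    "auth2": {"comedy": 20},
}

# orient="index" treats the outer keys (authors) as row labels.
df = pd.DataFrame.from_dict(books, orient="index")

# Author/genre pairs with no entry come back as NaN; replace them with 0.
df = df.fillna(0).astype(int)

# Plain 2D NumPy array of the counts.
matrix = df.to_numpy()
```

Values can still be looked up by label on `df`, for example `df.loc["auth1", "humor"]`.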
