python - Looking for a possibly better way to get nested data with glom?

Question

I have a particularly nasty stats object from a system that I need to retrieve data from (two of the many stat entries are shown for brevity).

 'https://localhost/mgmt/tm/sys/performance/all-stats/TMM%20Memory%20Used': {'nestedStats': {'entries': {'Average': {'description': '5'},
                                                                                                         'Current': {'description': '5'},
                                                                                                         'Max(since 2019_11_12T02:47:10Z)': {'description': '5'},
                                                                                                         'Memory Used': {'description': 'TMM '
                                                                                                                                        'Memory '
                                                                                                                                        'Used'}}}},
 'https://localhost/mgmt/tm/sys/performance/all-stats/Utilization': {'nestedStats': {'entries': {'Average': {'description': '9'},
                                                                                                 'Current': {'description': '10'},
                                                                                                 'Max(since 2019_11_12T02:47:10Z)': {'description': '53'},
                                                                                                 'System CPU Usage': {'description': 'Utilization'}}}}}

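Currently I call .get several times down the nested stack, but this weekend I listened to the author of the glom module on Talk Python and thought it might be a much cleaner solution for me. And it is: the code below gets me all the data in one loop without crazy layers of get calls (the first example entry above is the one I'm working with tonight). The outer key is the long URL, and the inner keys are avg/current/max/desc.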

from glom import glom

# b is an existing connection object (setup not shown)
stats = b.tm.sys.performances.all_stats.load()
for url in stats.entries:
    print('\n')
    # path to the inner dict of stat entries for this URL
    spec = f'entries.{url}.nestedStats.entries'
    v_stats = glom(stats, spec)
    for name in v_stats:
        stat_val = glom(v_stats, f'{name}.description')
        print(f'{name}: {stat_val}')

The result is the data I need:

Average: 5
Current: 5
Max(since 2019_11_12T02:47:10Z): 5
Memory Used: TMM Memory Used

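That said, at this point I'm not really in control of the data; I'm just printing it. I don't think I've grasped the power of glom yet, and I'm curious whether someone could point me to an example that would help my understanding. The end goal is to flatten all of this data into a single list of 4-item dictionaries.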

score 1 · Accepted Answer

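First, before you try this, make sure you update glom to the current version, 19.11.0 or later.

What you are asking for is what the glom docs call data-driven assignment, and it is not one of glom's strengths.

See the glom documentation for details.

To make it work, you may need lambdas and/or regular Python code.

Below is my working attempt, with your example lines copied into the variable d.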

from glom import glom, Call, T, Iter

d = { ... }  # put your example lines into this dictionary.

def get_desc(item):
    # item is a (url, value) tuple produced by T.items()
    return {k: v.get('description')
            for k, v in item[1]['nestedStats']['entries'].items()}

spec = (Call(list, args=(T.items(),) ), Iter().map(get_desc).all())

result = glom(d, spec)

print(result)

The result is

[
{'Average': '5', 'Current': '5', 'Max(since 2019_11_12T02:47:10Z)': '5', 'Memory Used': 'TMM Memory Used'}, 
{'Average': '9', 'Current': '10', 'Max(since 2019_11_12T02:47:10Z)': '53', 'System CPU Usage': 'Utilization'}
]
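
To see why the helper indexes position 1, it may help to look at what the first step of the spec hands over. The sketch below uses plain Python only, on a shortened stand-in for d (the URL key and the single entry are placeholders for illustration, not the real data): list(d.items()) yields (url, value) tuples, so element 1 of each tuple is the dict that contains nestedStats.

```python
# Shortened stand-in for d; the key and the single entry are placeholders.
d = {
    "https://localhost/example-stat": {
        "nestedStats": {"entries": {"Average": {"description": "5"}}}
    },
}

# Plain-Python counterpart of Call(list, args=(T.items(),))
pairs = list(d.items())

first = pairs[0]
url, value = first  # element 0 is the URL key, element 1 is the nested dict

print(type(first).__name__)
print(url)
print(value["nestedStats"]["entries"])
```

Each element that reaches the helper therefore has this (url, value) shape.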

Update

The version below gives the same result but avoids the need for a helper function.

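What the spec does:

Call turns the outer dict into a list of tuples
Iter loops over the list. For each item:
1. Take the second element of the tuple
2. Get nestedStats.entries (which is another dict)
3. Call turns this dict into a list of tuples
4. Convert this list into a list of dicts, each mapping a key to its description
5. Merge the list of dicts into a single dict
all() collects every result from the iteration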

I suggest trying this out and removing parts of the spec to see what happens...

from glom import glom, Call, T, Iter, merge

# d = { ... }  # put your example lines into this dictionary.

spec = (
    Call(list, args=(T.items(),)),
    Iter(
        (
            T[1],
            "nestedStats.entries",
            Call(list, args=(T.items(),)),
            [{T[0]: (T[1], "description")}],
            merge,
        )
    ).all(),
)

result = glom(d, spec)

print(result)
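
For comparison, the steps listed above can be mirrored in plain Python without glom. This is only a sketch: the function name flatten_stats is made up for illustration, and d is retyped by hand from the two entries in the question, so the real object may contain more entries.

```python
# Sample data retyped from the two entries shown in the question.
d = {
    "https://localhost/mgmt/tm/sys/performance/all-stats/TMM%20Memory%20Used": {
        "nestedStats": {"entries": {
            "Average": {"description": "5"},
            "Current": {"description": "5"},
            "Max(since 2019_11_12T02:47:10Z)": {"description": "5"},
            "Memory Used": {"description": "TMM Memory Used"},
        }}
    },
    "https://localhost/mgmt/tm/sys/performance/all-stats/Utilization": {
        "nestedStats": {"entries": {
            "Average": {"description": "9"},
            "Current": {"description": "10"},
            "Max(since 2019_11_12T02:47:10Z)": {"description": "53"},
            "System CPU Usage": {"description": "Utilization"},
        }}
    },
}


def flatten_stats(stats):
    """Hypothetical helper: one flat dict per outer URL key."""
    result = []
    for pair in list(stats.items()):              # outer dict -> list of tuples
        value = pair[1]                           # 1. second element of the tuple
        entries = value["nestedStats"]["entries"]  # 2. the inner dict
        inner_pairs = list(entries.items())       # 3. inner dict -> list of tuples
        # 4. one single-key dict per (name, body) tuple
        singles = [{name: body["description"]} for name, body in inner_pairs]
        merged = {}
        for single in singles:                    # 5. merge into one dict
            merged.update(single)
        result.append(merged)                     # collect every result
    return result


result = flatten_stats(d)
print(result)
```

On this sample data the function returns the same list of two dictionaries shown in the result earlier in this answer.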
