I have the following pandas DataFrame:

id  mobile
1   9998887776
2   8887776665
1   7776665554
2   6665554443
3   5554443332

I want to group by id and get the following result:

id   mobile
1    [{"9998887776": {"status": "verified"}},{"7776665554": {"status": "verified"}}]
2    [{"8887776665": {"status": "verified"}},{"6665554443": {"status": "verified"}}]
3    [{"5554443332": {"status": "verified"}}]

I know the to_json method won't help here and that I have to write a UDF, but I'm new to this and a bit stuck.

1 Answer

Use GroupBy.apply with a list comprehension that builds the list of dictionaries in the required format:

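# Build one {mobile: {"status": "verified"}} dict per number in each group.
f = lambda x: [{y: {"status": "verified"}} for y in x]
# Assign to a new name so the original df stays available for later steps.
out = df.groupby('id')['mobile'].apply(f).reset_index()
print(out)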
   id                                             mobile
0   1  [{9998887776: {'status': 'verified'}}, {777666...
1   2  [{8887776665: {'status': 'verified'}}, {666555...
2   3             [{5554443332: {'status': 'verified'}}]
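
Note that the keys above are integers, while the expected output in the question uses string keys. A minimal sketch of one way to get string keys without serializing to JSON, assuming the sample data from the question (the helper name to_status_list and the variable grouped are illustrative, not from the original answer):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 1, 2, 3],
    "mobile": [9998887776, 8887776665, 7776665554, 6665554443, 5554443332],
})

def to_status_list(mobiles):
    # Hypothetical helper: cast each number to str so the dict keys are strings.
    return [{str(m): {"status": "verified"}} for m in mobiles]

# One row per id; the mobile column holds a Python list of dicts.
grouped = df.groupby("id")["mobile"].apply(to_status_list).reset_index()
print(grouped)
```

The cells here are still Python lists, so they can be processed further before any serialization.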

If you need JSON format (json.dumps also writes the integer keys as strings, matching the expected output):

import json

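# Same comprehension, serialized to a JSON string per group.
f = lambda x: json.dumps([{y: {"status": "verified"}} for y in x])
# Run this on the original, ungrouped df.
out = df.groupby('id')['mobile'].apply(f).reset_index()
print(out)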
   id                                             mobile
0   1  [{"9998887776": {"status": "verified"}}, {"777...
1   2  [{"8887776665": {"status": "verified"}}, {"666...
2   3           [{"5554443332": {"status": "verified"}}]
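
To check the JSON variant end to end, the sketch below rebuilds the sample data, serializes each group, and parses one cell back with json.loads. It is only an illustration under the question's sample data; the names to_status_json, as_json and first are made up for this example, and tolist() is used so the keys are plain Python ints before serialization.

```python
import json
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 1, 2, 3],
    "mobile": [9998887776, 8887776665, 7776665554, 6665554443, 5554443332],
})

def to_status_json(mobiles):
    # tolist() yields plain Python ints; json.dumps writes int keys as strings.
    records = [{m: {"status": "verified"}} for m in mobiles.tolist()]
    return json.dumps(records)

# One row per id; the mobile column now holds a JSON string.
as_json = df.groupby("id")["mobile"].apply(to_status_json).reset_index()

# Parse the first cell back to confirm the structure and the string keys.
first = json.loads(as_json.loc[0, "mobile"])
print(first)
```

Each cell is a str rather than a list, which is convenient for writing to a file or database but has to be parsed again before further processing in Python.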
Answered 2019-11-26T08:16:27.087