python - How to format a table extracted from Salesforce in Python?

Question

I was able to extract some fields from Salesforce using Python.

I used the following block of code:

!pip install simple_salesforce 

from simple_salesforce import Salesforce
import pandas as pd

sf = Salesforce(
username='', 
password='', 
security_token='')

sf_data = sf.query_all("SELECT Brand_Name__c,Name FROM AuthorisedProduct__c")

sf_df = pd.DataFrame(sf_data)

sf_df.head()

This process puts all the items into a single "records" field.

records	totalSize
OrderedDict([('attributes', OrderedDict([('type', 'AuthorisedProduct__c'), ('url', '/services/data/v42.0/sobjects/AuthorisedProduct__c/a020o00000xC1fmAAC')])), ('Brand_Name__c', 'ABB'), ('Name', 'UNO-DM-1.2-TL-PLUS-B')])	14000
OrderedDict([('attributes', OrderedDict([('type', 'AuthorisedProduct__c'), ('url', '/services/data/v42.0/sobjects/AuthorisedProduct__c/a020o00000xC1fnAAC')])), ('Brand_Name__c', 'ABB'), ('Name', 'UNO-DM-1.2-TL-PLUS-SB')])	14000
OrderedDict([('attributes', OrderedDict([('type', 'AuthorisedProduct__c'), ('url', '/services/data/v42.0/sobjects/AuthorisedProduct__c/a020o00000xC1foAAC')])), ('Brand_Name__c', 'ABB'), ('Name', 'UNO-DM-2.0-TL-PLUS-B')])	14000

Note that there are 14000 values under records. I want a simple DataFrame with only two fields: a table with the "Brand_Name__c" and "Name" columns.

Brand_Name__c	Name
ABB	UNO-DM-2.0-TL-PLUS-B
ABB	UNO-DM-1.2-TL-PLUS-SB

The result would be a 14000 x 2 matrix.

Please advise how to achieve this.

Also, how can this process be reversed?

Thanks a lot, everyone.

score 1 · Accepted Answer

You can unpack the OrderedDict objects in the records column:

from collections import OrderedDict
import pandas as pd

df = pd.DataFrame({
    'records':[
        OrderedDict([('attributes', OrderedDict([('type', 'AuthorisedProduct__c'), ('url', '/services/data/v42.0/sobjects/AuthorisedProduct__c/a020o00000xC1fmAAC')])), ('Brand_Name__c', 'ABB'), ('Name', 'UNO-DM-1.2-TL-PLUS-B')]),
        OrderedDict([('attributes', OrderedDict([('type', 'AuthorisedProduct__c'), ('url', '/services/data/v42.0/sobjects/AuthorisedProduct__c/a020o00000xC1fnAAC')])), ('Brand_Name__c', 'ABB'), ('Name', 'UNO-DM-1.2-TL-PLUS-SB')]),
        OrderedDict([('attributes', OrderedDict([('type', 'AuthorisedProduct__c'), ('url', '/services/data/v42.0/sobjects/AuthorisedProduct__c/a020o00000xC1foAAC')])), ('Brand_Name__c', 'ABB'), ('Name', 'UNO-DM-2.0-TL-PLUS-B')])
    ],
    'total size': [14000]*3
})

df['Brand_Name__c'] = df['records'].apply(lambda x: x['Brand_Name__c'])
df['Name'] = df['records'].apply(lambda x: x['Name'])

Result:

>>> df
                                             records  total size Brand_Name__c                   Name
0  {'attributes': {'type': 'AuthorisedProduct__c'...       14000           ABB   UNO-DM-1.2-TL-PLUS-B
1  {'attributes': {'type': 'AuthorisedProduct__c'...       14000           ABB  UNO-DM-1.2-TL-PLUS-SB
2  {'attributes': {'type': 'AuthorisedProduct__c'...       14000           ABB   UNO-DM-2.0-TL-PLUS-B
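
If only the two requested fields are wanted, the nested column does not need to be kept at all. The following is a minimal sketch, assuming each entry of the records column is a dict-like object as in the question; the sample rows are shortened stand-ins, and `flat` is just an illustrative name:

```python
import pandas as pd

# Shortened, made-up stand-in for the frame above: one dict per row in 'records'.
df = pd.DataFrame({
    'records': [
        {'attributes': {'type': 'AuthorisedProduct__c'},
         'Brand_Name__c': 'ABB', 'Name': 'UNO-DM-1.2-TL-PLUS-B'},
        {'attributes': {'type': 'AuthorisedProduct__c'},
         'Brand_Name__c': 'ABB', 'Name': 'UNO-DM-1.2-TL-PLUS-SB'},
    ],
    'totalSize': [14000] * 2,
})

# Expand the list of dicts into columns, then keep only the requested fields.
flat = pd.DataFrame(df['records'].tolist())[['Brand_Name__c', 'Name']]
print(flat)
```

With the real data this should give one row per record and two columns, which is the 14000 x 2 table the question asks for.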

score 1 · Accepted Answer

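You have to understand the actual shape of the JSON response that Salesforce sends: it has a top-level "records" key that holds all of the data. In addition, each record entry contains an "attributes" key alongside the data for the fields you actually requested. You cannot change the shape of the JSON response.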
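
To make that shape concrete, here is a rough sketch of what the parsed response looks like as a Python dict. The values are placeholders modelled on the output shown in the question, not a real API response, and any other top-level keys are left out:

```python
# Placeholder data modelled on the question's output; not a real API response.
response = {
    'totalSize': 1,
    'records': [
        {
            # Metadata that Salesforce adds to every record.
            'attributes': {
                'type': 'AuthorisedProduct__c',
                'url': '/services/data/v42.0/sobjects/AuthorisedProduct__c/a020o00000xC1fmAAC',
            },
            # The fields named in the SELECT clause.
            'Brand_Name__c': 'ABB',
            'Name': 'UNO-DM-1.2-TL-PLUS-B',
        },
    ],
}

# Dropping 'attributes' leaves only the requested fields for one record.
record = response['records'][0]
requested = {key: value for key, value in record.items() if key != 'attributes'}
print(requested)
```

Passing the whole dict to pd.DataFrame is what produces one "records" column of nested objects; the rows you want live in the list under that key.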

The simple_salesforce documentation provides an example showing how to digest this API response for Pandas:

Generate Pandas Dataframe from SFDC API Query (ex.query,query_all)

import pandas as pd

# Keep the query result so its 'records' list can be handed to pandas
data = sf.query("SELECT Id, Email FROM Contact")

df = pd.DataFrame(data['records']).drop(['attributes'],axis=1)
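
Applied to the query from the question, the same idea might look like the sketch below. The `sf_data` dict is a hand-written stand-in for the result of `sf.query_all(...)` so that the example runs without credentials. The last step is one possible reading of "reversing the process", namely turning the DataFrame back into a list of plain per-record dicts; that interpretation is an assumption.

```python
import pandas as pd

# Hand-written stand-in for sf.query_all("SELECT Brand_Name__c,Name FROM AuthorisedProduct__c")
sf_data = {
    'totalSize': 3,
    'records': [
        {'attributes': {'type': 'AuthorisedProduct__c'},
         'Brand_Name__c': 'ABB', 'Name': 'UNO-DM-1.2-TL-PLUS-B'},
        {'attributes': {'type': 'AuthorisedProduct__c'},
         'Brand_Name__c': 'ABB', 'Name': 'UNO-DM-1.2-TL-PLUS-SB'},
        {'attributes': {'type': 'AuthorisedProduct__c'},
         'Brand_Name__c': 'ABB', 'Name': 'UNO-DM-2.0-TL-PLUS-B'},
    ],
}

# Build the frame from the 'records' list only and drop the metadata column.
sf_df = pd.DataFrame(sf_data['records']).drop(columns=['attributes'])
print(sf_df.head())

# Assumed meaning of "reverse": back to a list of dicts, one per row.
records = sf_df.to_dict('records')
print(records[0])
```

The dicts produced by `to_dict('records')` no longer carry the "attributes" entry, so they are not identical to what the API returned.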
