python - JSON Lines problem when loading from import.io with Python

Question

I am having trouble loading the API response from import.io into a file or a list.

The endpoint I am using is https://data.import.io/extractor/{0}/json/latest?_apikey={1}

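Previously all my scripts were set up to consume plain JSON and everything worked fine, but now they have decided to switch to JSON Lines, and somehow the output seems to be malformed.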

I tried to adapt my scripts by reading the API response like this:

url_call = 'https://data.import.io/extractor/{0}/json/latest?_apikey={1}'.format(extractors_row_dict['id'], auth_key)
r = requests.get(url_call)

with open(temporary_json_file_path, 'w') as outfile:
    json.dump(r.content, outfile)

data = []
with open(temporary_json_file_path) as f:
    for line in f:
        data.append(json.loads(line))

The problem is that when I inspect data[0], the entire content of the JSON file has been dumped into it...

data[1] = IndexError: list index out of range

Here is an example of data[0][:300]:

u'{"url":"https://www.example.com/de/shop?condition[0]=new&page=1&lc=DE&l=de","result":{"extractorData":{"url":"https://www.example.com/de/shop?condition[0]=new&page=1&lc=DE&l=de","resourceId":"23455234","data":[{"group":[{"Brand":[{"text":"Brand","href":"https://www.example.com'

Does anyone have experience with the responses from this API? All the other JSON Lines data I read from other sources works fine; only this one fails.

Edit based on a comment:

print repr(open(temporary_json_file_path).read(300))

gives this:

'"{\\"url\\":\\"https://www.example.com/de/shop?condition[0]=new&page=1&lc=DE&l=de\\",\\"result\\":{\\"extractorData\\":{\\"url\\":\\"https://www.example.com/de/shop?condition[0]=new&page=1&lc=DE&l=de\\",\\"resourceId\\":\\"df8de15cede2e96fce5fe7e77180e848\\",\\"data\\":[{\\"group\\":[{\\"Brand\\":[{\\"text\\":\\"Bra'

score 5 · Accepted Answer

Your code has a double-encoding bug:

with open(temporary_json_file_path, 'w') as outfile:
    json.dump(r.content, outfile)

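r.content is already JSON-encoded text, so json.dump serializes it a second time as a single quoted, escaped JSON string. Write the response body unchanged instead (on Python 3, where r.content is bytes, open the file with 'wb'):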

with open(temporary_json_file_path, 'w') as outfile:
    outfile.write(r.content)
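
To make the double encoding visible without calling the API, here is a minimal sketch. The `raw` payload is a made-up stand-in for `r.content`, and `parse_json_lines` is a hypothetical helper, not part of requests or import.io; it also assumes the response is UTF-8.

```python
import json

# Hypothetical stand-in for r.content: two JSON Lines records as raw bytes.
raw = (b'{"url": "https://www.example.com/a", "page": 1}\n'
       b'{"url": "https://www.example.com/b", "page": 2}\n')


def parse_json_lines(payload):
    """Decode a JSON Lines payload (bytes or str) into a list of objects."""
    if isinstance(payload, bytes):
        payload = payload.decode("utf-8")  # assumption: UTF-8 response body
    # One JSON document per line; skip blank lines such as the trailing one.
    return [json.loads(line) for line in payload.split("\n") if line.strip()]


# Double encoding: json.dumps turns the whole payload into ONE JSON string,
# escaping the inner quotes and newlines, so only a single line remains.
double_encoded = json.dumps(raw.decode("utf-8"))
lines = double_encoded.split("\n")
first = json.loads(lines[0])  # decodes back to a str, not a dict

# Parsing the untouched payload yields one object per record.
records = parse_json_lines(raw)
```

Here `lines` has exactly one element and `first` is a plain string, which matches the symptom in the question (everything in data[0], nothing in data[1]), while `records` holds two dicts. With requests, passing `r.content` to such a helper should avoid the temporary file altogether.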
