python - Python: Converting an entire directory of JSON files to Python dictionaries to send to MongoDB

Question

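I'm fairly new to Python and very new to MongoDB (so for now I only care about taking text files and converting them). I'm currently trying to move a bunch of JSON-formatted .txt files into MongoDB. My approach is to open each file in the directory, read each line, convert it from JSON to a dictionary, and then overwrite that JSON line with the dictionary. It would then be in a format I can send to MongoDB.

(If there are any flaws in my reasoning, please point them out.)

At the moment, I've written this: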

"""
Kalil's step by step iteration / write.

JSON dumps takes a python object and serializes it to JSON.
Loads takes a JSON string and turns it into a python dictionary.
So we return json.loads so that we can take that JSON string from the tweet and save it as a dictionary for Pymongo
"""

import os
import json
import pymongo

rootdir='~/Tweets'

def convert(line):
    line = file.readline()
    d = json.loads(lines)
    return d


for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        f=open(file, 'r')
        lines = f.readlines()
        f.close()
        f=open(file, 'w')
        for line in lines:
            newline = convert(line)
            f.write(newline)
        f.close()

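But it isn't writing anything. Which... as a rule of thumb, if you aren't getting the effect you want, you've made a mistake somewhere.

Does anyone have any suggestions?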

score 3 · Accepted Answer

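When you decode a JSON file you don't need to convert it line by line, because the parser will iterate over the file for you (unless you have one JSON document per line).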
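For the one-document-per-line case mentioned above, a minimal sketch might look like the following. The helper name `parse_json_lines` and the sample data are made up for illustration; the only assumption about the input is that each non-empty line is a complete JSON document.

```python
import json

def parse_json_lines(lines):
    """Yield one dict per non-empty line; each line must be a full JSON document."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines between documents
            yield json.loads(line)

# Hypothetical sample standing in for the lines of one tweet file.
sample = ['{"id": 1, "text": "hello"}\n', '\n', '{"id": 2, "text": "world"}\n']
docs = list(parse_json_lines(sample))
print(docs)
```

Because an open file object iterates over its lines, the same helper can be called as `parse_json_lines(fp)` inside a `with open(...)` block.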

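Once you've loaded the JSON document you'll have a dictionary, which is a data structure and cannot be written directly to a file without first serializing it into some format such as JSON, YAML, or many others (the format MongoDB uses is called BSON, but your driver will handle the encoding for you).

The whole process of loading a JSON file and dumping it into Mongo is actually quite simple and looks something like this: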
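This is probably why the `f.write(newline)` call in the question cannot work: `write` expects a string, and `json.loads` returns a dictionary. A small sketch of the difference, using an in-memory `io.StringIO` buffer as a stand-in for an open text file (the sample document is made up):

```python
import io
import json

doc = {"user": "kalil", "text": "hello"}  # hypothetical sample document
buf = io.StringIO()  # stands in for a file opened in text mode

error = None
try:
    buf.write(doc)  # a dict is not a string, so this raises TypeError
except TypeError as exc:
    error = str(exc)

# Serializing first turns the dict back into text that can be written.
buf.write(json.dumps(doc) + "\n")
written = buf.getvalue()
```

Reading `written` back with `json.loads` returns an equal dictionary, which shows that writing the serialized form to the file would just reproduce the original JSON line.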

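import json
import os
from glob import glob
from pymongo import MongoClient  # Connection was removed in PyMongo 3

db = MongoClient().test  # connects to localhost:27017 by default

# glob does not expand '~', so expand it explicitly
for filename in glob(os.path.expanduser('~/Tweets/*.txt')):
    with open(filename) as fp:
        doc = json.load(fp)  # parses the whole file as one JSON document

    db.tweets.insert_one(doc)  # save() was removed in PyMongo 4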

score 1 · Accepted Answer

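A dictionary in Python is an object that lives inside the program; you can't save a dictionary directly to a file unless you pickle it (pickling is a way of saving an object to a file so that it can be retrieved later). I think a better approach is to read the lines from the file, load the JSON (which converts it into a dictionary), and save that information to MongoDB right away, with no need to write it back to a file.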
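A rough sketch of that read, parse, and insert flow is below. The function name `insert_json_lines` and the `FakeCollection` class are hypothetical; the fake collection only exists so the example runs without a MongoDB server. With PyMongo you would pass a real collection, whose `insert_many` method accepts a list of dictionaries.

```python
import json

def insert_json_lines(lines, collection):
    """Parse each non-empty line as JSON and insert the batch in one call."""
    docs = [json.loads(line) for line in lines if line.strip()]
    if docs:  # PyMongo's insert_many rejects an empty list
        collection.insert_many(docs)
    return len(docs)

class FakeCollection:
    """Hypothetical stand-in for a PyMongo collection."""
    def __init__(self):
        self.stored = []

    def insert_many(self, docs):
        self.stored.extend(docs)

fake = FakeCollection()
count = insert_json_lines(['{"id": 1}\n', '\n', '{"id": 2}\n'], fake)
```

Nothing is written back to the source file here; the dictionaries go straight from `json.loads` to the collection.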
