
I'm stuck on a very simple import operation in MongoDB. I have a 200MB JSON file. It's a feeds dump with this format:

{"some-headers":"", "dump":[{"item-id":"item-1"},{"item-id":"item-2"},...]}

The feed also contains words in languages other than English, such as Chinese and Japanese characters. I tried

mongoimport --db testdb --collection testcollection --file dump.json

but, possibly because the data is nested, it treats the whole "dump" array as a single field, which fails with an error because of the 4MB document size limit. I then tried a Python script:

import simplejson
import pymongo

# Connect to the local MongoDB instance (old pymongo Connection API)
conn = pymongo.Connection("localhost", 27017)
db = conn.testdb
c = db.testcollection

# Load the entire file into memory, then insert the items one by one
o = open("dump.json")
s = simplejson.load(o)
for x in s['dump']:
    c.insert(x)
o.close()
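
What I think I need is a way to stream the "dump" array instead of loading the whole file at once. A rough, untested sketch of what I have in mind, assuming the ijson package and the newer pymongo API (MongoClient / insert_many) would work on my setup:

import ijson
from pymongo import MongoClient

client = MongoClient("localhost", 27017)
coll = client.testdb.testcollection

batch = []
with open("dump.json", "rb") as f:
    # "dump.item" yields each element of the top-level "dump" array
    # one at a time, so the file never has to sit in memory as a whole
    for item in ijson.items(f, "dump.item"):
        batch.append(item)
        if len(batch) >= 1000:
            coll.insert_many(batch)
            batch = []
if batch:
    coll.insert_many(batch)

I have not verified that this actually keeps memory usage flat, which is the whole point, so corrections are welcome.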

Python gets killed while running this, probably because of the very limited resources I'm working with. I reduced the file size by getting a new JSON dump at 50MB, but now Python is tripping over ASCII/encoding issues instead. I'm looking for options both with mongoimport and with the above Python script. Any other solution would also be greatly appreciated.
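
In case it matters, my current guess for the encoding trouble is that the file should be opened explicitly as UTF-8 rather than relying on the default codec, roughly like this (not sure this is the right fix):

import io
import simplejson

# Hypothetical fix: decode the file as UTF-8 up front so simplejson
# receives text instead of raw bytes
with io.open("dump.json", "r", encoding="utf-8") as f:
    data = simplejson.load(f)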

Also, the JSON dump might some day grow to several GBs, so if there is some other solution I should consider at that point, please do highlight it.
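
One idea I am considering for the larger dumps is to pre-process the feed into newline-delimited JSON (one document per line), which mongoimport accepts directly, so the whole "dump" array never has to be held in memory. A rough sketch, again assuming ijson is an option (dump.ndjson is just a name I picked):

import ijson
import simplejson

# Write each element of the "dump" array as its own line of JSON
with open("dump.json", "rb") as src, open("dump.ndjson", "w") as dst:
    for item in ijson.items(src, "dump.item"):
        dst.write(simplejson.dumps(item) + "\n")

and then import the converted file as usual:

mongoimport --db testdb --collection testcollection --file dump.ndjson

Does this approach sound sane for dumps in the GB range?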
