I need to bulk-insert documents into my CouchDB database. I am trying to follow the manual here: http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API

Here is my script:

~$ DB="http://localhost:5984/employees"
~$ curl -H "Content-Type:application/json" -d @employees_selfContained.json -vX POST $DB/_bulk_docs

The file employees_selfContained.json is huge: 465 MB. I validated it with JSONLint and found no problems.

Here is curl's verbose output:

* About to connect() to 127.0.0.1 port 5984 (#0)
* Trying 127.0.0.1... connected
* Connected to 127.0.0.1 (127.0.0.1) port 5984 (#0)
> POST /employees/_bulk_docs HTTP/1.1
> User-Agent: curl/7.19.7 (i486-pc-linux-gnu) libcurl/7.19.7 OpenSSL/0.9.8k zlib/1.2.3.3 libidn/1.15
> Host: 127.0.0.1:5984
> Accept: */*
> Content-Type:application/json
> Content-Length: 439203931
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
* Empty reply from server
* Connection #0 to host 127.0.0.1 left intact
curl: (52) Empty reply from server
* Closing connection #0

How can I bulk-insert from that single huge file? If possible, I would prefer not to split it into smaller files.

Edit: In case anyone is wondering, I am trying to convert this schema: http://dev.mysql.com/doc/employee/en/sakila-structure.html into a self-contained document database with the following structure:

{
    "docs": [
        {
            "emp_no": ..,
            "birth_date": ..,
            "first_name": ..,
            "last_name" : ..,
            "gender": ..,
            "hire_date": .., 
            "titles": 
                [
                    {
                    "title": ..,
                    "from_date": .., 
                    "to_date": ..
                    },
                    {..}
                ], 
            "salaries" : 
                [
                    {
                    "salary": ..,
                    "from_date": ..,
                    "to_date": ..
                    },
                    {..}                
                ], 
            "dept_emp": 
                [ 
                    {
                    "dept_no": ..,
                    "from_date": ..,
                    "to_date": ..
                    },
                    {..}
                ], 
            "dept_manager": 
                [ 
                    {
                    "dept_no": ..,
                    "from_date": ..,
                    "to_date": ..
                    },
                    {..}
                ], 
            "departments":
                [
                    {
                    "dept_no": .., 
                    "dept_name": ..
                    },
                    {..}
                ]
        } ,
        .
        .
        {..}
    ]
} 

1 Answer

Iterate over the JSON and insert the documents in batches of 10,000 to 50,000.
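
A minimal Python sketch of that batching approach, using only the standard library. The helper names (batched, post_batch, bulk_insert) and the default batch size are illustrative assumptions, not part of CouchDB or of the original answer. It also assumes the machine has enough memory to parse the whole file once with json.load; if it does not, a streaming JSON parser would be needed instead.

```python
import json
import urllib.request
from itertools import islice


def batched(docs, size):
    """Yield successive lists of at most `size` documents from any iterable."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def post_batch(db_url, chunk):
    """POST one batch to the database's _bulk_docs endpoint; return the parsed reply."""
    body = json.dumps({"docs": chunk}).encode("utf-8")
    req = urllib.request.Request(
        db_url + "/_bulk_docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def bulk_insert(path, db_url, size=10000):
    """Load a {"docs": [...]} file and insert it batch by batch (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        docs = json.load(f)["docs"]  # assumption: the whole file fits in memory
    for chunk in batched(docs, size):
        post_batch(db_url, chunk)
```

With the names from the question, the call would be bulk_insert("employees_selfContained.json", "http://localhost:5984/employees"). The source file stays in one piece on disk; only the HTTP requests are split, so each request body stays far smaller than the single 439203931-byte POST shown above.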

Answered 2012-06-11T13:02:14.107