node.js - Massive inserts with pg-promise

Question

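I'm using pg-promise and I want to make multiple inserts into one table. I've seen some solutions, such as Multi-row insert with pg-promise and How do I properly insert multiple rows into PG with node-postgres?, and I could use pgp.helpers.concat to concatenate multiple selects.

But now I need to insert a lot of measurements into a table, more than 10,000 records, and https://github.com/vitaly-t/pg-promise/wiki/Performance-Boost says: "How many records you can concatenate like this - depends on the size of the records, but I would never go over 10,000 records with this approach. So if you have to insert many more records, you would want to split them into such concatenated batches and then execute them one by one."

I read the whole article, but I couldn't figure out how to "split" my inserts into batches and then execute them one by one.

Thanks!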

score 3 · Accepted Answer

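UPDATE

It is best to read the following article: Data Imports.

As the author of pg-promise, I felt compelled to finally provide the right answer to this question, because the one published earlier didn't really do it justice.

To insert a massive or unbounded number of records, your approach should be based on the method sequence, which is available within tasks and transactions.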

var cs = new pgp.helpers.ColumnSet(['col_a', 'col_b'], {table: 'tableName'});

// returns a promise with the next array of data objects,
// while there is data, or an empty array when no more data left
function getData(index) {
    if (/*still have data for the index*/) {
        // - resolve with the next array of data
    } else {
        // - resolve with an empty array, if no more data left
        // - reject, if something went wrong
    }        
}

function source(index) {
    var t = this;
    return getData(index)
        .then(data => {
            if (data.length) {
                // while there is still data, insert the next bunch:
                var insert = pgp.helpers.insert(data, cs);
                return t.none(insert);
            }
            // returning nothing/undefined ends the sequence
        });
}

db.tx(t => t.sequence(source))
    .then(data => {
        // success
    })
    .catch(error => {
        // error
    });

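This is the best approach to inserting a massive number of rows into the database, from the point of view of both performance and load throttling.

All you have to do is implement your function getData according to the logic of your app, i.e. where your large data comes from. Based on the index of the sequence, it should return some 1,000 - 10,000 objects at a time, depending on the size of the objects and data availability.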
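As an illustration only, here is a minimal sketch of what getData could look like when all the data already sits in memory as one array. The names allRows and PAGE_SIZE are hypothetical and not part of pg-promise; in a real app the data would more likely come from a file, a stream or another service.

```javascript
// Hypothetical in-memory data source (assumption for illustration only).
const allRows = Array.from({length: 25000}, (_, i) => ({
    col_a: 'a' + i,
    col_b: 'b' + i
}));

// Hypothetical page size, within the 1,000 - 10,000 range suggested above.
const PAGE_SIZE = 5000;

// Resolves with the page of rows for the given 0-based sequence index,
// or with an empty array once the index runs past the end of the data.
function getData(index) {
    const start = index * PAGE_SIZE;
    return Promise.resolve(allRows.slice(start, start + PAGE_SIZE));
}
```

Because slice returns an empty array when start is beyond the end of the data, the source function sees data.length === 0 on the first index past the last page and returns undefined, which ends the sequence.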

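See also some API examples:

Related question: node-postgres with massive amount of queries.

If you need to get the generated ids of all the inserted records, change the two lines as follows: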

// return t.none(insert);
return t.map(insert + ' RETURNING id', [], a => +a.id);

and

// db.tx(t => t.sequence(source))
db.tx(t => t.sequence(source, {track: true}))

Just be careful, because keeping too many record ids in memory can overload it.
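A small sketch of how the tracked result might be consumed, assuming that with track: true the sequence resolves with an array holding one entry per inserted batch, each entry being the array of ids produced by t.map. The helper name collectIds is hypothetical.

```javascript
// Flattens the per-batch id arrays into a single list of ids.
// Assumes `tracked` looks like [[1, 2, 3], [4, 5], ...], one inner array per batch.
function collectIds(tracked) {
    return tracked.flat();
}
```

Inside .then(data => ...) of the transaction you would call collectIds(data). If the total number of ids is very large, it may be safer to process each batch of ids as it arrives instead of tracking them all.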

score 1 · Accepted Answer

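I think the naive approach would work.

Try to split your data into multiple chunks of 10,000 records or fewer. I would try splitting the array using the solution from this post.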
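For reference, a minimal sketch of such a split in plain JavaScript. This is not necessarily the solution from the linked post, and the helper name chunkArray is made up for this example.

```javascript
// Splits `items` into consecutive chunks of at most `size` elements.
// The last chunk may be shorter; an empty input gives an empty result.
function chunkArray(items, size) {
    if (!Number.isInteger(size) || size < 1) {
        throw new RangeError('size must be a positive integer');
    }
    const chunks = [];
    for (let start = 0; start < items.length; start += size) {
        chunks.push(items.slice(start, start + size));
    }
    return chunks;
}
```

For example, chunkArray(rows, 10000) on 25,000 rows gives three chunks of 10,000, 10,000 and 5,000 rows.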

Then do a multi-row insert of each chunk with pg-promise, executing them one by one inside a transaction.
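A rough sketch of the "one by one" variant, assuming your data is already split into an array of arrays and that db, pgp and a ColumnSet cs are set up as usual. The function name insertBatches is hypothetical; db.tx, t.none and pgp.helpers.insert are the pg-promise calls already used in this thread.

```javascript
// Inserts each batch with one multi-row INSERT, strictly one after another,
// inside a single transaction. Resolves with the total number of rows inserted.
// `batches` is an array of arrays of row objects matching the ColumnSet `cs`.
function insertBatches(db, pgp, cs, batches) {
    return db.tx(async t => {
        let total = 0;
        for (const rows of batches) {
            // wait for this INSERT to finish before starting the next one
            await t.none(pgp.helpers.insert(rows, cs));
            total += rows.length;
        }
        return total;
    });
}
```

If any insert fails, the rejection propagates out of the callback and the transaction is rolled back, so either every batch is stored or none of them is.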

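Edit: Thanks to @vitaly-t for the wonderful library and for improving my answer.

Also, don't forget to wrap your queries in a transaction, or else they will exhaust the connections.

To do this, use the batch function from pg-promise to resolve all queries asynchronously: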

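// split your array here to get splittedData
var cs = new pgp.helpers.ColumnSet(['col_a', 'col_b'], {table: 'tmp'})

// splittedData = [..,[{col_a: 'a1', col_b: 'b1'}, {col_a: 'a2', col_b: 'b2'}]]
let queries = []
for (var i = 0; i < splittedData.length; i++) {
   // one multi-row INSERT statement (a string) per chunk
   var query = pgp.helpers.insert(splittedData[i], cs)
   queries.push(query)
}

db.tx(function (t) {
   // execute every statement on the transaction and return the combined promise,
   // so the transaction waits for all inserts before committing
   return t.batch(queries.map(function (query) {
      return t.none(query)
   }))
})
.then(function (data) {
   // all records inserted successfully!
})
.catch(function (error) {
    // error
});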
