c# - What is the fastest way to insert 100,000 records from one database into another?

Question

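I have a mobile application. My customer has a large data set of about 100,000 records, and it is updated frequently. When we sync, we need to copy from one database to another.

I have attached the second database to the main one and run insert into table select * from sync.table.

This is extremely slow; I think it takes about 10 minutes. I noticed that the journal file grows step by step.

How can I speed this up?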

EDITED 1

I turned off the indexes and I turned off journaling. Using

insert into table select * from sync.table

it still takes 10 minutes.

EDITED 2

If I run a query like

select id,invitem,invid,cost from inventory where itemtype = 1 
order by invitem limit 50

it takes 15-20 seconds.

The table schema is:

CREATE TABLE inventory  
('id' INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
 'serverid' INTEGER NOT NULL DEFAULT 0,
 'itemtype' INTEGER NOT NULL DEFAULT 0,
 'invitem' VARCHAR,
 'instock' FLOAT  NOT NULL DEFAULT 0,
 'cost' FLOAT NOT NULL DEFAULT 0,
 'invid' VARCHAR,
 'categoryid' INTEGER  DEFAULT 0,
 'pdacategoryid' INTEGER DEFAULT 0,
 'notes' VARCHAR,
 'threshold' INTEGER  NOT NULL DEFAULT 0,
 'ordered' INTEGER  NOT NULL DEFAULT 0,
 'supplier' VARCHAR,
 'markup' FLOAT NOT NULL DEFAULT 0,
 'taxfree' INTEGER NOT NULL DEFAULT 0,
 'dirty' INTEGER NOT NULL DEFAULT 1,
 'username' VARCHAR,
 'version' INTEGER NOT NULL DEFAULT 15
)

The indexes are created like this:

CREATE INDEX idx_inventory_categoryid ON inventory (pdacategoryid);
CREATE INDEX idx_inventory_invitem ON inventory (invitem);
CREATE INDEX idx_inventory_itemtype ON inventory (itemtype);

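I am wondering: isn't insert into ... select * from the fastest built-in way to copy a massive amount of data?

EDIT 3

SQLite is serverless, so please stop voting up one particular answer; I am sure it is not the answer.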

score 9 · Accepted Answer

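If the target is some version of MS SQL Server, SqlBulkCopy offers efficient inserts for large data sets, similar to the bcp command.

You can also disable or drop the non-clustered indexes before inserting and re-create them afterwards.

In SQLite, these are usually pretty fast: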

.dump ?TABLE? ...      Dump the database in an SQL text format
.import FILE TABLE     Import data from FILE into TABLE

Also try: PRAGMA journal_mode = OFF

FYI, you should be able to run the command-line utility on Windows Mobile if you include it in your package.
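The drop-indexes-then-reload idea above can be sketched roughly as follows. This is only a sketch under assumptions: it uses Python's standard sqlite3 module (a thin wrapper over the same SQLite library, so the SQL and PRAGMA statements carry over to C# or C unchanged), the helper name bulk_load is hypothetical, and it inserts only three of the inventory columns from the question, leaving the rest to their defaults.

```python
import sqlite3

# Index definitions copied from the question; they are re-created after the load.
INDEXES = {
    "idx_inventory_categoryid":
        "CREATE INDEX idx_inventory_categoryid ON inventory (pdacategoryid)",
    "idx_inventory_invitem":
        "CREATE INDEX idx_inventory_invitem ON inventory (invitem)",
    "idx_inventory_itemtype":
        "CREATE INDEX idx_inventory_itemtype ON inventory (itemtype)",
}


def bulk_load(conn, rows):
    """Insert (invitem, itemtype, pdacategoryid) tuples with the indexes dropped.

    Hypothetical helper. Call it while no transaction is open on `conn`.
    """
    # Risky: with the rollback journal off, a crash mid-load can corrupt the file.
    conn.execute("PRAGMA journal_mode = OFF")
    for name in INDEXES:
        conn.execute(f"DROP INDEX IF EXISTS {name}")
    with conn:  # all inserts are committed as a single transaction
        conn.executemany(
            "INSERT INTO inventory (invitem, itemtype, pdacategoryid) "
            "VALUES (?, ?, ?)",
            rows,
        )
    # Building each index once at the end avoids updating it on every insert.
    for ddl in INDEXES.values():
        conn.execute(ddl)
```

Whether this actually beats loading with the indexes in place depends on the device and the data, so it is worth timing both ways.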

score 6 · Accepted Answer

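I don't think that attaching the two databases and running INSERT INTO foo SELECT * FROM bar is the fastest way to do this. If you are syncing between a handheld device and a server (or another device), could the transport mechanism be the bottleneck? Or are the two database files already on the same filesystem? If the filesystem on the device is slower flash memory, could that be the bottleneck?

Are you able to compile and run the raw SQLite C code on your device? (I think the raw sqlite3 amalgamation should compile for WinCE/Mobile.) If so, are you willing:

To write some C code (using the SQLite C API)
To increase the risk of data loss by turning off disk journaling

It should be possible to write a small stand-alone executable that copies/synchronizes the 100K records between the two databases extremely quickly.

I have posted some of what I learned about optimizing SQLite inserts here: Improve INSERT-per-second performance of SQLite?

Edit: Tried this with real code...

I don't know all the steps involved in building a Windows Mobile executable, but the SQLite3 amalgamation should compile out of the box with Visual Studio. Here is a sample main.c program that opens two SQLite databases (both must have the same schema - see the #define TABLE statement), executes a SELECT statement, and then binds the resulting rows to an INSERT statement: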

/*************************************************************
** The author disclaims copyright to this source code.  In place of
** a legal notice, here is a blessing:
**
**    May you do good and not evil.
**    May you find forgiveness for yourself and forgive others.
**    May you share freely, never taking more than you give.
**************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string.h>
#include "sqlite3.h"

#define SOURCEDB "C:\\source.sqlite"
#define DESTDB "c:\\dest.sqlite"

#define TABLE "CREATE TABLE IF NOT EXISTS TTC (id INTEGER PRIMARY KEY, Route_ID TEXT, Branch_Code TEXT, Version INTEGER, Stop INTEGER, Vehicle_Index INTEGER, Day Integer, Time TEXT)"
#define BUFFER_SIZE 256

int main(int argc, char **argv) {

    sqlite3 * sourceDB;
    sqlite3 * destDB;

    sqlite3_stmt * insertStmt;
    sqlite3_stmt * selectStmt;

    const char * insertTail = 0; /* sqlite3_prepare_v2 expects const char ** for the tail */
    const char * selectTail = 0;

    int n = 0;
    int result = 0;
    char * sErrMsg = 0;
    clock_t cStartClock;

    char sInsertSQL [BUFFER_SIZE] = "\0";
    char sSelectSQL [BUFFER_SIZE] = "\0";

    /* Open the Source and Destination databases */
    sqlite3_open(SOURCEDB, &sourceDB);
    sqlite3_open(DESTDB, &destDB);

    /* Risky - but improves performance */
    sqlite3_exec(destDB, "PRAGMA synchronous = OFF", NULL, NULL, &sErrMsg);
    sqlite3_exec(destDB, "PRAGMA journal_mode = MEMORY", NULL, NULL, &sErrMsg);

    cStartClock = clock(); /* Keep track of how long this took*/

    /* Prepared statements are much faster */
    /* Compile the Insert statement */
    sprintf(sInsertSQL, "INSERT INTO TTC VALUES (NULL, @RT, @BR, @VR, @ST, @VI, @DT, @TM)");
    sqlite3_prepare_v2(destDB, sInsertSQL, BUFFER_SIZE, &insertStmt, &insertTail);

    /* Compile the Select statement */
    sprintf(sSelectSQL, "SELECT * FROM TTC LIMIT 100000");
    sqlite3_prepare_v2(sourceDB, sSelectSQL, BUFFER_SIZE, &selectStmt, &selectTail);

    /* Transaction on the destination database */
    sqlite3_exec(destDB, "BEGIN TRANSACTION", NULL, NULL, &sErrMsg);

    /* Execute the Select Statement.  Step through the returned rows and bind
    each value to the prepared insert statement.  Obviously this is much simpler
    if the columns in the select statement are in the same order as the columns
    in the insert statement */
    result = sqlite3_step(selectStmt);
    while (result == SQLITE_ROW)
    {

        sqlite3_bind_text(insertStmt, 1, sqlite3_column_text(selectStmt, 1), -1, SQLITE_TRANSIENT); /* Get Route */
        sqlite3_bind_text(insertStmt, 2, sqlite3_column_text(selectStmt, 2), -1, SQLITE_TRANSIENT); /* Get Branch */
        sqlite3_bind_text(insertStmt, 3, sqlite3_column_text(selectStmt, 3), -1, SQLITE_TRANSIENT); /* Get Version */
        sqlite3_bind_text(insertStmt, 4, sqlite3_column_text(selectStmt, 4), -1, SQLITE_TRANSIENT); /* Get Stop Number */
        sqlite3_bind_text(insertStmt, 5, sqlite3_column_text(selectStmt, 5), -1, SQLITE_TRANSIENT); /* Get Vehicle */
        sqlite3_bind_text(insertStmt, 6, sqlite3_column_text(selectStmt, 6), -1, SQLITE_TRANSIENT); /* Get Date */
        sqlite3_bind_text(insertStmt, 7, sqlite3_column_text(selectStmt, 7), -1, SQLITE_TRANSIENT); /* Get Time */

        sqlite3_step(insertStmt);       /* Execute the SQL Insert Statement (Destination Database)*/
        sqlite3_clear_bindings(insertStmt); /* Clear bindings */
        sqlite3_reset(insertStmt);      /* Reset VDBE */

        n++;

        /* Fetch the next row from the source database */
        result = sqlite3_step(selectStmt);

    }

    sqlite3_exec(destDB, "END TRANSACTION", NULL, NULL, &sErrMsg);

    printf("Transferred %d records in %4.2f seconds\n", n, (clock() - cStartClock) / (double)CLOCKS_PER_SEC);

    sqlite3_finalize(selectStmt);
    sqlite3_finalize(insertStmt);

    /* Close both databases */
    sqlite3_close(destDB);
    sqlite3_close(sourceDB);

    return 0;
}

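On my Windows desktop machine this code copies 100k records from source.sqlite to dest.sqlite in 1.20 seconds. I don't know what kind of performance you will see on a mobile device with flash memory (but I am curious).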

score 4 · Accepted Answer

I'm on the move right now, so I can't post a very detailed answer, but this might be worth reading:

http://sqlite.org/cvstrac/wiki?p=SpeedComparison

As you can see, SQLite 3 performs INSERTs a lot faster when using indexes and/or transactions. Also, INSERT ... FROM SELECT does not seem to be SQLite's strong suit.

score 1 · Accepted Answer

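INSERT INTO ... SELECT * from attached databases is the fastest option available in SQLite. There are a few things to look into.

Transactions. Make sure the whole thing is inside a transaction. This is really critical. If it is just one SQL statement then it doesn't matter, but you said the journal grows "step by step", which suggests it is more than one statement.
Triggers. Do you have triggers running? Those would obviously affect performance.
Constraints. Do you have unnecessary constraints? You can't disable them or drop and re-add them, so if they are necessary there is nothing you can do about them, but it is something to consider.

You already mentioned turning off the indexes.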
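As a rough illustration of the transaction point, here is a minimal sketch using Python's standard sqlite3 module (the SQL is the same from any SQLite binding). The function name copy_table is hypothetical, and the DELETE before the INSERT is an assumption: it treats the sync as a full refresh of the destination table, which the question does not confirm.

```python
import sqlite3


def copy_table(dest_path, source_path, table="inventory"):
    """Refresh `table` in the destination file from the attached source file.

    Hypothetical helper; `table` must be a trusted name because it is
    interpolated into the SQL text.
    """
    # isolation_level=None means autocommit, so BEGIN/COMMIT below are explicit.
    conn = sqlite3.connect(dest_path, isolation_level=None)
    try:
        # ATTACH must run outside a transaction.
        conn.execute("ATTACH DATABASE ? AS sync", (source_path,))
        conn.execute("BEGIN")
        try:
            # Assumption: full refresh, so old rows are removed first.
            conn.execute(f"DELETE FROM main.{table}")
            conn.execute(f"INSERT INTO main.{table} SELECT * FROM sync.{table}")
            conn.execute("COMMIT")
        except Exception:
            conn.execute("ROLLBACK")
            raise
        return conn.execute(f"SELECT COUNT(*) FROM main.{table}").fetchone()[0]
    finally:
        conn.close()
```

Because both statements sit between one BEGIN and one COMMIT, the journal is written and synced once for the whole copy rather than once per statement.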

score 1 · Accepted Answer

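Do all 100,000 records change very often? Or is it only a subset that changes?

If it is a subset, you should consider adding an updated_since_last_sync column that gets flagged whenever a row is updated, so that during the next sync you only copy the records that have actually changed. Once the records are copied, you set the flag column back to zero.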
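A minimal sketch of that flag-based sync, again in Python's standard sqlite3 module. Several things here are assumptions for illustration: the source database is attached as sync (as in the question), only three columns are copied, the function name sync_changed is hypothetical, and INSERT OR REPLACE is used so that a changed row overwrites the older copy with the same id.

```python
import sqlite3


def sync_changed(conn):
    """Copy only flagged rows from sync.inventory into main.inventory.

    Hypothetical helper. Returns the number of rows that were flagged.
    """
    flagged = conn.execute(
        "SELECT COUNT(*) FROM sync.inventory WHERE updated_since_last_sync = 1"
    ).fetchone()[0]
    with conn:  # copy and flag reset commit together, or roll back together
        conn.execute(
            "INSERT OR REPLACE INTO main.inventory (id, invitem, cost) "
            "SELECT id, invitem, cost FROM sync.inventory "
            "WHERE updated_since_last_sync = 1"
        )
        conn.execute(
            "UPDATE sync.inventory SET updated_since_last_sync = 0 "
            "WHERE updated_since_last_sync = 1"
        )
    return flagged
```

Something still has to set the flag on every update in the source database, for example the application code or a trigger; that part is not shown here.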

score 0 · Accepted Answer

Send only the delta. That is, send only the differences. That is, send only what has changed.

Answered 2010-01-23T22:56:38.613

score 0 · Accepted Answer

What about storing the sync.table database table within a separate file? That way you just need to make a copy of that file in order to sync. I bet that's way faster than syncing by SQL.
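A file-level copy along those lines might look like the sketch below, in Python for brevity. It assumes that no connection is writing to either file during the copy and that the source has no pending journal or WAL file next to it; the function name sync_by_file_copy is hypothetical.

```python
import shutil
import sqlite3


def sync_by_file_copy(source_path, dest_path):
    """Replace the destination database with a byte-for-byte copy of the source.

    Hypothetical helper. Returns the result of PRAGMA integrity_check on the copy.
    """
    # Assumption: nothing has either file open for writing right now.
    shutil.copyfile(source_path, dest_path)
    # Sanity check: open the copy and let SQLite verify its structure.
    conn = sqlite3.connect(dest_path)
    try:
        return conn.execute("PRAGMA integrity_check").fetchone()[0]
    finally:
        conn.close()
```

This only works when the synced table can live in a file of its own, since the copy replaces everything in the destination file.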

score 0 · Accepted Answer

If you haven't already, you need to wrap this in a transaction. It makes a significant speed difference.

Answered 2010-02-03T16:07:03.207
