
Good afternoon. I'm not very familiar with SQLite, so I haven't touched any of the database's settings. I'm much more comfortable with SQL Server, Oracle, and even some Access and MySQL. Currently, I'm taking a file with 110,000+ records and reading it line by line, parsing the data, and running an insert statement into a table; which table depends on the record type, which is the first field of the line. I'm loading it right now, and it has been running for 12 minutes (as I write this) and has only imported 14,000 records. Doing the math, that means it will take somewhere between 1 hour 15 minutes and 1 hour 30 minutes, depending on how the rest of my system behaves at the time. Because there are different record types, I can't do a bulk insert, even if there is a SQLite option for that (I'm not sure there is). This runs as a background worker. Below are the function that pulls and parses the data and the functions that insert it into the database. Keep in mind this is a C# application laid out MVC-style (it was like that when I took it over, and I don't have time to restructure it):

MainForm.cs background worker functions

#region Background Worker Functions

    #region private void InitializeBackgroundWorker()
    /*************************************************************************************
    *************************************************************************************/
    private void InitializeBackgroundWorker()
    {
        backgroundWorker.DoWork +=
            new DoWorkEventHandler(backgroundWorker1_DoWork);
        backgroundWorker.RunWorkerCompleted +=
            new RunWorkerCompletedEventHandler(
        backgroundWorker1_RunWorkerCompleted);
        backgroundWorker.ProgressChanged +=
            new ProgressChangedEventHandler(
        backgroundWorker1_ProgressChanged);
    }
    #endregion

/*****************************************************************************************************************************************************************************************************/

    #region private void backgroundWorker1_DoWork(object sender, DoWorkEventArgs e)
    /*************************************************************************************
    *************************************************************************************/
    private void backgroundWorker1_DoWork(object sender, DoWorkEventArgs e)
    {
        // Get the BackgroundWorker that raised this event.
        BackgroundWorker worker = sender as BackgroundWorker;

        // Assign the result of the computation
        // to the Result property of the DoWorkEventArgs
        // object. This will be available to the
        // RunWorkerCompleted event handler.

        //Creates a static singleton file list.  Remains on the stack and can be accessed anywhere without
        // re-instantiating
        object[] obj = (object[])e.Argument;
        string fileName = obj[0].ToString();
        DataController controller = new DataController(worker, e);
        controller.FileName = fileName;
        try
        {
            if (strProcess == "Import")
            {
                controller.Import();
            }
            else if (strProcess == "Export")
            {
                controller.ExportToExcel();
            }
        }
        catch (Exception ex)
        {
            MessageBox.Show(ex.Message.ToString());
        }
    }
    #endregion

/*****************************************************************************************************************************************************************************************************/

    #region private void backgroundWorker1_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e)
    /*************************************************************************************
    *************************************************************************************/
    private void backgroundWorker1_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e)
    {
        if (e.Error != null)
        {
            MessageBox.Show(e.Error.StackTrace);
        }
        else
        {
            this.toolStripStatusLabel1.Text = "Import complete";
            generateReport();
            treeViewFigure.Nodes.Clear();
            BuildTree();
            treeViewFigure.TopNode.ExpandAll();
            labelIPBNumber.Text = controller.IPBNumber;
            this.Text += "IPB: " + labelIPBNumber.Text;

            cmbIndentureLevel.Items.Clear();
        }
    }
    #endregion

/*****************************************************************************************************************************************************************************************************/

    #region private void backgroundWorker1_ProgressChanged(object sender, ProgressChangedEventArgs e)
    /*************************************************************************************
    *************************************************************************************/
    private void backgroundWorker1_ProgressChanged(object sender, ProgressChangedEventArgs e)
    {
        string stat = e.UserState.ToString();
        this.toolStripStatusLabel1.Text = "";
        this.toolStripStatusLabel1.Text = stat;
        this.toolStripProgressBar1.Value = e.ProgressPercentage;
    }
    #endregion

#endregion

Importer.cs Import function

    #region public void Import(string fileName)
    /*************************************************************************************
    *************************************************************************************/
    public void Import(string fileName)
    {
        if (!File.Exists(fileName))
        {
            throw new FileNotFoundException();
        }

        StreamReader read = File.OpenText(fileName);
        List<RecordBase> List = new List<RecordBase>();
        DataFactory factory = DataFactory.BuildFactory();

        int nLines = 0;

        // First pass over the file: count the lines so progress can be reported as a percentage.
        while (!read.EndOfStream)
        {
            read.ReadLine();
            nLines++;
        }

        read.Close();
        read = File.OpenText(fileName);

        factory.lstObservers = _observers;
        factory.ClearDB();

        int count = 1;

        while (!read.EndOfStream)
        {
            // Split the pipe-delimited line and drop the last (trailing) field.
            string[] fields = read.ReadLine().Split('|');
            List<string> lstStr = new List<string>();
            foreach (string str in fields)
            {
                lstStr.Add(str);
            }

            lstStr.RemoveAt(fields.Length - 1);
            fields = lstStr.ToArray();

            string strValues = string.Join("','", fields);
            strValues = "'" + strValues + "'";
            if (fields.Length >= 39 && fields[0] == "03")
            {
                factory.ImportTaggedRecord(fields[38], count);
                int nIndex = strValues.IndexOf(fields[38]);
                strValues = strValues.Substring(0, nIndex - 2);
            }

            factory.ImportIPB(strValues, fields[0], count);

            progress.ProgressComplete = (count * 100) / nLines;
            progress.Message = "Importing Record: " + count++.ToString();
            Notify();
        }
    }
    #endregion

DataFactory.cs ImportIPB function

    #region public void ImportIPB(string strValues, string strType)
    /*************************************************************************************
    *************************************************************************************/
    public void ImportIPB(string strValues, string strType, int nPosition)
    {
        string strCommand = string.Empty;

        switch (strType)
        {
            case "01":
                strCommand = Queries.strIPBInsert;
                break;
            case "02":
                strCommand = Queries.strFigureInsert;
                break;
            case "03":
                strCommand = Queries.strPartInsert;
                break;
        }

        ExecuteNonQuery(strCommand + strValues + ", " + nPosition.ToString() + ")");
    }
    #endregion

Database.cs ExecuteNonQuery method

    #region public void ExecuteNonQuery(string strSQL)
    /*************************************************************************************
    *************************************************************************************/
    public void ExecuteNonQuery(string strSQL)
    {
        DbCommand dbCommand = _dbConnection.CreateCommand();
        dbCommand.CommandText = strSQL;
        dbCommand.Prepare();
        dbCommand.ExecuteNonQuery();
    }
    #endregion

Can anyone see anything in what I've posted that could be improved? Are there any settings on the background worker that can be changed to make it work faster, and does it have defaults I should know about? Are there any settings in the db file (I'm using SQLite Expert Personal) that could be changed to speed up the inserts? Or is it simply the size of my file? As I finish writing this, 22 minutes have now passed and it has done 24,000 records. This isn't time-sensitive, so take all the time you need. Thanks.

Update: I should also mention that on one of the tables I have an integer primary key (acting as an identity field). Could that be causing any performance issues?


4 Answers


Use a single SQLiteTransaction around the entire set of inserts. As it stands, SQLite forces a flush to the file after every insert to stay ACID compliant. As with any DbConnection and DbTransaction, you call BeginTransaction and, once you're done, Commit. The entire import will either succeed or fail as a unit, and it will perform much better.
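
Below is a minimal sketch of that pattern with the System.Data.SQLite provider. The connection string, table, and column names are placeholders rather than the asker's actual schema; the points that matter are the single BeginTransaction/Commit pair around the loop and the one prepared, parameterized command that is reused for every row instead of concatenating SQL strings:

    // Requires: System.Data, System.Data.SQLite, System.IO (placeholder schema throughout)
    using (SQLiteConnection conn = new SQLiteConnection("Data Source=parts.db"))
    {
        conn.Open();
        using (SQLiteTransaction tran = conn.BeginTransaction())
        using (SQLiteCommand cmd = conn.CreateCommand())
        {
            cmd.Transaction = tran;
            cmd.CommandText =
                "INSERT INTO Part (Field1, Field2, Position) VALUES (@f1, @f2, @pos)";
            cmd.Parameters.Add("@f1", DbType.String);
            cmd.Parameters.Add("@f2", DbType.String);
            cmd.Parameters.Add("@pos", DbType.Int32);

            int position = 1;
            foreach (string line in File.ReadLines("source.dat"))
            {
                string[] fields = line.Split('|');
                cmd.Parameters["@f1"].Value = fields[0];
                cmd.Parameters["@f2"].Value = fields[1];
                cmd.Parameters["@pos"].Value = position++;
                cmd.ExecuteNonQuery();   // no per-row flush to disk inside the transaction
            }

            tran.Commit();               // one flush for the whole batch
        }
    }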

Answered 2010-08-31T18:19:42.887

The number one thing that will improve insert performance is to wrap all of the inserts in a single transaction. It will result in an orders-of-magnitude speedup for your inserts.

See here for the FAQ entry that describes this phenomenon.
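
If switching to explicit SQLiteTransaction objects is too invasive for now, the same effect can be approximated by bracketing the existing Import loop with transaction statements issued through the ExecuteNonQuery wrapper already shown in Database.cs. This is only a sketch and assumes DataFactory exposes that method and that _dbConnection stays open for the whole run:

    // Sketch: one explicit transaction around all of the existing per-row inserts.
    factory.ExecuteNonQuery("BEGIN TRANSACTION");
    try
    {
        while (!read.EndOfStream)
        {
            // ... existing parsing and factory.ImportIPB(...) calls ...
        }
        factory.ExecuteNonQuery("COMMIT");
    }
    catch
    {
        factory.ExecuteNonQuery("ROLLBACK");   // with the default journal mode this undoes the partial load
        throw;
    }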

Answered 2010-08-31T18:22:09.270

FWIW, the command-line client for SQLite has a built-in data-loading command. But if you read the C source of that client, you'd see it doesn't do anything special; it just reads your data file line by line and executes an INSERT in a loop.

Other answers have suggested using explicit transactions so you can avoid the overhead of flushing I/O after each row. I agree with that advice; it will certainly give a huge benefit.

You can also disable the rollback journal:

PRAGMA journal_mode = OFF

Or set writes to asynchronous, allowing the operating system to buffer I/O:

PRAGMA synchronous = OFF

These pragma changes should save significant I/O overhead. But without a rollback journal, the ROLLBACK command won't work, and if your application crashes during an in-progress transaction, your database might be corrupted. Without synchronous writes, an operating system failure could also result in lost data.

Not trying to scare you, but you should know that there is a tradeoff between performance and guaranteed I/O integrity. I recommend operating with safety modes enabled most of the time, and disable them only briefly when you need to do a big data load like you're doing -- then remember to re-enable the safety modes!
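
In C#, that could look roughly like the sketch below. It reuses the ExecuteNonQuery helper from the question (an assumption; any command executed on the same open connection works) and restores SQLite's defaults, journal_mode = DELETE and synchronous = FULL, once the load finishes:

    // Sketch: relax durability only for the bulk load, then restore the defaults.
    ExecuteNonQuery("PRAGMA journal_mode = OFF");
    ExecuteNonQuery("PRAGMA synchronous = OFF");
    try
    {
        // ... run the whole import here, ideally inside a single transaction ...
    }
    finally
    {
        ExecuteNonQuery("PRAGMA synchronous = FULL");
        ExecuteNonQuery("PRAGMA journal_mode = DELETE");
    }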

Answered 2010-08-31T18:38:57.950

I've been experimenting, and I've found that the fastest way to import a large data set into SQLite from C# is actually to dump it to CSV and then import the CSV with the command-line sqlite3.exe tool.

Inserting a large file of roughly 25 million rows, on my laptop:

Optimized inserts through the .NET wrapper: 30 minutes (optimized with a transaction, parameterized commands, journaling off, and so on)

Dump to CSV (2 minutes), then import the CSV with sqlite3.exe (5 minutes): 7 minutes total
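
For reference, a rough sketch of that workflow from C# is below. sqlite3.exe, the file names, and the table name Part are placeholders, the fields are assumed to contain no commas or quotes, and it relies on the sqlite3 shell's .mode csv and .import commands:

    // Requires: System.Diagnostics, System.IO, System.Linq
    // Convert the pipe-delimited source file to a simple CSV (no quoting/escaping handled).
    var rows = File.ReadLines("source.dat")
                   .Select(line => string.Join(",", line.Split('|')));
    File.WriteAllLines("parts.csv", rows);

    // Drive the sqlite3 command-line shell to bulk-load the CSV into the table.
    ProcessStartInfo psi = new ProcessStartInfo("sqlite3.exe", "parts.db")
    {
        RedirectStandardInput = true,
        UseShellExecute = false
    };
    using (Process p = Process.Start(psi))
    {
        p.StandardInput.WriteLine(".mode csv");
        p.StandardInput.WriteLine(".import parts.csv Part");
        p.StandardInput.WriteLine(".quit");
        p.WaitForExit();
    }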

Answered 2012-09-05T17:04:16.370