c# - Optimizing file reading in C#

Question

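I need to copy a file, parse its contents, remove line breaks, split the contents on the pipe character, and then store the resulting string[] in a database. Each of my files can contain 65,000 valid records, so performance is critical.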

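Below is what I have so far. The problem is that it is very slow (processing 65,000 lines takes 3 hours). I would appreciate any help optimizing this code so that it runs significantly faster.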

public void ReadFileLinesIntoRows()
{
    // TextFieldParser lives in the Microsoft.VisualBasic.FileIO namespace.
    using (var reader = new TextFieldParser(FileName))
    {
        reader.HasFieldsEnclosedInQuotes = false;
        reader.TextFieldType = FieldType.Delimited;
        reader.SetDelimiters("|");
        while (!reader.EndOfData)
        {
            try
            {
                string[] currentRow = reader.ReadFields();
                // Pad short rows with nulls so every row has at least 190 fields.
                if (currentRow.Length < 190)
                {
                    Array.Resize(ref currentRow, 190);
                }
                rows.Add(currentRow);
            }
            catch (MalformedLineException)
            {
                // Record the bad line and continue with the next one.
                unreadlines.Add(reader.ErrorLine);
            }
        }
        this.TotalRowCount = rows.Count();
    }
}

public void cleanfilecontent(String tempfilename, Boolean? HeaderIncluded)
{
    // Copy the file, diverting separator lines and the optional header
    // to a second file. The using blocks flush and close every stream.
    using (var sr = new StreamReader(tempfilename))
    using (var sw = new StreamWriter(CleanedCopy))
    using (var smove = new StreamWriter(duptempfileremove))
    {
        string line;
        bool skippedheader = false;
        while ((line = sr.ReadLine()) != null)
        {
            if (line.Contains("----------------------------------"))
            {
                // Separator line: divert it.
                smove.WriteLine(line);
            }
            else if (HeaderIncluded == true && !skippedheader)
            {
                // The first non-separator line is the header: divert it.
                smove.WriteLine(line);
                skippedheader = true;
            }
            else
            {
                // Keep every other line, whether or not the file has a header.
                sw.WriteLine(line);
            }
        }
    }
}

score 0 · Accepted Answer

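65,000 records is not that many. If you have enough memory, I suggest reading the whole file into memory, parsing it and building your rows, and then committing the data to the database with a bulk insert. That will be the fastest approach! I suspect most of your 3 hours is spent inserting records into the database one at a time.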
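
As a rough illustration of the approach above, here is a minimal sketch that parses the lines into an in-memory DataTable and sends it to the server with SqlBulkCopy. It assumes a SQL Server destination whose columns line up by position with the parsed fields. The class name PipeFileLoader, the column names Col0, Col1, ..., the batch size, and the choice to truncate rows longer than the column count are all assumptions made for the example; none of them come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient; // assumption: SQL Server; newer .NET uses Microsoft.Data.SqlClient
using System.IO;

// Hypothetical helper, not part of the original code.
public static class PipeFileLoader
{
    // Builds an in-memory table with a fixed number of string columns.
    // Short rows are padded with DBNull; extra fields are dropped (assumption).
    public static DataTable BuildTable(IEnumerable<string> lines, int columnCount)
    {
        var table = new DataTable();
        for (int i = 0; i < columnCount; i++)
        {
            table.Columns.Add("Col" + i, typeof(string)); // placeholder column names
        }

        foreach (string line in lines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue; // skip blank lines

            string[] fields = line.Split('|');
            DataRow row = table.NewRow(); // unset columns default to DBNull
            int count = Math.Min(fields.Length, columnCount);
            for (int i = 0; i < count; i++)
            {
                row[i] = fields[i];
            }
            table.Rows.Add(row);
        }
        return table;
    }

    // Sends the whole table in batches instead of one INSERT per row.
    // Without explicit column mappings, SqlBulkCopy maps columns by ordinal.
    public static void BulkInsert(DataTable table, string connectionString, string destinationTable)
    {
        using (var bulkCopy = new SqlBulkCopy(connectionString))
        {
            bulkCopy.DestinationTableName = destinationTable;
            bulkCopy.BatchSize = 5000; // arbitrary value for the sketch
            bulkCopy.WriteToServer(table);
        }
    }

    // Reads the file lazily, builds the table, and bulk-inserts it.
    public static void Load(string path, int columnCount, string connectionString, string destinationTable)
    {
        DataTable table = BuildTable(File.ReadLines(path), columnCount);
        BulkInsert(table, connectionString, destinationTable);
    }
}
```

With the question's 190 columns, a call might look like PipeFileLoader.Load(CleanedCopy, 190, connectionString, "dbo.ImportedRows"), where the connection string and table name are placeholders. Simple String.Split is used here because the question disables quoted fields; if fields could contain quoted pipes, a real parser would still be needed.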
