vb.net - LINQ SubmitChanges runs out of memory

Question

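I have a database with about 180,000 records, and I'm trying to attach a PDF file to each of them. Each PDF is about 250 KB. However, after about a minute my program is using roughly a GB of memory and I have to stop it. I tried removing the reference to each LINQ object once it has been updated, but that doesn't seem to help. How can I make it release the references?

Thanks for your help.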

Private Sub uploadPDFs(ByVal args() As String)
    Dim indexFiles = (From indexFile In dataContext.IndexFiles
                     Where indexFile.PDFContent = Nothing
                     Order By indexFile.PDFFolder).ToList
    Dim currentDirectory As IO.DirectoryInfo
    Dim currentFile As IO.FileInfo
    Dim tempIndexFile As IndexFile

    While indexFiles.Count > 0
        tempIndexFile = indexFiles(0)
        indexFiles = indexFiles.Skip(1).ToList
        currentDirectory = 'I set the directory that I need
        currentFile = 'I get the file that I need
        writePDF(currentDirectory, currentFile, tempIndexFile)
    End While
End Sub

Private Sub writePDF(ByVal directory As IO.DirectoryInfo, ByVal file As IO.FileInfo, ByVal indexFile As IndexFile)
    Dim bytes() As Byte
    bytes = getFileStream(file)
    indexFile.PDFContent = bytes
    dataContext.SubmitChanges()
    counter += 1
    If counter Mod 10 = 0 Then Console.WriteLine("     saved file " & file.Name & " at " & directory.Name)
End Sub


Private Function getFileStream(ByVal fileInfo As IO.FileInfo) As Byte()
    Dim fileStream = fileInfo.OpenRead()
    Dim bytesLength As Long = fileStream.Length
    Dim bytes(bytesLength) As Byte

    fileStream.Read(bytes, 0, bytesLength)
    fileStream.Close()

    Return bytes
End Function

score 4 · Accepted Answer

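I'd suggest you do this in batches, using Take (before the call to ToList) to process a fixed number of items at a time. Read (say) 10, set PDFContent on all of them, call SubmitChanges, and then start again. (I'm not sure offhand whether you should start with a new DataContext at that point, but it's probably cleanest to do so.)

As an aside, your code for reading the contents of a file is broken in at least a couple of ways - but it would be simpler to just use File.ReadAllBytes in the first place.

Also, the way you handle the gradually shrinking list is really inefficient - after fetching 180,000 records, you build a new list with 179,999 records, then another with 179,998 records, and so on.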
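The batching described above could look roughly like the sketch below. It is only an illustration under assumptions, not the asker's actual schema: the `Id` key, the `PDFPath` column holding the file location, the `connectionString` parameter and the `BatchSize` of 10 are all hypothetical, and it assumes every pending record has a readable file on disk. It pages by key (`Id > lastId`) instead of shrinking a list, reads each file with `File.ReadAllBytes`, and uses a fresh `DataContext` per batch.

```vb
Imports System.Data.Linq
Imports System.Data.Linq.Mapping
Imports System.IO
Imports System.Linq

' Hypothetical mapping: Id and PDFPath are assumptions, not columns from the question.
<Table(Name:="IndexFiles")>
Public Class IndexFile
    <Column(IsPrimaryKey:=True)> Public Id As Integer
    <Column()> Public PDFPath As String
    <Column()> Public PDFContent As Binary
End Class

Module BatchUpload
    Const BatchSize As Integer = 10

    Sub UploadPdfs(connectionString As String)
        Dim lastId As Integer = 0
        Do
            Dim processed As Integer
            ' A new context per batch: once it is disposed, nothing keeps the
            ' tracked entities (and their PDF bytes) reachable.
            Using db As New DataContext(connectionString)
                Dim batch = db.GetTable(Of IndexFile)().
                    Where(Function(f) f.PDFContent Is Nothing AndAlso f.Id > lastId).
                    OrderBy(Function(f) f.Id).
                    Take(BatchSize).
                    ToList()

                For Each record In batch
                    record.PDFContent = New Binary(File.ReadAllBytes(record.PDFPath))
                    lastId = record.Id
                Next

                ' One round trip per batch instead of one per record.
                db.SubmitChanges()
                processed = batch.Count
            End Using
            If processed = 0 Then Exit Do
        Loop
    End Sub
End Module
```

Paging on the key means a record that fails to update is not fetched again on the next pass, so the loop always terminates.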

score 0 · Accepted Answer

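OK. To use the least amount of memory, we have to update the data context in blocks. I've put some sample code below. It may contain syntax errors, since I typed it in Notepad.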

    Dim DB As New YourDataContext
    Dim BlockSize As Integer = 25
    ' Materialise the pending records once so they can be indexed below
    Dim AllItems = DB.Items.Where(Function(i) i.PDFfile Is Nothing).ToList()

    Dim count = 0
    Dim tmpDB As New YourDataContext

    While count < AllItems.Count

        ' Copy the ID to a local so the query only captures a simple value
        Dim id = AllItems(count).recordID
        Dim _item = tmpDB.Items.Single(Function(i) i.recordID = id)
        _item.PDFfile = GetPDF()

        count += 1

        ' Flush every BlockSize records (and once at the very end), then start a fresh context
        If count Mod BlockSize = 0 OrElse count = AllItems.Count Then
            tmpDB.SubmitChanges()
            tmpDB.Dispose()
            tmpDB = New YourDataContext
            GC.Collect()
        End If

    End While

To optimise the speed further, you can fetch just the record IDs from AllItems into an array (as an anonymous type), and turn on delay loading for the PDF field.
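A minimal sketch of the "IDs only" idea, assuming a hypothetical `Item` mapping with `recordID` and `PDFfile` columns and a `connectionString` parameter. It projects to plain integers rather than an anonymous type, and switches object tracking off on the context used for reading. A context with tracking disabled is read-only, so the updates themselves still need a separate, tracking-enabled context.

```vb
Imports System.Data.Linq
Imports System.Data.Linq.Mapping
Imports System.Linq

' Hypothetical mapping used only for this illustration.
<Table(Name:="Items")>
Public Class Item
    <Column(IsPrimaryKey:=True)> Public recordID As Integer
    <Column()> Public PDFfile As Binary
End Class

Module PendingIds
    Function GetPendingIds(connectionString As String) As Integer()
        Using db As New DataContext(connectionString)
            ' Must be set before the first query runs on this context.
            db.ObjectTrackingEnabled = False

            ' Only the key column is selected, so no entity objects are
            ' materialised or tracked.
            Return db.GetTable(Of Item)().
                Where(Function(i) i.PDFfile Is Nothing).
                Select(Function(i) i.recordID).
                ToArray()
        End Using
    End Function
End Module
```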

score 0 · Accepted Answer

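Does the DataContext have ObjectTrackingEnabled set to true (the default)? If so, it will try to keep a record of essentially all the data it touches, which prevents the garbage collector from collecting any of it.

In that case, you should be able to fix the problem by periodically disposing of the DataContext and creating a new one, or by turning object tracking off.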
