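In VS 2010, I have a large list of strings, and each item in the list also contains a list of strings (it doesn't go any deeper than that). The good news is that items are only ever added. Nothing is ever removed from the list.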

I don't want to use a database. Since the list may get very large, XML seems slow to me. I couldn't find any general solution that fits my case. Any ideas?

Edit: OK, I guess some of my code will make this clearer.

Class Word
    Public theWord As String
    Public SubWords As New List(Of SubWord)
    Public Count As Integer = 1
    Sub New(ByRef Word As String)
        theWord = Word
    End Sub
    Public Sub AddSubWord(ByRef Word As String)
        Dim SubWordCount As Integer = SubWords.Count - 1
        Dim Found As Boolean
        For i = 0 To SubWordCount
            If SubWords(i).theWord = Word Then
                SubWords(i).Count += 1
                Found = True
                Exit For
            End If
        Next
        If Found = False Then
            SubWords.Add(New SubWord(Word))
        End If
    End Sub
    Public Overrides Function ToString() As String
        Return theWord
    End Function
End Class

Class SubWord
    Public theWord As String
    Public Count As Integer = 1
    Sub New(ByRef Word As String)
        theWord = Word
    End Sub
    Public Overrides Function ToString() As String
        Return theWord
    End Function
End Class

And my list is:

Dim Words As New List(Of Word)

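The goal is to add a word to the list if it isn't already there, and otherwise to increment its count. The same goes for subwords. Later, all the lists will be sorted by count. There will be a lot of words, and each word will have a huge list of subwords.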

1 Answer

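XML does seem like the best choice, but if you're really concerned about efficiency, and you're sure the data structure won't change in the future, you could simply store the data in a delimited text file. For example: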

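Private Sub SaveList(filePath As String, list As List(Of List(Of String)))
    ' Requires Imports System.IO at the top of the file.
    Const fieldDelimiter As String = ","
    ' Environment.NewLine is not a compile-time constant, so it cannot be a Const.
    Dim recordDelimiter As String = Environment.NewLine
    Dim temp As New List(Of String)()
    For Each record As List(Of String) In list
        ' Join the fields of one record into a single line.
        temp.Add(String.Join(fieldDelimiter, record.ToArray()))
    Next
    ' Join all lines and write them in a single call.
    Dim contents As String = String.Join(recordDelimiter, temp.ToArray())
    File.WriteAllText(filePath, contents)
End Sub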

Or, more efficiently:

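Private Sub SaveList(filePath As String, list As List(Of List(Of String)))
    ' Requires Imports System.IO at the top of the file.
    Const fieldDelimiter As String = ","
    ' Environment.NewLine is not a compile-time constant, so it cannot be a Const.
    Dim recordDelimiter As String = Environment.NewLine
    ' Stream the output instead of building the whole file in memory.
    Using writer As New StreamWriter(filePath)
        Dim firstRecord As Boolean = True
        For Each record As List(Of String) In list
            ' Write a record delimiter before every record except the first.
            If firstRecord Then
                firstRecord = False
            Else
                writer.Write(recordDelimiter)
            End If
            Dim firstField As Boolean = True
            For Each field As String In record
                ' Write a field delimiter before every field except the first.
                If firstField Then
                    firstField = False
                Else
                    writer.Write(fieldDelimiter)
                End If
                writer.Write(field)
            Next
        Next
    End Using
End Sub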
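
The code above covers only the save side. As a rough sketch of the matching load, assuming the same comma and newline delimiters and that no field contains either of them, something like the following could work. LoadList and ParseRecord are names made up here for illustration:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module DelimitedLoader
    ' Hypothetical helper: splits one line of the file into its fields.
    ' Assumes the delimiter never appears inside a field.
    Public Function ParseRecord(line As String, fieldDelimiter As Char) As List(Of String)
        Return New List(Of String)(line.Split(fieldDelimiter))
    End Function

    ' Hypothetical counterpart to SaveList: one record per line.
    Public Function LoadList(filePath As String) As List(Of List(Of String))
        Dim result As New List(Of List(Of String))()
        For Each line As String In File.ReadAllLines(filePath)
            result.Add(ParseRecord(line, ","c))
        Next
        Return result
    End Function
End Module
```

One thing to watch for: a record that was saved with no fields at all comes back as a list holding a single empty string, because an empty line still splits into one (empty) field.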

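The downside of this approach is that you need to make sure the delimiters you use never appear in any field of any record. If you're sure your strings will never contain some unusual character, you can use that as the delimiter. Otherwise, the other option is to escape any occurrences. So, for example, if you use a comma as your delimiter, you'd need to replace all occurrences of \ with \\ first, and then replace all occurrences of , with \, (doing it in the other order would double the backslashes you just added for the commas). Of course, that complicates not only your save logic, but your load logic as well.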
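
To give an idea of what the escaping might look like, here is a minimal sketch, assuming a comma as the field delimiter and a backslash as the escape character. EscapeField and SplitEscaped are hypothetical helper names, not part of the code above, and newlines inside a field are not handled:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text

Module FieldEscaping
    ' Escapes one field before it is written.
    ' Backslashes are doubled first so that the backslashes
    ' added in front of commas are not doubled again.
    Public Function EscapeField(field As String) As String
        Return field.Replace("\", "\\").Replace(",", "\,")
    End Function

    ' Splits one line on unescaped commas and removes the escaping.
    Public Function SplitEscaped(line As String) As List(Of String)
        Dim fields As New List(Of String)()
        Dim current As New StringBuilder()
        Dim escaped As Boolean = False
        For Each c As Char In line
            If escaped Then
                ' The previous character was a backslash: take this one literally.
                current.Append(c)
                escaped = False
            ElseIf c = "\"c Then
                escaped = True
            ElseIf c = ","c Then
                ' An unescaped comma ends the current field.
                fields.Add(current.ToString())
                current.Length = 0
            Else
                current.Append(c)
            End If
        Next
        fields.Add(current.ToString())
        Return fields
    End Function
End Module
```

On save you would write EscapeField(field) instead of the raw field, and on load you would call SplitEscaped on each line instead of a plain Split.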

Update

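If speed is your main concern, and you can guarantee that both words and subwords are always shorter than 100 characters, then the fastest way to read and write the data would be to write each word on a new line of a text file, followed by each of its subwords, using fixed-width fields. For example, if your maximum length were 5, the file might look like this: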

Word Sub1 Sub2
W2   SW1  SW2  SW3
W3
W4   SubWdSub2.

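As you can see in that example, there are four words ("Word", "W2", "W3", and "W4"), each with a different number of subwords. The subwords of "Word" are "Sub1" and "Sub2". "W3" has no subwords, and "W4" has two ("SubWd" and "Sub2.").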

So, to write out that file, you could do something like this:

Private Sub SaveWords(filePath As String, words As List(Of Word))
    Const maxLength As Integer = 100
    Using writer As New StreamWriter(filePath)
        Dim firstWord As Boolean = True
        For Each w As Word in words
            If firstWord Then
                firstWord = False
            Else
                writer.WriteLine()
            End If
            writer.Write(w.theWord.PadRight(maxLength))
            For Each s As SubWord In w.SubWords
                writer.Write(s.theWord.PadRight(maxLength))
            Next
        Next
    End Using
End Sub
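
Reading the file back is the mirror image: cut each line into chunks of the field width and trim the padding. The following is only a sketch under the assumption that no word or subword ends with spaces of its own (TrimEnd would remove them). SplitFixedWidth and LoadFixedWidth are hypothetical names, and the result is returned as plain lists of strings, with the word first and its subwords after it, rather than as Word objects:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module FixedWidthLoader
    ' Cuts one line into fields of fieldWidth characters and trims the
    ' padding that PadRight added when the file was written.
    Public Function SplitFixedWidth(line As String, fieldWidth As Integer) As List(Of String)
        Dim fields As New List(Of String)()
        Dim pos As Integer = 0
        While pos < line.Length
            ' Guard against a short last field, e.g. if trailing spaces were stripped.
            Dim length As Integer = Math.Min(fieldWidth, line.Length - pos)
            fields.Add(line.Substring(pos, length).TrimEnd())
            pos += fieldWidth
        End While
        Return fields
    End Function

    ' Reads the whole file: one inner list per line,
    ' where item 0 is the word and the remaining items are its subwords.
    Public Function LoadFixedWidth(filePath As String, fieldWidth As Integer) As List(Of List(Of String))
        Dim result As New List(Of List(Of String))()
        For Each line As String In File.ReadAllLines(filePath)
            If line.Length > 0 Then result.Add(SplitFixedWidth(line, fieldWidth))
        Next
        Return result
    End Function
End Module
```

From there, each inner list can be turned into a Word by passing item 0 to the constructor and adding a New SubWord for each remaining item. Note that the file format shown here stores only the text, so the Count values would need their own fixed-width field if they have to survive a save and load.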
answered 2012-11-17T18:59:58.680