1

我为数组交集编写了一个简短的函数,想知道为什么一个函数比另一个函数快。

1)

Dim list2() As String 'Assume it has values'
Dim list2length As Integer = list2.length

Function newintersect(ByRef list1() As String) As String()
    Dim intersection As New ArrayList
    If (list1.Length < list2length) Then
        'use list2'
        For Each thing As String In list2
            If (Array.IndexOf(list1, thing) <> -1) Then
                intersection.Add(thing)
            End If
        Next
    Else
        'use list1'
        For Each thing As String In list1
            If (Array.IndexOf(list2, thing) <> -1) Then
                intersection.Add(thing)
            End If
        Next
    End If
    Return intersection
End Function

2)

Dim list2() As String 'Assume it has values'
Dim list2length As Integer = list2.length

Function newintersect(ByRef list1() As String) As String()
    Dim intersection As New ArrayList
    If (list1.Length > list2length) Then 'changed >'
        'use list2'
        For Each thing As String In list2
            If (Array.IndexOf(list1, thing) <> -1) Then
                intersection.Add(thing)
            End If
        Next
    Else
        'use list1'
        For Each thing As String In list1
            If (Array.IndexOf(list2, thing) <> -1) Then
                intersection.Add(thing)
            End If
        Next
    End If
    Return intersection
End Function

3)

Dim list2() As String 'Assume it has values'
Dim list2length As Integer = list2.length

Function newintersect(ByRef list1() As String) As String()
    For Each thing As String In list1
        If (Array.IndexOf(list2, thing) <> -1) Then
            intersection.Add(thing)
        End If
    Next
    Return intersection
End Function

所以对于我的测试用例,1 需要 65 秒,3 需要 63 秒,而 2 实际上需要 75 秒。有人知道为什么 3 是最快的吗?为什么 1 比 2 快?

(抱歉格式不好...似乎无法正确粘贴)

4

3 回答 3

1

这没什么区别。此外,这些方法似乎不会产生相同的结果,所以比较性能是没有意义的,对吧?

无论如何,Array.IndexOf查找项目的方法不是很有效,而且扩展性不好。如果您使用基于哈希键的集合作为查找,您应该会获得显着的改进,如下所示:

Function newintersect(ByRef list1 As String(), ByRef list2 As String()) As String()
  Dim smaller As HashSet(Of String)
  Dim larger As String()
  If list1.Length < list2.Length Then
    smaller = New HashSet(Of String)(list1)
    larger = list2
  Else
    smaller = New HashSet(Of String)(list2)
    larger = list1
  End If
  Dim intersection As New List(Of String)
  For Each item As String In larger
    If smaller.Contains(item) Then
      intersection.Add(item)
    End If
  Next
  Return intersection.ToArray()
End Function
于 2010-09-08T22:52:31.737 回答
0

我希望您会发现,使用不同的测试用例,您可以反转上面的结果并达到 2 最快而 1 和 3 更慢的情况。

在不知道测试用例的构成的情况下很难发表评论,这将取决于两个数组中“相交”项的位置——如果这些项在一个数组上趋向于靠近前面并且更靠近末尾另一个则数组迭代/ IndexOf 的嵌套顺序将具有明显不同的性能。

顺便说一句 - 有更好的方法来执行交集 - 对一个或其他数组进行排序并执行 BinarySearch 是一种方法 - 使用 Dictionary(Of String, ...) 或类似方法是另一种方法 - 两者都会导致更好的性能。

于 2010-09-08T22:31:00.523 回答
0

这是来自 MSDN 文档

    Dim id1() As Integer = {44, 26, 92, 30, 71, 38}
    Dim id2() As Integer = {39, 59, 83, 47, 26, 4, 30}

    ' Find the set intersection of the two arrays.
    Dim intersection As IEnumerable(Of Integer) = id1.Intersect(id2)

    For Each id As Integer In intersection
        Debug.WriteLine(id.ToString)
    Next
于 2010-09-09T11:43:15.993 回答