I'm currently trying to develop an application that fetches data from a web page.

Suppose the page has the following content:

<needle1>HAYSTACK 1<needle2>
<needle1>HAYSTACK 2<needle2>
<needle1>HAYSTACK 3<needle2>
<needle1>HAYSTACK 4<needle2>
<needle1>HAYSTACK 5<needle2>

I have the following VB.NET code:

Dim webClient As New System.Net.WebClient
Dim FullPage As String = webClient.DownloadString("PAGE URL HERE")
Dim ExtractedInfo As String = GetBetween(FullPage, "<needle1>", "<needle2>")

GetBetween is the following function:

Function GetBetween(ByVal haystack As String, ByVal needle As String, ByVal needle_two As String) As String
    Dim istart As Integer = InStr(haystack, needle)
    If istart > 0 Then
        Dim istop As Integer = InStr(istart, haystack, needle_two)
        If istop > 0 Then
            Dim value As String = haystack.Substring(istart + Len(needle) - 1, istop - istart - Len(needle))
            Return value
        End If
    End If
    Return Nothing
End Function

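With this code, ExtractedInfo always equals "HAYSTACK 1", because GetBetween always returns the value from the first match it finds.

My question is: how can I make ExtractedInfo work like an array, so that it also picks up the second, third, fourth, etc. occurrences?

Something like:

ExtractedInfo(1) = HAYSTACK 1
ExtractedInfo(2) = HAYSTACK 2

Thanks in advance!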

1 Answer

Edit: I think this is what you are really asking. You call the GetBetween function once for each pair of "needles".

Dim webClient As New System.Net.WebClient
Dim FullPage As String = webClient.DownloadString("PAGE URL HERE")
Dim ExtractedInfo As List(Of String) = GetBetween(FullPage, "<needle1>", "<needle2>")

Function GetBetween(ByVal haystack As String, ByVal needle As String, ByVal needle2 As String) As List(Of String)
    Dim result As New List(Of String)
    ' Split on the opening needle. Element 0 is whatever precedes the first
    ' opening needle, so it is skipped; every later chunk starts right after one.
    Dim split1 As String() = Split(haystack, needle)
    For i As Integer = 1 To split1.Length - 1
        ' The text before the closing needle is the value we want.
        ' A chunk with no closing needle yields a single element and is ignored.
        Dim split2 As String() = Split(split1(i), needle2)
        If split2.Length > 1 AndAlso Not String.IsNullOrWhiteSpace(split2(0)) Then result.Add(split2(0))
    Next i

    Return result
End Function
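
As an alternative, here is a minimal sketch that stays closer to the original GetBetween: instead of splitting, it keeps searching from the position where the previous match ended. The name GetAllBetween, the module wrapper, and the sample page string are assumptions made for illustration; they are not part of the question's code. It uses String.IndexOf with an explicit start index rather than InStr, so all positions are 0-based.

```vbnet
Imports System
Imports System.Collections.Generic

Module GetAllBetweenSketch
    ' Hypothetical helper (not from the original post): collects every value
    ' found between an opening and a closing marker, in the order they appear.
    Function GetAllBetween(ByVal haystack As String, ByVal needle As String, ByVal needle2 As String) As List(Of String)
        Dim result As New List(Of String)
        If String.IsNullOrEmpty(haystack) OrElse String.IsNullOrEmpty(needle) OrElse String.IsNullOrEmpty(needle2) Then Return result

        Dim pos As Integer = 0
        Do
            ' Find the next opening marker at or after the current position.
            Dim istart As Integer = haystack.IndexOf(needle, pos, StringComparison.Ordinal)
            If istart < 0 Then Exit Do
            istart += needle.Length

            ' Find the closing marker that follows it.
            Dim istop As Integer = haystack.IndexOf(needle2, istart, StringComparison.Ordinal)
            If istop < 0 Then Exit Do

            result.Add(haystack.Substring(istart, istop - istart))

            ' Resume searching just past the closing marker.
            pos = istop + needle2.Length
        Loop
        Return result
    End Function

    Sub Main()
        ' Sample input standing in for the downloaded page (assumed for the demo).
        Dim page As String = "<needle1>HAYSTACK 1<needle2>" & Environment.NewLine & "<needle1>HAYSTACK 2<needle2>"
        Dim ExtractedInfo As List(Of String) = GetAllBetween(page, "<needle1>", "<needle2>")
        For i As Integer = 0 To ExtractedInfo.Count - 1
            Console.WriteLine("ExtractedInfo({0}) = {1}", i, ExtractedInfo(i))
        Next
    End Sub
End Module
```

Note that List(Of String) is indexed from 0, so the first value is ExtractedInfo(0), not ExtractedInfo(1) as in the question's example.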
answered 2013-05-19T15:05:35.957