
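Help a newbie out here. I am trying to check forum posts for duplicate content. So far I have downloaded the page source with WebClient and tried both Regex and mshtml, without any luck. With mshtml I do get the lines, but not the way I want: I cannot separate the individual comments. The source I am trying to read looks like this: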

<p>
    Hey Alton!</p>
<p>
    I am facing this problem also but i have search on the internet for the solution. There are few things that we need to do to solve this problem.</p>
<p>
    First of all make sure that you have latest drivers for you Graphics Card.</p>

The code I have tried so far

Regex:

    Dim r As New System.Text.RegularExpressions.Regex("<p> .* </p>")
    Dim matches As MatchCollection = r.Matches(result)
    For Each itemcode As Match In matches
        ListBox1.Items.Add(itemcode.ToString)
    Next

1 Answer

    Dim regexObj As New Regex("<p>(.+?)</p>", RegexOptions.Singleline)
    Dim matchResults As Match = regexObj.Match(subjectString)
    While matchResults.Success
        ' The inner text of the current <p> block is in matchResults.Groups(1).Value
        matchResults = matchResults.NextMatch()
    End While
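
The pattern in the question finds nothing for two reasons: `<p> .* </p>` requires a literal space after `<p>` and before `</p>`, which the sample source does not have, and without `RegexOptions.Singleline` the `.` does not match the line break that follows each `<p>`. As a minimal sketch of how the loop above might be filled in, assuming a console program, the helper below collects the trimmed inner text of each paragraph into a list. The names `ParagraphExtractor` and `ExtractParagraphs` and the sample string are illustrative only and are not from the original post.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module ParagraphExtractor
    ' Hypothetical helper: returns the trimmed inner text of every <p>...</p> block.
    Function ExtractParagraphs(html As String) As List(Of String)
        Dim paragraphs As New List(Of String)
        ' Singleline lets "." match line breaks; ".+?" is lazy, so each
        ' match ends at the first </p> instead of the last one.
        Dim regexObj As New Regex("<p>(.+?)</p>", RegexOptions.Singleline)
        Dim matchResults As Match = regexObj.Match(html)
        While matchResults.Success
            ' Group 1 is the text between the tags.
            paragraphs.Add(matchResults.Groups(1).Value.Trim())
            matchResults = matchResults.NextMatch()
        End While
        Return paragraphs
    End Function

    Sub Main()
        ' Sample input shaped like the source in the question (assumed for illustration).
        Dim nl As String = Environment.NewLine
        Dim html As String = "<p>" & nl & "    Hey Alton!</p>" & nl &
                             "<p>" & nl & "    First of all make sure that you have latest drivers for you Graphics Card.</p>"
        For Each paragraph As String In ExtractParagraphs(html)
            Console.WriteLine(paragraph)
        Next
    End Sub
End Module
```

In the WinForms code from the question, each returned string could be passed to `ListBox1.Items.Add` instead of `Console.WriteLine`. Note that this pattern only matches a bare `<p>` tag; a tag with attributes, such as `<p class="post">`, would need a different pattern or an HTML parser.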
answered 2013-03-26T13:56:53.587