4

I have a regular expression that produces multiple matches. A sample data set is a CSV file in which each row is a separate match:

product,color,type,shape,size
apple,green,fruit,round,large
banana,yellow,fruit,long,large
cherry,red,fruit,round,small

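So match 1 is apple,green,fruit,round,large, match 2 is banana,yellow,fruit,long,large, and so on.

So my question is: when using Regex.Replace, how do I specify the "starting" match (in this case, for example, I want to start at the second match), and then how many matches to process after that? This is only an example; in other cases I might want to start at match #4, and so on.

It looks like Regex.Replace supports something like this, but I am looking for a better example that fits my scenario.

I tried: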

Dim r As New RegEx(pattern)
result = r.Replace(input, replace, 1, 2)

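replace is a string containing the captured value ($1 in my case), but I don't see any difference; I still get all the matches in a single string.

Any suggestions? I was hoping it might be as simple as getting the match count and using a For loop.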

3 Answers

1

I would not use a regular expression to identify the lines in a text. Read the CSV file:

Dim lines As String()

lines = File.ReadAllLines("path of the CSV file")

Then loop over the lines you want to change:

For i As Integer = starting_match To last_match
    lines(i) = lines(i).Replace("old","new")
Next

And join the lines back together with:

Dim result As String
result = String.Join(System.Environment.NewLine, lines)

UPDATE

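The confusion comes from the fact that the start position in the Replace method is a starting character position, not a starting match index. I therefore suggest this extension method (in VB an extension method has to live in a Module and cannot be declared Shared):

Imports System.Runtime.CompilerServices
Imports System.Text.RegularExpressions

Public Module RegexExtensions
    ' Replaces up to countMatches matches, beginning at the zero-based match index startAtMatch.
    <Extension> _
    Public Function ReplaceMatches(regex As Regex,
                                   input As String, replacement As String,
                                   countMatches As Integer, startAtMatch As Integer
                                  ) As String
        Dim matches As MatchCollection = regex.Matches(input)
        If startAtMatch >= matches.Count Then
            Return input
        End If
        ' Regex.Replace expects a character position, so start right after the last skipped match.
        Dim startAtCharacterPosition As Integer = 0
        If startAtMatch > 0 Then
            Dim skippedMatch As Match = matches(startAtMatch - 1)
            startAtCharacterPosition = skippedMatch.Index + skippedMatch.Length
        End If
        Return regex.Replace(input, replacement, countMatches, startAtCharacterPosition)
    End Function
End Module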

Now you can replace like this:

Dim input As String = "aaa bbb ccc ddd eee fff"
Dim startAtMatch As Integer = 2 ' ccc
Dim countMatches As Integer = 3

Dim regex = New Regex("\w+")
Dim result As String = regex.ReplaceMatches(input, "XX", countMatches, startAtMatch)
Console.WriteLine(result) ' --> "aaa bbb XX XX XX fff"

(Example converted from C# to VB using developerFusion.)

answered 2012-12-19T19:10:14.497
1

Take a look at Regex.Replace(string, string, MatchEvaluator):

http://msdn.microsoft.com/en-us/library/ht1sxswy.aspx

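This should let you pass a MatchEvaluator that checks the index of each particular match, so in this case you could look for index == 1. Note that Match.Index is a character position rather than a match number, so the evaluator has to keep its own counter.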
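A minimal sketch of that idea, assuming the evaluator counts matches itself in a captured local variable. The helper name ReplaceMatchRange and its parameters startAtMatch (zero-based) and countMatches are hypothetical, chosen only for illustration; Regex.Replace, MatchEvaluator and Match.Result are the standard .NET members.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module MatchRangeExample

    ' Hypothetical helper: replaces countMatches matches, starting at the
    ' zero-based match number startAtMatch, and leaves all others untouched.
    Function ReplaceMatchRange(pattern As String, input As String,
                               replacement As String,
                               startAtMatch As Integer,
                               countMatches As Integer) As String
        ' Running match number, captured and updated by the lambda below.
        Dim matchNumber As Integer = -1
        Dim evaluator As MatchEvaluator =
            Function(m As Match) As String
                matchNumber += 1
                If matchNumber >= startAtMatch AndAlso
                   matchNumber < startAtMatch + countMatches Then
                    ' Result expands substitutions such as $1 for this match.
                    Return m.Result(replacement)
                End If
                ' Outside the requested range: keep the original text.
                Return m.Value
            End Function
        Return Regex.Replace(input, pattern, evaluator)
    End Function

    Sub Main()
        Dim input As String = "aaa bbb ccc ddd eee fff"
        ' Replace three matches, starting at match number 2 (ccc).
        Console.WriteLine(ReplaceMatchRange("\w+", input, "XX", 2, 3))
        ' --> "aaa bbb XX XX XX fff"
    End Sub

End Module
```

Compared with computing a character offset first, this makes a single pass over the input, at the cost of invoking the evaluator for every match, including the ones that are left unchanged.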

answered 2012-12-19T18:50:41.623
-2

The following code may help you:

http://msdn.microsoft.com/en-us/library/ms149475.aspx?cs-save-lang=1&cs-lang=vb#code-snippet-3

Imports System.Collections
Imports System.Text.RegularExpressions

Module Example
    Public Sub Main()
        Dim words As String = "letter alphabetical missing lack release " + _
                              "penchant slack acryllic laundry cease"
        Dim pattern As String = "\w+  # Matches all the characters in a word."
        Dim evaluator As MatchEvaluator = AddressOf WordScrambler
        Console.WriteLine("Original words:")
        Console.WriteLine(words)
        Console.WriteLine("Scrambled words:")
        Console.WriteLine(Regex.Replace(words, pattern, evaluator,
                                        RegexOptions.IgnorePatternWhitespace))
    End Sub

    ' One shared generator: creating a new Random on every call can repeat
    ' the same sequence when calls happen in quick succession.
    Private ReadOnly rnd As New Random()

    Public Function WordScrambler(ByVal match As Match) As String
        Dim arraySize As Integer = match.Value.Length - 1
        ' Define two arrays with one element per letter in the match
        ' (VB array bounds are inclusive, so the upper bound is Length - 1).
        Dim keys(arraySize) As Double
        Dim letters(arraySize) As Char

        For ctr As Integer = 0 To arraySize
            ' Populate the array of keys with random numbers.
            keys(ctr) = rnd.NextDouble()
            ' Assign letter to array of letters.
            letters(ctr) = match.Value.Chars(ctr)
        Next
        ' Sort every letter by its random key; the length argument is the
        ' element count, not the upper bound.
        Array.Sort(keys, letters, 0, letters.Length, Comparer.Default)
        Return New String(letters)
    End Function

End Module

' The example displays output similar to the following: 
'    Original words: 
'    letter alphabetical missing lack release penchant slack acryllic laundry cease 
'     
'    Scrambled words: 
'    etlert liahepalbcat imsgsni alkc ereelsa epcnnaht lscak cayirllc alnyurd ecsae
answered 2012-12-19T19:12:24.037