
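I'm trying to write a program that validates the contents of file B (potentially bad) against file A (known good) and removes every known-good line from the potentially bad file, leaving only the potentially bad lines. The problem is that every line contains a timestamp. How can I compare only the part of each line that starts after the timestamp?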

E.g. File A:

MSI (c) (74:80) [08:09:43:718]: Resetting cached policy values
MSI (c) (74:80) [08:09:43:718]: Machine policy value 'Debug' is 0
MSI (c) (74:80) [08:09:43:718]: ******* RunEngine:

versus File B:

MSI (c) (E8:DC) [18:35:18:573]: Resetting cached policy values
MSI (c) (E8:DC) [18:35:18:573]: Machine policy value 'Debug' is 0
MSI (c) (E8:DC) [18:35:18:573]: ******* RunEngine:

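These should all be considered equal. I don't have an example of a differing line, but it would basically be whatever is left once these lines are removed.

My code so far: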

Imports System.IO

Public Class Form1
Dim compto As New List(Of String)
Dim compfrom As New List(Of String)

Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
    Standard("filea.LOG")
    Readfile("fileb.LOG")
    Writefile("difference.txt")
End Sub


Public Sub Standard(ByVal Path As String)
    Using r As StreamReader = New StreamReader(Path)
        ' Read before testing so the first line is not skipped and the
        ' trailing Nothing is never added to the list.
        Dim line As String = r.ReadLine
        Do While (Not line Is Nothing)
            If Not compto.Contains(line) Then compto.Add(line)
            line = r.ReadLine
        Loop
    End Using
End Sub

Public Sub Readfile(ByVal Path As String)
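    ' Attempted pattern; currently unused (its parenthesis is unbalanced,
    ' so it would not parse as a regex).
    Dim pattern As String = "{30}([A-Za-z0-9\-]+"
    Using r As StreamReader = New StreamReader(Path)
        ' Keep every line of this file that is not an exact match in compto.
        Dim line As String = r.ReadLine
        Do While (Not line Is Nothing)
            If Not compto.Contains(line) Then compfrom.Add(line)
            line = r.ReadLine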
        Loop
    End Using
End Sub

Public Sub Writefile(ByVal Path As String)
    Using sw As StreamWriter = New StreamWriter(Path)
        For Each line As String In compfrom
            sw.WriteLine(line)
            ListBox1.Items.Add(line)
        Next
    End Using
End Sub

End Class

So far this code removes exact matches, but this is where I'm stuck. Any help would be great.

Thanks.

Solution edit:

Public Sub Writefile(ByVal Path As String)
    Dim GetLine As Func(Of String, String) = Function(line) Regex.Match(line, "\]: (.*)").Groups(1).Value
    Dim Diff As New HashSet(Of String)(File.ReadLines("filea.log").Select(GetLine))
    Diff.SymmetricExceptWith(File.ReadLines("fileb.log").Select(GetLine))
    Using sw As StreamWriter = New StreamWriter(Path)
        For Each line As String In Diff
            sw.WriteLine(line)
            ListBox1.Items.Add(line)
        Next
    End Using
End Sub

2 Answers


Based on this link, try this:

Dim GetLine As Func(Of String,String) = Function(line) Regex.Match(line,"\]: (.*)").Groups(1).Value

' If the timestamp is always at the same position, it may be more efficient to
' avoid regular expressions (this assignment replaces the regex version above). YMMV
GetLine = Function(line) line.Substring(32)

Dim Diff = New HashSet(Of String)(File.ReadLines("filea.LOG").Select(GetLine))
Diff.SymmetricExceptWith(File.ReadLines("fileb.LOG").Select(GetLine))
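
Note that SymmetricExceptWith keeps messages that occur in only one of the two files, in either direction, and the set holds only the message text without its timestamp. If the goal is strictly "lines of file B whose message does not appear in file A", with the original lines preserved, a minimal sketch could look like the following. The names LogDiff, MessagePart and SuspectLines are made up for illustration, and falling back to the whole line when no "]: " is found is an assumption.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions

Module LogDiff
    ' Returns the text after the first "]: ", or the whole line when no
    ' such header is found (hypothetical helper, not from the question).
    Function MessagePart(line As String) As String
        Dim m As Match = Regex.Match(line, "\]: (.*)")
        Return If(m.Success, m.Groups(1).Value, line)
    End Function

    ' Keeps every candidate line whose message never occurs in goodLines.
    ' The original lines, timestamps included, are returned in file order.
    Function SuspectLines(goodLines As IEnumerable(Of String),
                          candidateLines As IEnumerable(Of String)) As List(Of String)
        Dim known As New HashSet(Of String)(goodLines.Select(Function(l) MessagePart(l)))
        Return candidateLines.Where(Function(l) Not known.Contains(MessagePart(l))).ToList()
    End Function
End Module
```

It could be called as File.WriteAllLines("difference.txt", LogDiff.SuspectLines(File.ReadLines("filea.LOG"), File.ReadLines("fileb.LOG"))), which needs Imports System.IO at the call site.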
answered 2013-08-29T23:56:02.583

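It looks like you are comparing each unique line in File A against every line in File B, that the line header MSI (c) (74:80) [08:09:43:718]: is irrelevant to the comparison, and that its length is constant.

You could change this line in your code (4 instances):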

line = r.ReadLine

to:

line = r.ReadLine.Substring(32)

Substring() with a single integer argument returns the rest of the string, starting at the specified character position.
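
One caveat: StreamReader.ReadLine returns Nothing at the end of the file, and Substring(32) throws on lines shorter than 32 characters, so calling it directly on the result of ReadLine can fail. A guarded helper might look like the sketch below. It assumes the header is always exactly 32 characters, as in the sample lines; the names HeaderStrip, HeaderLength and StripHeader are hypothetical.

```vbnet
Imports System

Module HeaderStrip
    ' Length of "MSI (c) (74:80) [08:09:43:718]: " in the sample lines
    ' (assumed to be constant across the whole log).
    Const HeaderLength As Integer = 32

    ' Returns the text after the fixed-width header. Nothing (end of file)
    ' is passed through, and lines too short to hold a header are
    ' returned unchanged.
    Function StripHeader(line As String) As String
        If line Is Nothing OrElse line.Length < HeaderLength Then Return line
        Return line.Substring(HeaderLength)
    End Function
End Module
```

With this, line = StripHeader(r.ReadLine) can replace line = r.ReadLine, and a Do While (Not line Is Nothing) loop still terminates normally at the end of the file.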

answered 2013-08-30T00:14:46.357