regex - Removing comments from a file using regular expressions

Question

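I want to write a program that removes all comments from a file (everything from "//" to the end of the line).

I would like to do this with a regular expression.

I tried this: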

    let mutable text = File.ReadAllText("C:\\a.txt")
    let regexComment = new Regex("//.*\\r\\n$") 
    text <- regexComment.Replace(text, "")
    File.WriteAllText("C:\\a.txt",text)

But it doesn't work...

Could you explain why, and suggest something that does work (preferably using a regex)?

Thanks :)

score 4 · Accepted Answer

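Rather than loading the whole file into memory and running a regex over it, a faster approach that can handle files of any size without memory problems might look like this: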

    open System
    open System.IO
    open System.Text.RegularExpressions

    // regex: beginning of line, followed by optional whitespace, 
    // followed by comment chars.
    let reComment = Regex(@"^\s*//", RegexOptions.Compiled)

    let stripComments infile outfile =
        File.ReadLines infile
        |> Seq.filter (reComment.IsMatch >> not)
        |> fun lines -> File.WriteAllLines(outfile, lines)

    stripComments "input.txt" "output.txt"

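The output file must be different from the input file, because we write to the output while we are still reading from the input. The regex identifies comment lines (with optional leading whitespace), and Seq.filter ensures that those lines are not sent to the output file.

Because we never hold the entire input or output file in memory, this function works for files of any size, and it is probably faster than the "read the whole file, regex everything, write the whole file" approach.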
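If the result has to end up under the original file name, one possible workaround for the "output must differ from input" restriction is to filter into a temporary file and swap it in afterwards. The following is only a sketch under assumptions: the helper name `stripCommentsInPlace` and the `.tmp` suffix are made up for illustration and are not part of the answer.

```fsharp
open System.IO
open System.Text.RegularExpressions

// Same whole-line comment pattern as in the answer.
let reComment = Regex(@"^\s*//", RegexOptions.Compiled)

// Hypothetical helper (not from the answer): stream the kept lines into a
// temporary file, then replace the original once writing has finished.
let stripCommentsInPlace (path: string) =
    let tempPath = path + ".tmp"   // assumed naming for the temporary file
    let kept =
        File.ReadLines path
        |> Seq.filter (fun (line: string) -> not (reComment.IsMatch line))
    // WriteAllLines enumerates `kept` lazily, so lines are still streamed.
    File.WriteAllLines(tempPath, kept)
    // Swap the filtered copy in place of the original file.
    File.Delete path
    File.Move(tempPath, path)
```

The file is still processed line by line; the only extra cost is the temporary copy on disk. A crash between the delete and the move would leave only the `.tmp` file behind, so this is not an atomic replacement.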

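Danger ahead

This code will not remove comments that appear after some code on the same line. However, a regex is not the right tool for that job, unless someone can come up with a regex that can tell the following two lines of code apart and avoid breaking the first one when you remove everything in the file that matches the regex: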

    let request = WebRequest.Create("http://foo.com")
    let request = WebRequest.Create(inputUrl) // this used to be hard-coded
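For these two particular lines, a common trick is to let the regex match string literals as well and put them back unchanged, so that a `//` inside quotes is consumed as part of the literal before the comment alternative is tried. The sketch below is a partial illustration only, and the names `reStringOrComment` and `stripLineComment` are hypothetical. It assumes simple double-quoted literals with backslash escapes; verbatim strings (`@"..."`), triple-quoted strings, character literals and `(* ... *)` block comments are not handled, which is why a real tokenizer remains the safer tool.

```fsharp
open System.Text.RegularExpressions

// Alternation: a double-quoted string literal (with backslash escapes) is
// tried first; a `//` comment is only tried where no literal starts.
// Assumption: only simple "..." literals occur in the input.
let reStringOrComment = Regex("\"(?:\\\\.|[^\"\\\\])*\"|//.*")

// Hypothetical helper: keep string literals as they are, drop comments.
let stripLineComment (line: string) =
    reStringOrComment.Replace(
        line,
        MatchEvaluator(fun (m: Match) ->
            // A match starting with a quote is a string literal.
            if m.Value.[0] = '"' then m.Value else ""))

let line1 = "let request = WebRequest.Create(\"http://foo.com\")"
let line2 = "let request = WebRequest.Create(inputUrl) // this used to be hard-coded"

let stripped1 = stripLineComment line1   // the URL literal is left intact
let stripped2 = stripLineComment line2   // the trailing comment is removed
```

The helper is meant to be applied per line, for example with `Seq.map stripLineComment` in the streaming pipeline. It leaves any whitespace that preceded the comment in place.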

score 1 · Accepted Answer

    let regexComment = new Regex(@"//.*$",RegexOptions.Multiline)
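A short comparison may help explain why the pattern from the question matches nothing while this one does. Without `RegexOptions.Multiline`, `$` only matches at the end of the input (or before a final newline), so `//.*\r\n$` can match at most a comment on the last line. With `Multiline`, `$` matches before every `\n`. One side effect is that `.` in .NET also matches `\r`, so on CRLF files the `\r` is removed together with the comment. The sample text and the `keepEol` variant below are assumptions added for illustration.

```fsharp
open System.Text.RegularExpressions

// Hypothetical sample input with Windows (CRLF) line endings.
let sample = "let a = 1 // first\r\n// whole-line comment\r\nlet b = 2"

// Pattern from the question: no Multiline, so `$` only matches at the end
// of the input and no comment in `sample` is matched.
let original = Regex("//.*\\r\\n$")

// Pattern from this answer: with Multiline, `$` matches before every '\n'.
// `.` also matches '\r', so the '\r' of each CRLF is removed as well.
let multiline = Regex(@"//.*$", RegexOptions.Multiline)

// Variant (an assumption, not from the answer) that stops before '\r' or
// '\n' and therefore leaves the line endings untouched.
let keepEol = Regex(@"//[^\r\n]*")

let afterOriginal = original.Replace(sample, "")
let afterMultiline = multiline.Replace(sample, "")
let afterKeepEol = keepEol.Replace(sample, "")
```

Here `afterOriginal` is identical to `sample`, `afterMultiline` has its comments removed but its CRLF endings turned into bare `\n` on the affected lines, and `afterKeepEol` keeps the CRLF endings.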

answered 2012-06-01T10:11:53.820

score 0 · Accepted Answer

Never mind, I figured it out. It should be:

    let regexComment = new Regex("//.*\\r\\n")
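Two caveats with this pattern are worth checking against real input: it deletes the line break along with the comment, so a line with a trailing comment is joined to the line that follows, and a comment on the last line with no CRLF after it is not matched. A small sketch with a made-up sample string:

```fsharp
open System.Text.RegularExpressions

// Pattern from this answer: the comment AND its trailing CRLF are matched.
let regexComment = Regex("//.*\\r\\n")

// Hypothetical sample: a trailing comment, a code line, and a final comment
// with no line break after it.
let sample = "let a = 1 // first\r\nlet b = 2\r\n// last"

// The first two lines are joined; the final comment is left in place.
let result = regexComment.Replace(sample, "")
```

With this input, `result` is `"let a = 1 let b = 2\r\n// last"`.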

answered 2012-06-01T08:14:30.513

score 0 · Accepted Answer

Your regex string seems to be wrong. "\\/\\/.*\\r\\n" works for me.

answered 2012-06-01T08:14:46.780
