
I need to match all three types of comments that PHP might have:

  • # Single line comment

  • // Single line comment

  • /* Multi-line comments */

  •  

     /**
      * And all of its possible variations
      */
    

Something I should mention: I am doing this in order to recognize whether a PHP closing tag (?>) is inside a comment or not. If it is, it should be ignored; if not, it should count as a closing tag. This is going to be used inside an XML document in order to improve Sublime Text's recognition of the closing tag (because it's driving me nuts!). I tried to achieve this for a couple of hours, but I wasn't able to. How can I translate it so that it works in XML?

So if you could also include the if-then-else logic I would really appreciate it. BTW, I really need it to be a pure regular expression, no language features or anything. :)

Like Eicon reminded me, I need all of them to be able to match at the start of the line, or at the end of a piece of code, so I also need the following with all of them:

<?php
    echo 'something'; # this is a comment
?>

2 Answers


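Parsing a programming language is probably too much to ask of regular expressions. You should probably look for a PHP parser instead.

But these would be the regular expressions you are looking for. I assume for all of them that you use the DOTALL or SINGLELINE option (although the first two would work without it as well):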

~#[^\r\n]*~
~//[^\r\n]*~
~/\*.*?\*/~s

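Note that any of these will cause problems if the comment-delimiting characters appear inside a string, or anywhere else where they do not actually open a comment.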
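To make that caveat concrete, here is a minimal sketch in Python (my choice for illustration, not something the question asked for) that runs the first pattern, without the ~ delimiters, over a line whose only # sits inside a string literal. The variable names are made up for the example.

```python
import re

# The "#" pattern from above, without the ~ delimiters.
HASH_COMMENT = re.compile(r"#[^\r\n]*")

# There is no comment on this line; the # is part of a string literal.
code = '$x = "You are the #1";'

# The pattern cannot know that, so it reports a "comment" anyway.
false_positives = HASH_COMMENT.findall(code)
print(false_positives)
```

The pattern matches from the # to the end of the line, including the closing quote and the semicolon, even though nothing on that line is a comment.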

You can also combine all of these into a single regular expression:

~(?:#|//)[^\r\n]*|/\*.*?\*/~s

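If you use a tool or language that does not require delimiters (like Java or C#), remove those ~. In that case you will also have to apply the DOTALL option in a different way. But without knowing where you are going to use this, I cannot tell you how.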
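As one illustration of a delimiter-free environment, here is a rough sketch in Python, where the ~ delimiters are dropped and DOTALL is passed as a flag to re.compile. Python is only an assumed example here, and the sample source string is invented for the demonstration.

```python
import re

# Combined pattern without delimiters; re.DOTALL plays the role of the "s" modifier.
COMMENT = re.compile(r"(?:#|//)[^\r\n]*|/\*.*?\*/", re.DOTALL)

# Invented sample covering the comment styles from the question.
source = """<?php
echo 'something'; # this is a comment
// another one
/* multi
   line */
/**
 * doc block
 */
?>"""

comments = COMMENT.findall(source)
for comment in comments:
    print(repr(comment))
```

Because the pattern contains no capturing groups, findall returns each complete comment, delimiters included.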

If you cannot or do not want to set the DOTALL option, this would be equivalent (I have also left out the delimiters as an example):

(?:#|//)[^\r\n]*|/\*[\s\S]*?\*/
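For the goal stated in the question (deciding whether a ?> sits inside a comment), one possible use of this flag-free pattern is to compare match positions. The following is only a sketch in Python, which goes beyond the pure-regex requirement; the function name closing_tags_outside_comments is hypothetical, and the string caveat mentioned earlier still applies.

```python
import re

# Flag-free version of the combined pattern.
COMMENT = re.compile(r"(?:#|//)[^\r\n]*|/\*[\s\S]*?\*/")

def closing_tags_outside_comments(source):
    """Return the offsets of every ?> that does not fall inside a comment match."""
    spans = [m.span() for m in COMMENT.finditer(source)]
    offsets = []
    for tag in re.finditer(r"\?>", source):
        inside = any(start <= tag.start() < end for start, end in spans)
        if not inside:
            offsets.append(tag.start())
    return offsets

sample = "<?php\n/* not a tag: ?> */\necho 1; // nor this ?>\n?>"
print(closing_tags_outside_comments(sample))
```

Only the final ?> of the sample is reported, because the other two lie within the spans matched as comments.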

See here for a working demo.

Now, if you also want to capture the contents of the comment in a group, you could do this:

(?|(?:#|//)([^\r\n]*)|/\*([\s\S]*?)\*/)

Regardless of the type of comment, the comment's content (without the syntax delimiters) will be found in capture group 1.
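The (?|...) branch-reset group is a PCRE feature, and not every engine has it; Python's built-in re module, for example, does not. As a rough workaround sketch under that assumption, the same content can be read from two ordinary groups by asking which one took part in the match. The helper name comment_bodies is made up for this example.

```python
import re

# Same alternatives as above, but with two ordinary capture groups
# instead of a branch-reset group.
CONTENT = re.compile(r"(?:#|//)([^\r\n]*)|/\*([\s\S]*?)\*/")

def comment_bodies(source):
    """Return the text of each comment without its delimiters."""
    bodies = []
    for m in CONTENT.finditer(source):
        # Exactly one of the two groups takes part in any match;
        # lastindex is the number of that group.
        bodies.append(m.group(m.lastindex))
    return bodies

print(comment_bodies("a(); # one\n/* two */ b(); // three"))
```

Using lastindex rather than testing the groups for truthiness keeps empty comments (such as a bare #) from being mistaken for a group that did not match.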

Another working demo.

answered 2012-10-28T23:34:25.750

Single-line comments

singleLineComment = /'[^']*'|"[^"]*"|((?:#|\/\/).*$)/gm

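With this regular expression, you have to replace (or remove) whatever was captured by ((?:#|\/\/).*$). The expression ignores string contents that merely look like comments, e.g. $x = "You are the #1"; or $y = "You can start comments with // or # in PHP, but I'm a code string";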
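A small sketch of that replace step, written in Python purely for illustration (the pattern above is a JavaScript-style literal): the callback drops a match only when group 1 captured something and puts string literals back unchanged. The function name strip_single_line_comments is hypothetical, and strings containing escaped quotes are not handled by this pattern.

```python
import re

# Same alternatives as singleLineComment; MULTILINE stands in for the "m" flag.
SINGLE_LINE = re.compile(r"""'[^']*'|"[^"]*"|((?:#|//).*$)""", re.MULTILINE)

def strip_single_line_comments(source):
    """Remove # and // comments while leaving quoted strings untouched."""
    def keep_or_drop(m):
        # Group 1 is set only when the comment alternative matched.
        return "" if m.group(1) is not None else m.group(0)
    return SINGLE_LINE.sub(keep_or_drop, source)

print(strip_single_line_comments('$x = "You are the #1"; # trailing'))
```

The quoted string is consumed by the second alternative before the engine ever reaches the # inside it, which is why only the trailing comment is removed.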

Multi-line comments

multilineComment = /^\s*\/\*\*?[^!][\s\S]*?\*\//gm
answered 2016-10-25T14:06:49.277