0

I need to parse annotations of methods written in PHP. I wrote a regex (see the simplified example below) to find them, but it doesn't work as expected. Instead of matching the shortest stretch of text between /** and */, it matches as much source code as possible (including previous methods and their annotations). I'm sure I'm using the non-greedy .*? version of *, and I have found no evidence that DOTALL turns it off. Where could the problem be? Thank you.

import re

p = re.compile(r'(?:/\*\*.*?\*/)\n\s*public', re.DOTALL)
methods = p.findall(text)
4

4 Answers

1

I think you are trying to get this:

>>> text = """ /** * comment */ class MyClass extens Base { /** * comment */ public function xyz """
>>> m = re.findall(r'\/\*\*(?:(?!\*\/).)*\*\/\s*public', text, re.DOTALL)
>>> m
['/** * comment */ public']

If you don't want public in the final match, use the regex below, which uses a positive lookahead:

>>> m = re.findall(r'\/\*\*(?:(?!\*\/).)*\*\/(?=\s*public)', text, re.DOTALL)
>>> m
['/** * comment */']
answered 2014-07-31T09:52:06.940
0

You should be able to use this:

\/\*\*([^*]|\*[^/])*?\*\/\s*public

This matches any character that isn't an asterisk (*); an asterisk is allowed only if it is not followed by a forward slash. As a result, the match can only contain a comment that is closed just before public, not one that is closed sooner.

Example: http://regexr.com/398b3

Explanation: http://tinyurl.com/lcewdmo

Disclaimer: If the comment contains */ inside it, this won't work.
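
For anyone who wants to try this from Python, here is a minimal sketch. The sample string and variable names are made up for illustration, and the group is written as non-capturing so that findall returns whole matches instead of the last repetition of the group.

```python
import re

# Hypothetical sample: two doc comments, only the second precedes "public".
sample = "/** first */\nclass A {\n/** second */\npublic function xyz"

# [^*] also matches newlines, so DOTALL is not needed for this pattern.
pattern = re.compile(r'/\*\*(?:[^*]|\*[^/])*?\*/\s*public')
found = pattern.findall(sample)

# Only the comment directly in front of "public" is reported.
print(found)
```

Because neither alternative can consume the two characters */, the engine cannot extend the comment body past the end of the first comment, so the attempt starting at the first /** fails.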

answered 2014-07-31T10:11:21.960
0

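The regex engine parses from left to right. A lazy quantifier tries to match as little as possible starting from the current match position, but it cannot push the start of the match forward, even if that would reduce the amount of text matched. This means that instead of starting at the last /** before public, the match runs from the first /** to the next */ that is followed by a public.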
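
A minimal sketch of this behaviour, using a made-up two-comment sample (the `sample` string and variable names are assumptions for illustration, not the asker's real source). The lazy .*? still produces a single match that begins at the first /**:

```python
import re

# Hypothetical sample: two doc comments, only the second precedes "public".
sample = "/** first */\nclass A {\n/** second */\npublic function xyz"

# The pattern from the question: lazy .*? combined with DOTALL.
lazy = re.compile(r'/\*\*.*?\*/\n\s*public', re.DOTALL)
match = lazy.search(sample)

# The match begins at the first /** and swallows the class declaration.
print(match.start())
print(repr(match.group(0)))
```

The quantifier is lazy only about where the match ends. The start is fixed at the first position from which any match can succeed.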

If you want to exclude */ from the comment, you need to group the . with a lookahead assertion:

(?:(?!\*/).)

(?!\*/) asserts that the character we are matching is not the start of a */ sequence.
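
As a rough sketch of how that group fits into a complete pattern (the sample text, the capture group and the trailing \s*public are assumptions added for illustration), the guarded dot keeps the comment body from running past the first */:

```python
import re

# Hypothetical sample: two doc comments, only the second precedes "public".
sample = "/** first */\nclass A {\n/** second */\npublic function xyz"

# Each character of the body is consumed only if it does not start "*/".
tempered = re.compile(r'/\*\*((?:(?!\*/).)*)\*/\s*public', re.DOTALL)
bodies = tempered.findall(sample)

# findall returns the captured comment bodies.
print(bodies)
```

The attempt that starts at the first /** now fails outright, because its body must end at the first */ and that one is followed by class, so the engine moves on to the second comment.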

answered 2014-07-31T09:58:47.593
0
# Some examples, assuming that the annotation you want to parse
# starts with /** and ends with */.  It may be spread over
# several lines.

text = """
/**
 @Title(value='Welcome', lang='en')
 @Title(value='Wilkommen', lang='de')
 @Title(value='Vitajte', lang='sk')
 @Snippet
    ,*/
class WelcomeScreen {}

   /** @Target("method") */
  class Route extends Annotation {}

/** @Mapping(inheritance = @SingleTableInheritance,
    columns = {@ColumnMapping('id'), @ColumnMapping('name')}) */
public Person {}

"""

text2 = """ /** * comment */
CLASS MyClass extens Base {

/** * comment */
public function xyz
"""


import re

# Match a PHP annotation and the word following class or public
# function.
annotations = re.findall(r"""/\*\*             # Starting annotation
                                               # 
                            (?P<annote>.*?)    # Named, non-greedy match
                                               # including newline
                                               #
                             \*/               # Ending annotation
                                               #
                             (?:.*?)           # Non-capturing non-greedy
                                               # including newline
                 (?:public[ ]+function|class)  # Match either
                                               # of these
                             [ ]+              # One or more spaces
                             (?P<name>\w+)     # Match a word
                         """,
                         text + text2,
                         re.VERBOSE | re.DOTALL | re.IGNORECASE)

for txt in annotations:
    print("Annotation: ", " ".join(txt[0].split()))
    print("Name: ", txt[1])
answered 2014-07-31T11:39:15.913