    import re

    s="""04-09 11:11:57.879 D/PTT [STACK]( 1653): *********Sending request
    04-09 11:11:57.879 [STACK]( 1653): *********
    04-09 11:11:57.879 [STACK]( 1653): S: abcd 
    04-09 11:11:57.879 [STACK]( 1653): l: jockey
    04-09 11:11:57.879 [STACK]( 1653): k: sucess
    04-09 11:11:57.879 [STACK]( 1653): j: 82
    04-09 11:11:57.879 [STACK]( 1653): 
    04-09 11:11:57.879 [STACK]( 1653): MESSAGE TO BE SENT IS
    04-09 11:11:57.879 [STACK]( 1653): Not doing anything
    04-09 11:11:57.879 [STACK]( 1653): Not doing anything
    04-09 11:11:57.879 [STACK]( 1653): Not doing anything
    04-09 11:11:57.879 D/PTT [STACK]( 1653): *********Sending request
    04-09 11:11:57.879 [STACK]( 1653): *********
    04-09 11:11:57.879 [STACK]( 1653): S: abcd 
    04-09 11:11:57.879 [STACK]( 1653): l: Donald
    04-09 11:11:57.879 [STACK]( 1653): k: sucess
    04-09 11:11:57.879 [STACK]( 1653): j: 83
    04-09 11:11:57.879 [STACK]( 1653): 
    04-09 11:11:57.879 [STACK]( 1653): MESSAGE TO BE SENT IS
    04-09 11:11:57.879 [STACK]( 1653): Not doing anything
    04-09 11:11:57.879 [STACK]( 1653): Not doing anything
    04-09 11:11:57.879 [STACK]( 1653): Not doing anything
    04-09 11:11:57.879 D/PTT [STACK]( 1653): *********Sending request
    04-09 11:11:57.879 [STACK]( 1653): *********
    04-09 11:11:57.879 [STACK]( 1653): S: abcd 
    04-09 11:11:57.879 [STACK]( 1653): l: Mickey
    04-09 11:11:57.879 [STACK]( 1653): k: sucess
    04-09 11:11:57.879 [STACK]( 1653): j: 84
    04-09 11:11:57.879 [STACK]( 1653): 
    04-09 11:11:57.879 [STACK]( 1653): 
    04-09 11:11:57.879 [STACK]( 1653): MESSAGE TO BE SENT IS
    04-09 11:11:57.879 D/PTT [STACK]( 1653): *********Sending request
    04-09 11:11:57.879 [STACK]( 1653): *********
    04-09 11:11:57.879 [STACK]( 1653): S: abcd 
    04-09 11:11:57.879 [STACK]( 1653): l: Donald
    04-09 11:11:57.879 [STACK]( 1653): k: sucess
    04-09 11:11:57.879 [STACK]( 1653): j: 83
    04-09 11:11:57.879 [STACK]( 1653): 
    04-09 11:11:57.879 [STACK]( 1653): MESSAGE TO BE SENT IS
    04-09 11:11:57.879 D/PTT [STACK]( 1653): *********Sending request
    04-09 11:11:57.879 [STACK]( 1653): *********
    04-09 11:11:57.879 [STACK]( 1653): S: abcd 
    04-09 11:11:57.879 [STACK]( 1653): l: jockey
    04-09 11:11:57.879 [STACK]( 1653): k: sucess
    04-09 11:11:57.879 [STACK]( 1653): j: 82
    04-09 11:11:57.879 [STACK]( 1653): 
    04-09 11:11:57.879 [STACK]( 1653): MESSAGE TO BE SENT IS"""

    exepat= re.compile(".*Sending request.*?Donald.*?TO BE SENT IS",re.DOTALL)

    reout = exepat.findall(s)

    print reout[0]

Expected Output:
    04-09 11:11:57.879 D/PTT [STACK]( 1653): *********Sending request
    04-09 11:11:57.879 [STACK]( 1653): *********
    04-09 11:11:57.879 [STACK]( 1653): S: abcd 
    04-09 11:11:57.879 [STACK]( 1653): l: Donald
    04-09 11:11:57.879 [STACK]( 1653): k: sucess
    04-09 11:11:57.879 [STACK]( 1653): j: 83
    04-09 11:11:57.879 [STACK]( 1653): 
    04-09 11:11:57.879 [STACK]( 1653): MESSAGE TO BE SENT IS

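I need a pattern that extracts only the requests that have "Donald" between "Sending request" and "TO BE SENT IS". In the example above, two requests contain "Donald", so the reout list should have 2 items.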

1 Answer

You are looking for re.DOTALL.

re.MULTILINE changes only the behavior of the line-start and line-end anchors ^ and $, whereas re.DOTALL lets . match newlines.

re.M
re.MULTILINE
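When specified, the pattern character '^' matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character '$' matches at the end of the string and at the end of each line (immediately preceding each newline). By default, '^' matches only at the beginning of the string, and '$' only at the end of the string and immediately before the newline (if any) at the end of the string.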

re.S
re.DOTALL
Make the '.' special character match any character at all, including a newline; without this flag, '.' will match anything except a newline.
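
As a rough illustration of the two flags described above, here is a minimal sketch. The two-line string `text` and the variable names are made up for the example and do not come from the question.

```python
import re

# Made-up two-line sample, only for illustrating the flags.
text = "first line\nsecond line"

# Without flags, '^' anchors only at the very start of the string.
default_starts = re.findall(r"^\w+", text)
# re.MULTILINE lets '^' also match right after each newline.
multiline_starts = re.findall(r"^\w+", text, re.MULTILINE)

# Without flags, '.' stops at the newline, so the pattern cannot span lines.
no_dotall = re.search(r"first.*second", text)
# re.DOTALL lets '.' match the newline too, so the pattern spans both lines.
with_dotall = re.search(r"first.*second", text, re.DOTALL)

print(default_starts)       # ['first']
print(multiline_starts)     # ['first', 'second']
print(no_dotall)            # None
print(with_dotall.group())  # prints "first line", a newline, then "second"
```

So re.MULTILINE on its own would not help a pattern that has to cross line boundaries; that is the job of re.DOTALL.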

With re.DOTALL, I get:

>>> exepat = re.compile(r"Sending request.*TO BE SENT IS", re.DOTALL)
>>> reout = exepat.search(s)
>>> print reout
<_sre.SRE_Match object at 0x10a729370>
>>> print reout.group()
Sending request
04-09 11:11:57.879 [STACK]( 1653): *********
04-09 11:11:57.879 [STACK]( 1653): S: abcd 
04-09 11:11:57.879 [STACK]( 1653): l: jockey
04-09 11:11:57.879 [STACK]( 1653): k: sucess
04-09 11:11:57.879 [STACK]( 1653): j: 82
04-09 11:11:57.879 [STACK]( 1653): 
04-09 11:11:57.879 [STACK]( 1653): 
04-09 11:11:57.879 [STACK]( 1653): MESSAGE TO BE SENT IS

If you have multiple such messages, you need to use a non-greedy *? match:

exepat = re.compile(r"Sending request.*?TO BE SENT IS", re.DOTALL)

Note the question mark; it tells the quantifier to match the fewest characters that satisfy the pattern, rather than the most.
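
To see the effect of the question mark in isolation, here is a small sketch on a made-up string `log` that holds two requests (the string is an assumption for the example, not the real log):

```python
import re

# Made-up input with two requests, one per line.
log = "Sending request A TO BE SENT IS\nSending request B TO BE SENT IS"

# Greedy .* runs to the LAST "TO BE SENT IS", merging both requests into one match.
greedy = re.findall(r"Sending request.*TO BE SENT IS", log, re.DOTALL)
# Non-greedy .*? stops at the FIRST "TO BE SENT IS" after each start marker.
lazy = re.findall(r"Sending request.*?TO BE SENT IS", log, re.DOTALL)

print(len(greedy))  # 1
print(len(lazy))    # 2
```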

Using .findall() with this non-greedy pattern, we find 3 matches instead of 1 in the updated example:

>>> exepat = re.compile(r"Sending request.*?TO BE SENT IS", re.DOTALL)
>>> exepat.findall(s)
['Sending request\n04-09 11:11:57.879 [STACK]( 1653): *********\n04-09 11:11:57.879 [STACK]( 1653): S: abcd \n04-09 11:11:57.879 [STACK]( 1653): l: jockey\n04-09 11:11:57.879 [STACK]( 1653): k: sucess\n04-09 11:11:57.879 [STACK]( 1653): j: 82\n04-09 11:11:57.879 [STACK]( 1653): \n04-09 11:11:57.879 [STACK]( 1653): \n04-09 11:11:57.879 [STACK]( 1653): MESSAGE TO BE SENT IS', 'Sending request\n04-09 11:11:57.879 [STACK]( 1653): *********\n04-09 11:11:57.879 [STACK]( 1653): S: abcd \n04-09 11:11:57.879 [STACK]( 1653): l: jockey\n04-09 11:11:57.879 [STACK]( 1653): k: sucess\n04-09 11:11:57.879 [STACK]( 1653): j: 83\n04-09 11:11:57.879 [STACK]( 1653): \n04-09 11:11:57.879 [STACK]( 1653): \n04-09 11:11:57.879 [STACK]( 1653): MESSAGE TO BE SENT IS', 'Sending request\n04-09 11:11:57.879 [STACK]( 1653): *********\n04-09 11:11:57.879 [STACK]( 1653): S: abcd \n04-09 11:11:57.879 [STACK]( 1653): l: jockey\n04-09 11:11:57.879 [STACK]( 1653): k: sucess\n04-09 11:11:57.879 [STACK]( 1653): j: 84\n04-09 11:11:57.879 [STACK]( 1653): \n04-09 11:11:57.879 [STACK]( 1653): \n04-09 11:11:57.879 [STACK]( 1653): MESSAGE TO BE SENT IS']
>>> len(exepat.findall(s))
3
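
The non-greedy pattern returns every request, not only the ones for Donald. One possible way to narrow the result, sketched here under assumptions, is to split the log into requests first and then filter in plain Python. The shortened `log` string, the `[^\n]*` prefix (added so each match starts at the beginning of its line, as in the expected output) and the names `request_pat`, `requests` and `donald_requests` are illustrative choices, not part of the original code.

```python
import re

# Shortened stand-in for the log in the question: four requests, two for Donald.
log = """04-09 11:11:57.879 D/PTT [STACK]( 1653): *********Sending request
04-09 11:11:57.879 [STACK]( 1653): l: jockey
04-09 11:11:57.879 [STACK]( 1653): MESSAGE TO BE SENT IS
04-09 11:11:57.879 [STACK]( 1653): Not doing anything
04-09 11:11:57.879 D/PTT [STACK]( 1653): *********Sending request
04-09 11:11:57.879 [STACK]( 1653): l: Donald
04-09 11:11:57.879 [STACK]( 1653): MESSAGE TO BE SENT IS
04-09 11:11:57.879 D/PTT [STACK]( 1653): *********Sending request
04-09 11:11:57.879 [STACK]( 1653): l: Mickey
04-09 11:11:57.879 [STACK]( 1653): MESSAGE TO BE SENT IS
04-09 11:11:57.879 D/PTT [STACK]( 1653): *********Sending request
04-09 11:11:57.879 [STACK]( 1653): l: Donald
04-09 11:11:57.879 [STACK]( 1653): MESSAGE TO BE SENT IS"""

# [^\n]* pulls in the start of the line that contains "Sending request";
# the lazy .*? then stops at the first "TO BE SENT IS" that follows.
request_pat = re.compile(r"[^\n]*Sending request.*?TO BE SENT IS", re.DOTALL)

# One string per request, then keep only those whose "l:" field names Donald.
requests = request_pat.findall(log)
donald_requests = [req for req in requests if "l: Donald" in req]

print(len(requests))         # 4
print(len(donald_requests))  # 2
```

Putting "Donald" inside a single pattern, as in ".*Sending request.*?Donald.*?TO BE SENT IS", tends to go wrong because the leading .* is greedy and .*? is still free to cross into a later request while looking for "Donald". Filtering after the split sidesteps both problems.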
answered 2013-04-09T14:27:25.763