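Python regular expression

I have a string that contains keywords, but sometimes those keywords are absent, and they do not appear in any particular order. I need help with a regular expression.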

The keywords are:

Up-to-date
date added
date trained

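These are the keywords I need to find among many other keywords. Any of them may be missing, and they can appear in any order.

What the string looks like: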

<div>
<h2 class='someClass'>text</h2>

 blah blah blah Up-to-date blah date added blah

</div>

What I have tried:

regex = re.compile('</h2>.*(Up\-to\-date|date\sadded|date\strained)*.*</div>') 

regex = re.compile('</h2>.*(Up\-to\-date?)|(date\sadded?)|(date\strained?).*</div>')

re.findall(regex,string) 

The result I am looking for:

If all exist:
['Up-to-date','date added','date trained']

If only some exist:
['Up-to-date','','date trained']

2 Answers


Does it have to be a regular expression? If not, you can use find:

In [12]: sentence = 'hello world cat dog'

In [13]: words = ['cat', 'bear', 'dog']

In [15]: [w*(sentence.find(w)>=0) for w in words]
Out[15]: ['cat', '', 'dog']
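
The same idea can be sketched against the keywords from the question, using the `in` operator instead of `find`. The function name `find_keywords` and the `sample` string are assumptions made for illustration; the sample is modeled on the markup shown in the question.

```python
KEYWORDS = ['Up-to-date', 'date added', 'date trained']

def find_keywords(text, keywords=KEYWORDS):
    """Return each keyword if it occurs in text, else '' (same order as keywords)."""
    return [w if w in text else '' for w in keywords]

# Hypothetical sample, modeled on the string in the question.
sample = ("<div>\n<h2 class='someClass'>text</h2>\n\n"
          " blah blah blah Up-to-date blah date added blah\n\n</div>")

print(find_keywords(sample))  # ['Up-to-date', 'date added', '']
```

Note that this searches the whole string, not only the part between `</h2>` and `</div>`, so it fits only if the keywords cannot occur elsewhere in the input.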
answered 2012-05-11T23:41:57.227

This code does what you want, but it is not pretty:

import re

def check(the_str):
    output_list = []
    # Raw strings keep the backslashes intact; re.DOTALL lets '.' match newlines.
    # The stray '*' after each keyword is dropped: it made the last letter optional.
    u2d = re.compile(r'</h2>.*Up-to-date.*</div>', re.DOTALL)
    da = re.compile(r'</h2>.*date\sadded.*</div>', re.DOTALL)
    dt = re.compile(r'</h2>.*date\strained.*</div>', re.DOTALL)
    # search() scans the whole string, so text before '</h2>' does not matter.
    if u2d.search(the_str):
        output_list.append("Up-to-date")
    if da.search(the_str):
        output_list.append("date added")
    if dt.search(the_str):
        output_list.append("date trained")

    return output_list

the_str = "</h2>My super cool string with the date added and then some more text</div>"
print(check(the_str))
the_str2 = "</h2>My super cool string date added with the date trained and then some more text</div>"
print(check(the_str2))
the_str3 = "</h2>My super cool string date added with the date trained and then Up-to-date some more text</div>"
print(check(the_str3))
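
If the empty-string placeholders from the question are wanted, one possible variant is to extract the text between `</h2>` and `</div>` once and then run a single alternation over it. This is only a sketch: the names `check_once`, `KEYWORD_RE` and `BODY_RE` are made up for illustration, and it assumes the words of each keyword are separated by a plain space rather than arbitrary whitespace.

```python
import re

KEYWORDS = ['Up-to-date', 'date added', 'date trained']

# One alternation built from the escaped keywords.
KEYWORD_RE = re.compile('|'.join(re.escape(k) for k in KEYWORDS))
# The text between </h2> and the next </div>; DOTALL lets '.' span newlines.
BODY_RE = re.compile(r'</h2>(.*?)</div>', re.DOTALL)

def check_once(the_str):
    """Return each keyword found between </h2> and </div>, or '' in its slot."""
    body = BODY_RE.search(the_str)
    if body is None:
        return [''] * len(KEYWORDS)
    found = set(KEYWORD_RE.findall(body.group(1)))
    return [k if k in found else '' for k in KEYWORDS]

print(check_once("</h2>My string date added with the date trained</div>"))
# ['', 'date added', 'date trained']
```

Collecting the matches into a set and then walking `KEYWORDS` keeps the output order fixed no matter where the keywords appear in the input.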
answered 2012-05-11T23:46:50.990