I need to search a string for any of several possible matches. Suppose I have this list:
['this', 'is', 'a', 'regex', 'test']
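I want to check whether any of these items occurs in a string, either with a regex or with any other method in Python.
I first tried string in list, but that turned out not to be enough, so I tried joining the conditions in a regex, for example: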
(this|is)(a|regex)(text)
But this tries to match several of the items in sequence, as if they were concatenated.
You can use the built-in function any():
In [1]: strs="I am a string"
In [2]: lis=['this', 'is', 'a', 'regex', 'test']
In [3]: any(x in strs for x in lis)
Out[3]: True
This will also return True for something like "thisisafoobar", because in tests for substrings rather than whole words.
However, if you want to match exact words, try re.search() or str.split():
In [4]: import re
In [5]: any(re.search(r"\b{0}\b".format(x),strs) for x in lis)
Out[5]: True
In [6]: strs="foo bar"
In [7]: any(re.search(r"\b{0}\b".format(x),strs) for x in lis)
Out[7]: False
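The per-item re.search() calls above can also be folded into a single pattern, which is roughly what the question was attempting with (this|is)(a|regex)(text). A minimal sketch, assuming the items should be matched as whole words. The helper name contains_word is made up for illustration, and re.escape() is added on the assumption that items might contain regex metacharacters:

```python
import re

def contains_word(words, text):
    """Return True if any item of `words` occurs in `text` as a whole word."""
    # Builds one alternation group, e.g. \b(?:this|is|a|regex|test)\b
    # re.escape() keeps characters such as '.' or '+' literal.
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"
    return re.search(pattern, text) is not None

lis = ['this', 'is', 'a', 'regex', 'test']
print(contains_word(lis, "I am a string"))   # True
print(contains_word(lis, "foo bar"))         # False
print(contains_word(lis, "thisisafoobar"))   # False
```

The alternatives sit inside one group separated by |, so the regex engine tries each item at every position instead of expecting them one after another.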
Using str.split():
In [12]: strs="I am a string"
In [13]: spl=strs.split() #use set(strs.split()) if the list returned is huge
In [14]: any(x in spl for x in lis)
Out[14]: True
In [15]: strs="Iamastring"
In [16]: spl=strs.split()
In [17]: any(x in spl for x in lis)
Out[17]: False
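The set() hint in the comment above can be taken one step further: with a set on one side, the whole check becomes a single set operation. A sketch under the assumption that whitespace-separated tokens are what count as words (punctuation stays attached to tokens, as with plain str.split()). The name has_common_word is illustrative only:

```python
def has_common_word(words, text):
    """Return True if any whitespace-separated token of `text` is in `words`."""
    # isdisjoint() returns True when the two collections share no element,
    # so the result is negated.
    return not set(words).isdisjoint(text.split())

lis = ['this', 'is', 'a', 'regex', 'test']
print(has_common_word(lis, "I am a string"))  # True
print(has_common_word(lis, "Iamastring"))     # False
```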
>>> l = ['this', 'is', 'a', 'regex', 'test']
>>> s = 'this is a test string'
>>> def check(elements, string):
...     for element in elements:
...         if element in string:
...             return True
...     return False
...
>>> check(l, s)
True
Apparently this function is faster than any(). Here is a test:
import time

def main():
    # Making a huge list
    l = ['this', 'is', 'a', 'regex', 'test'] * 10000
    s = 'this is a test string'

    def check(elements, string):
        for element in elements:
            if element in string:
                return True
        return False

    def test_a(elements, string):
        """Testing check()"""
        start = time.time()
        check(elements, string)
        end = time.time()
        return end - start

    def test_b(elements, string):
        """Testing any()"""
        start = time.time()
        any(element in string for element in elements)
        end = time.time()
        return end - start

    # print() with a single argument works on both Python 2 and Python 3
    print('Using check(): %s' % test_a(l, s))
    print('Using any(): %s' % test_b(l, s))

if __name__ == '__main__':
    main()
Results:
pearl:~ pato$ python test.py
Using check(): 3.09944152832e-06
Using any(): 5.96046447754e-06
pearl:~ pato$ python test.py
Using check(): 1.90734863281e-06
Using any(): 7.15255737305e-06
pearl:~ pato$ python test.py
Using check(): 2.86102294922e-06
Using any(): 6.91413879395e-06
However, if you combine any() with map(), as in any(map(lambda element: element in string, elements)), the results are as follows:
pearl:~ pato$ python test.py
Using check(): 3.09944152832e-06
Using any(): 0.00903916358948
pearl:~ pato$ python test.py
Using check(): 2.86102294922e-06
Using any(): 0.00799989700317
pearl:~ pato$ python test.py
Using check(): 3.09944152832e-06
Using any(): 0.00829982757568
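The gap in this second set of timings is consistent with how any() consumes its argument: it stops at the first true value, but in Python 2 map() builds the whole list before any() ever sees it. This assumes the timings were taken on Python 2. In Python 3 map() is lazy, so the effect may not reproduce there. The sketch below counts how many membership tests run in each case. The count_checks helper is hypothetical, and a list comprehension stands in for the eager Python 2 map():

```python
def count_checks(build, elements, string):
    """Run any() over build(test, elements) and count membership tests."""
    calls = [0]  # mutable counter shared with the nested function

    def test(element):
        calls[0] += 1
        return element in string

    result = any(build(test, elements))
    return result, calls[0]

l = ['this', 'is', 'a', 'regex', 'test'] * 10000
s = 'this is a test string'

# Generator expression: evaluated lazily, any() stops at the first match.
lazy = count_checks(lambda test, items: (test(e) for e in items), l, s)
# List comprehension: every element is tested before any() starts.
eager = count_checks(lambda test, items: [test(e) for e in items], l, s)

print(lazy)   # (True, 1)
print(eager)  # (True, 50000)
```

Because the first element of the list already matches, the lazy version does one test while the eager version does all 50000.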
You can do this:
if any(test in your_string for test in tests):
    ...
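If it also matters which item matched, the same generator expression can be handed to next() instead of any(). A small sketch; tests and your_string are the placeholder names from the snippet above, filled with sample values chosen here for illustration:

```python
tests = ['this', 'is', 'a', 'regex', 'test']
your_string = 'a simple regex example'

# First item of `tests` found in the string, or None if there is none.
first_match = next((test for test in tests if test in your_string), None)
print(first_match)  # a

# All items found in the string, in list order.
all_matches = [test for test in tests if test in your_string]
print(all_matches)  # ['a', 'regex']
```

Like any(), next() stops at the first hit, while the list comprehension always scans every item.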