A naive approach can be very simple: use the string split method, for example:
sentences = ["I want to learn about linear regression", "I want to read about SVM", "I want to go to Python 2.6",
             "Take me to logistic regression: eval"]
split_terms = ["about", "go to", "learn"]
for sentence in sentences:
    for split_term in split_terms:
        try:
            print(sentence.split(split_term)[1])
        except IndexError:
            pass  # split_term was not found in a sentence
Result:
linear regression
about linear regression
SVM
Python 2.6
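A smarter approach is to first find the split term that occurs last in the sentence, which avoids the duplicate "learn" / "about" output for "learn about":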
for sentence in sentences:
    # Track the split term whose match starts furthest to the right.
    last_split_term_index = -1  # str.find returns -1 when there is no match
    last_split_term = ""
    for split_term in split_terms:
        last_split_term_index_candidate = sentence.find(split_term)
        if last_split_term_index_candidate > last_split_term_index:
            last_split_term_index = last_split_term_index_candidate
            last_split_term = split_term
    if not last_split_term:
        continue  # no split term was found in this sentence
    print(sentence.split(last_split_term)[1])
Result:
linear regression
SVM
Python 2.6
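
The same idea can be wrapped in a small reusable function. This is only a sketch under a few assumptions: the name `extract_topic` is hypothetical, it slices the sentence instead of calling `split`, it strips the surrounding whitespace, and it returns `None` when no split term is found.

```python
def extract_topic(sentence, split_terms):
    """Return the text after the split term that starts last in the sentence, or None."""
    best_index = -1  # str.find returns -1 when the term is absent
    best_term = None
    for term in split_terms:
        index = sentence.find(term)
        if index > best_index:
            best_index, best_term = index, term
    if best_term is None:
        return None  # none of the split terms occur in the sentence
    # Keep everything after the chosen term, without surrounding spaces.
    return sentence[best_index + len(best_term):].strip()


sentences = ["I want to learn about linear regression", "I want to read about SVM",
             "I want to go to Python 2.6", "Take me to logistic regression: eval"]
split_terms = ["about", "go to", "learn"]
topics = [extract_topic(s, split_terms) for s in sentences]
print(topics)
```

Returning `None` rather than skipping the sentence lets the caller decide how to handle input that matches no split term, such as the last sentence here.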