python - Find a phrase in a sentence

Question

I have to write a program that finds a phrase in a sentence based on user-specified keywords:

The Large Hadron Collider (LHC) is the world’s largest and most powerful particle accelerator. This site includes the latest news from the project, accessible explanations of how the LHC works, how it is funded, who works there and what benefits it brings us. You can access a wide range of resources for the public, journalists and teachers and students, there are also many links to other sources of information. The Large Hadron Collider at CERN near Geneva, Switzerland is opening new vistas on the deepest secrets of the universe, stretching the imagination with newly discovered forms of matter, forces of nature, and dimensions of space.

The user specifies:

['large', 'big', 'heavy']

I'm not sure how to pick up the words just before and after a keyword held in a variable, for example:

keyword = 'large'

It must return

The Large Hadron

because large appears in the sentence. How can I get one word before and one word after any of the keywords in the sentence?

score 3 · Accepted Answer

test_word = 'large'
my_string = 'The Large Hadron Collider (LHC) is the world’s largest and most powerfulparticle accelerator.This site includes the latest news from the project, accessible explanations of how the LHC works, how it is funded, who works there and what benefits it brings us' 
# I truncated your sentence

test_words = my_string.lower().split()
correct_case = my_string.split() # this will preserve the case of the original words
# and it will be identical in length to test words with each word in the same position
position = test_words.index(test_word)

my_new_string = ' '.join(correct_case[position-1:position+2])

To be clear, the two lists contain the same words in the same positions; test_words just holds them in lower case. Because test_word sits at the same index in both lists, you can use its position in test_words to pull the correctly cased words from correct_case.
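The same two-list idea can be wrapped in a small function. This is only a sketch: the name phrase_around is made up for illustration, and it assumes you want None back when the keyword is absent (list.index raises ValueError in that case) and a shorter phrase when the keyword is the first word.

```python
def phrase_around(sentence, keyword):
    """Return the keyword with one word on each side, in the original case."""
    original = sentence.split()          # keeps the original case
    lowered = sentence.lower().split()   # same words, same positions, lower case
    try:
        position = lowered.index(keyword.lower())
    except ValueError:
        # list.index raises ValueError when the keyword is not in the list
        return None
    # Clamp the start so a keyword at position 0 does not produce a negative index
    start = max(position - 1, 0)
    return " ".join(original[start:position + 2])


print(phrase_around("The Large Hadron Collider (LHC)", "large"))  # The Large Hadron
print(phrase_around("The Large Hadron Collider (LHC)", "big"))    # None
```

Only the first occurrence is returned, which matches what index does in the answer above.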

score 0 · Accepted Answer

What about using index to get the position of the keyword, then slicing the list of words to take one word on either side of the keyword?

In [1]: s = 'The Large Hadron Collider (LHC) is the world’s largest and most powerfulparticle accelerator.'
In [2]: words = s.split() 
In [3]: words_lower = s.lower().split() #lowercase words so keyword matching is easy.
In [4]: keyword = 'large'
In [5]: i = words_lower.index(keyword)
In [6]: phrase = ' '.join(words[i-1:i+2])
In [7]: phrase
Out[7]: 'The Large Hadron'
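One thing to watch with the session above: split() leaves punctuation attached, so tokens such as '(LHC)' or 'accelerator.' will not equal a bare keyword. A possible workaround, sketched here under the assumption that trimming leading and trailing punctuation is acceptable for matching, is to normalise the tokens for comparison only. The function name find_with_punctuation is hypothetical.

```python
import string


def find_with_punctuation(sentence, keyword):
    """Match the keyword against tokens with surrounding punctuation trimmed."""
    words = sentence.split()
    # Normalise for comparison only; the returned phrase uses the original tokens
    normalised = [w.lower().strip(string.punctuation) for w in words]
    target = keyword.lower()
    if target not in normalised:
        return None
    i = normalised.index(target)
    return " ".join(words[max(i - 1, 0):i + 2])


s = "The Large Hadron Collider (LHC) is the world's largest particle accelerator."
print(find_with_punctuation(s, "lhc"))          # Collider (LHC) is
print(find_with_punctuation(s, "accelerator"))  # particle accelerator.
```

Note that string.punctuation covers ASCII punctuation only, so a typographic apostrophe or quote would need to be added to the strip set separately.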

score 0 · Accepted Answer

text = "The Large Hadron Collider (LHC) is the world’s largest and most powerfulparticle accelerator.This site includes the latest news from the project, accessible explanations of how the LHC works, how it is funded, who works there and what benefits it brings us.You can access a wide range of resources for the public, journalists and teachers and students, there are also many links to other sources of information.The Large Hadron Collider atCERNnear Geneva, Switzerland is opening new vistas on the deepest secrets of the universe, stretching the imagination with newly discovered forms of matter, forces of nature, and dimensions of space."
keywords = ['large', 'is', 'most']
text = text.lower().split(' ')
results = []
for word in keywords:
    indx = text.index(word)
    results.append(" ".join(text[indx-1:indx+2]))

print(results)
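With the keyword list from the question, ['large', 'big', 'heavy'], the loop above would stop with a ValueError at 'big', because index raises when the word is not found. It also reports only the first occurrence and returns everything in lower case. A variant that might address all three points is sketched below; find_all_phrases is an illustrative name, and the shortened sample text is an assumption made to keep the example small.

```python
def find_all_phrases(text, keywords):
    """Map each keyword to every three-word phrase centred on it."""
    words = text.split()                  # original case, used for the output
    lowered = [w.lower() for w in words]  # lower case, used for matching
    results = {}
    for keyword in keywords:
        target = keyword.lower()
        # enumerate visits every position, so repeated keywords are all reported
        results[keyword] = [
            " ".join(words[max(i - 1, 0):i + 2])
            for i, w in enumerate(lowered)
            if w == target
        ]
    return results


text = ("The Large Hadron Collider is a particle accelerator. "
        "The Large Hadron Collider at CERN is near Geneva.")
found = find_all_phrases(text, ["large", "big", "heavy"])
print(found)
# {'large': ['The Large Hadron', 'The Large Hadron'], 'big': [], 'heavy': []}
```

Keywords that never appear simply map to an empty list, so the caller can decide how to report them.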
