Suppose I have a paragraph. I split it into sentences with sent_tokenize:
variable = ['By the 1870s the scientific community and much of the general public had accepted evolution as a fact.',
'However, many favoured competing explanations and it was not until the emergence of the modern evolutionary synthesis from the 1930s to the 1950s that a broad consensus developed in which natural selection was the basic mechanism of evolution.',
'Darwin published his theory of evolution with compelling evidence in his 1859 book On the Origin of Species, overcoming scientific rejection of earlier concepts of transmutation of species.']
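Now I split each sentence into words and append them to some variable. How can I find the two sentences that have the most words in common? I'm not sure how to approach this. If I have 10 sentences, I would have 90 checks (one for each ordered pair of sentences). Thanks.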
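
One possible approach, sketched under some assumptions: turn each sentence into a set of lowercased words, then compare every unordered pair with itertools.combinations, which halves the work because comparing A with B is the same as comparing B with A. The helper names word_set and most_similar_pair are hypothetical, and the regex tokenizer is a simple stand-in for nltk's word_tokenize so that the snippet runs without NLTK data.

```python
import re
from itertools import combinations


def word_set(sentence):
    """Return the set of lowercased word tokens in a sentence.

    Assumption: a plain regex is good enough here; swap in
    nltk.word_tokenize if you need proper tokenization.
    """
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def most_similar_pair(sentences):
    """Return (i, j, shared_words) for the pair sharing the most distinct words.

    Returns None when there are fewer than two sentences.
    Ties are resolved in favour of the first pair encountered.
    """
    sets = [word_set(s) for s in sentences]
    best = None
    # combinations yields each unordered index pair (i, j) with i < j exactly once
    for i, j in combinations(range(len(sets)), 2):
        shared = sets[i] & sets[j]  # set intersection = words in both sentences
        if best is None or len(shared) > len(best[2]):
            best = (i, j, shared)
    return best
```

With the list from the question this would be called as `i, j, shared = most_similar_pair(variable)`, after which `variable[i]` and `variable[j]` are the two sentences. Because the count is based on sets, repeated words are counted once; very common words such as "the" and "of" will tend to dominate the overlap, so filtering stopwords first may give more meaningful results.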