Given the index of a word in a text, I need to get the corresponding character index. For example, in the text below:
"The cat called other cats."
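the index of the word "cat" is 1. I need the index of the first character of cat, i.e. c, which would be 4. I don't know if this is relevant, but I am using python-nltk to get the words. Right now the only way I can think of to do this is: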
- Get the first character, find the number of words in this piece of text
- Get the first two characters, find the number of words in this piece of text
- Get the first three characters, find the number of words in this piece of text
Repeat, growing the prefix one character at a time, until the word count reaches the required word.
But this would be very inefficient. Any ideas would be appreciated.
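For reference, here is a minimal sketch of the prefix-scanning idea described above. It assumes plain whitespace splitting (`str.split`) as a stand-in for the NLTK tokenizer, so punctuation stays attached to words, and the function name `char_index_by_prefix_scan` is hypothetical:

```python
def char_index_by_prefix_scan(text: str, word_index: int) -> int:
    """Return the index of the first character of the word at word_index.

    Assumption: words are whitespace-separated (str.split), which only
    approximates what an NLTK tokenizer would produce.
    """
    # Grow the prefix one character at a time and re-count the words.
    for end in range(1, len(text) + 1):
        words_so_far = len(text[:end].split())
        # The first prefix containing word_index + 1 words ends on the
        # first character of the word we are looking for.
        if words_so_far == word_index + 1:
            return end - 1
    raise IndexError(f"text has no word at index {word_index}")


print(char_index_by_prefix_scan("The cat called other cats.", 1))  # prints 4
```

Each step re-splits the whole prefix, so the work grows roughly quadratically with the length of the text, which is the inefficiency I am worried about.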