python - How to access a text file of Afrikaans words as an NLTK corpus

Question

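I have a text file containing plain-text sentences in Afrikaans. I would like to run NLTK corpus functions on this file, but I can't find any examples of how to do that.

I want to do things like: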

mytext.concordance("woord")
mytext.similar("woord")

Can anyone help me?

score 1 · Accepted Answer

I managed to figure some of it out:

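# Load a plain-text file as an NLTK corpus.
import nltk
from nltk.corpus import PlaintextCorpusReader
from nltk.corpus.util import LazyCorpusLoader

# Looks for a folder named "afrikaans" under corpora/ in the NLTK data path
# and reads every .txt file in it, skipping hidden files.
afrikaans = LazyCorpusLoader('afrikaans', PlaintextCorpusReader, r'(?!\.).*\.txt')
afrikaans.sents()[1]  # second sentence; sentence splitting needs the Punkt tokenizer data
# Wrap the tokens in nltk.Text to get concordance(), similar(), etc.
af = nltk.Text(afrikaans.words())
af.concordance("mense")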

This assumes your corpus text file is located at C:\nltk_data\corpora\afrikaans\afrikaans.txt
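If the file lives somewhere other than the NLTK data directory, a possible alternative is to point PlaintextCorpusReader at the folder directly instead of going through LazyCorpusLoader. The following is only a minimal sketch: the temporary folder and the two sample sentences are made up so that it runs anywhere, and concordance_list() is assumed to be available (it is in recent NLTK 3.x releases, but may be missing from older ones).

```python
import os
import tempfile

import nltk
from nltk.corpus import PlaintextCorpusReader

# Hypothetical sample data so the sketch is self-contained; in practice,
# set corpus_root to the folder that holds your own .txt files.
corpus_root = tempfile.mkdtemp()
with open(os.path.join(corpus_root, "afrikaans.txt"), "w", encoding="utf-8") as f:
    f.write("Die mense loop in die straat. Baie mense werk in die stad.\n")

# Read every .txt file directly under corpus_root.
reader = PlaintextCorpusReader(corpus_root, r".*\.txt", encoding="utf-8")

# words() uses the default word tokenizer, so no extra data downloads are needed.
af = nltk.Text(reader.words())

# Prints each occurrence of the word with its surrounding context.
af.concordance("mense")

# Same lookup, but returned as a list instead of printed.
hits = af.concordance_list("mense")

# Prints words that occur in similar contexts; only meaningful on a large corpus.
af.similar("mense")
```

Both concordance() and similar() print their results rather than returning them, so concordance_list() is the one to use if you need the matches in a variable.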
