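I want to use phrases in doc2vec, and I am using gensim.phrases for that. In doc2vec we need to tag the documents in order to train the model, and I am unable to tag the phrases. How can I do this?
Here is my code: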
text = phrases.Phrases(text)
for i in range(len(text)):
    string1 = "SENT_" + str(i)
    sentence = doc2vec.LabeledSentence(tags=string1, words=text[i])
    text[i] = sentence
print "Training model..."
model = Doc2Vec(text, workers=num_workers,
                size=num_features, min_count=min_word_count,
                window=context, sample=downsampling)