
I trained a word2vec model on my own corpus.

import io
import multiprocessing

from gensim.models import Word2Vec

corpus = "fewdata.txt"
# Read the corpus and split each line into a list of tokens (one sentence per line).
with io.open(corpus, mode="r", encoding="utf-8") as f:
    sentences = [line.split() for line in f]

# Train a skip-gram model (sg=1); `size` and `iter` are the gensim 3.x parameter names.
model = Word2Vec(sentences=sentences, size=100, sg=1, window=3, min_count=1, iter=10, workers=multiprocessing.cpu_count())
model.init_sims(replace=True)  # L2-normalize the vectors in place
model.save('model.bin')
model = Word2Vec.load('model.bin')
print(model)

Then:

model['aImIroawi']
array([-0.06561889, -0.15222837,  0.00912119, -0.11638119, -0.03242991,
       -0.13457145, -0.09813376,  0.07011288,  0.0711898 ,  0.10069774,
       -0.01028561,  0.11995316,  0.03737569, -0.01811702, -0.12935248],
      dtype=float32)

But I want to apply this model to a txt file with a vocabulary of 5333 words and save the result to a txt file in the form

{'Aimurawi': array([-0.04728228,  0.13645388,  0.13822217,  0.13086553, -0.0963688 ], dtype=float32),
 'Tiona': array([-0.04728228,  0.13645388,  0.13822217,  0.13086553, -0.0963688 ], dtype=float32)}

for every word in my text file. Can someone help me with how to do this?
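
One possible approach, as a minimal sketch rather than a definitive answer: read the words from the vocabulary file, look each one up, and write one line per word. The function name `export_vectors` and the "word v1 v2 ..." line format are assumptions made for illustration (the dict-of-arrays form above is what `print` shows, but it is awkward to read back from a file). The `vectors` argument is assumed to be anything that supports `word in vectors` and `vectors[word]`, such as `model.wv` from the trained model.

```python
import io


def export_vectors(vectors, vocab_path, out_path):
    """Write one 'word v1 v2 ...' line per vocabulary word found in `vectors`.

    `vectors` is assumed to support `word in vectors` and `vectors[word]`
    (for example `model.wv`); a plain dict of lists works as well.
    Returns the list of words that had no vector.
    """
    # Collect the unique words of the vocabulary file, keeping first-seen order.
    with io.open(vocab_path, mode="r", encoding="utf-8") as f:
        words = list(dict.fromkeys(f.read().split()))

    missing = []
    with io.open(out_path, mode="w", encoding="utf-8") as out:
        for word in words:
            if word not in vectors:
                # The model never saw this word, so there is nothing to write.
                missing.append(word)
                continue
            values = " ".join("{:.8f}".format(float(x)) for x in vectors[word])
            out.write("{} {}\n".format(word, values))
    return missing
```

With the model above this might be called as `missing = export_vectors(model.wv, "vocab.txt", "vectors.txt")`, where both file names are placeholders. Words that are absent from the trained model are returned instead of raising a `KeyError`. If every word in the model should be exported rather than only those in the text file, gensim's `model.wv.save_word2vec_format('vectors.txt', binary=False)` writes a similar plain-text format.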
