I am studying word2vec, but when I train a word2vec model on text data, NumPy raises an OverflowError.
The output is:
model.vocab[w].sample_int > model.random.randint(2**32)]
Warning (from warnings module):
File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 636
warnings.warn("C extension not loaded for Word2Vec, training will be slow. "
UserWarning: C extension not loaded for Word2Vec, training will be slow. Install a C compiler and reinstall gensim for fast training.
Exception in thread Thread-1:
Traceback (most recent call last):
File "C:\Python34\lib\threading.py", line 920, in _bootstrap_inner
self.run()
File "C:\Python34\lib\threading.py", line 868, in run
self._target(*self._args, **self._kwargs)
File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 675, in worker_loop
if not worker_one_job(job, init):
File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 666, in worker_one_job
job_words = self._do_train_job(items, alpha, inits)
File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 623, in _do_train_job
tally += train_sentence_sg(self, sentence, alpha, work)
File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 112, in train_sentence_sg
word_vocabs = [model.vocab[w] for w in sentence if w in model.vocab and
File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 113, in <listcomp>
model.vocab[w].sample_int > model.random.randint(2**32)]
File "mtrand.pyx", line 935, in mtrand.RandomState.randint (numpy\random\mtrand\mtrand.c:9520)
OverflowError: Python int too large to convert to C long
Could you tell me what causes this?
My machine is x64 and the OS is Windows 7, but Python 3.4 is the 32-bit build. numpy and scipy are also 32-bit.
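
The last line of the traceback says the bound passed to randint (2**32) could not be converted to a C long. A minimal sketch for checking how wide a C long is under the interpreter in use, and whether 2**32 fits in it. The helper names c_long_max and fits_in_c_long are made up for illustration and are not part of numpy or gensim:

```python
import ctypes


def c_long_max():
    """Largest value a signed C long can hold for this interpreter build."""
    bits = ctypes.sizeof(ctypes.c_long) * 8
    return 2 ** (bits - 1) - 1


def fits_in_c_long(value):
    """True if value lies within the signed C long range on this platform."""
    upper = c_long_max()
    return -upper - 1 <= value <= upper


print("C long size in bytes:", ctypes.sizeof(ctypes.c_long))
print("C long max:", c_long_max())
print("2**32 fits in C long:", fits_in_c_long(2 ** 32))
```

If the last line prints False, then any call that has to convert 2**32 to a C long would be expected to fail the same way as the randint call in the traceback. I assume, but have not verified, that this is what happens inside the installed numpy version.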