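I am running the source code of the TWE model, which requires compiling a C extension for Python. I have installed the Microsoft Visual C++ Compiler for Python 2.7 and Cython.

First, I need to run TWE/train.py: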

import gensim
sentence_word = gensim.models.word2vec.LineSentence("tmp/word.file")
print "Training the word vector..."
w = gensim.models.Word2Vec(sentence_word,size=400, workers=20)
sentence = gensim.models.word2vec.CombinedSentence("tmp/word.file","tmp/topic.file")
print "Training the topic vector..."
w.train_topic(topic_number, sentence)
print "Saving the topic vectors..."
w.save_topic("output/topic_vector.txt")
print "Saving the word vectors..."
w.save_wordvector("output/word_vector.txt")

Second, TWE/gensim/models/word2vec.py:

try:
    raise ImportError  # ignore for now
    from gensim_addons.models.word2vec_inner import train_sentence_sg,train_sentence_cbow, FAST_VERSION, train_sentence_topic
except ImportError:
    try:
        import pyximport
        print 'import pyximport'
        models_dir = os.path.dirname(__file__) or os.getcwd()
        print 'models_dir'
        pyximport.install(setup_args={"include_dirs": [models_dir, get_include()]})
        print 'pyximport'   # the problem is in the following import
        from word2vec_inner import (train_sentence_sg, train_sentence_cbow,
                                    FAST_VERSION, train_sentence_topic)
        print 'from word2vec'
    except:
        FAST_VERSION = -1
        def train_sentence_sg(model, sentence, alpha, work=None):
                   ...
        def train_sentence_cbow(model, sentence, alpha, work=None, neu1=None):
                   ...
class Word2Vec(utils.SaveLoad):
                   ...
    def train(self, sentences, total_words=None, word_count=0, chunksize=100):
        if FAST_VERSION < 0:
            import warnings
            warnings.warn("Cython compilation failed, training will be slow. Do you have Cython installed? `pip install cython`")
        logger.info("training model with %i workers on %i vocabulary and %i features, "
            "using 'skipgram'=%s 'hierarchical softmax'=%s 'subsample'=%s and 'negative sampling'=%s" %
            (self.workers, len(self.vocab), self.layer1_size, self.sg, self.hs, self.sample, self.negative))
        def worker_train():
            ...
            if self.sg:
                job_words = sum(train_sentence_topic(self, sentence, alpha, work) for sentence in job)
            else:
                job_words = sum(train_sentence_cbow(self, sentence, alpha, work, neu1) for sentence in job)
            ...

Third, I compiled TWE/gensim/models/word2vec_inner.pyx with this setup.py:

from distutils.core import setup  
from distutils.extension import Extension  
from Cython.Build import cythonize  
import numpy  
extensions = [  
    Extension("word2vec_inner", ["word2vec_inner.pyx"],  
              include_dirs=[numpy.get_include()])  
]  
setup(  
    name="word2vec_inner",  
    ext_modules=cythonize(extensions),  
)

I compiled word2vec_inner.pyx with the command "python setup.py install", but I get the following error:

E:\Python27\python.exe D:/pycharm/TWE/TWE1/train.py wordmap.txt model-final.tassign 100
import pyximport
models_dir
pyximport
word2vec_inner.c
e:\python27\lib\site-packages\numpy\core\include\numpy\npy_1_7_deprecated_api.h(12) : Warning Msg : Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
C:\Users\hp\.pyxbld\temp.win32-2.7\Release\pyrex\gensim\models\word2vec_inner.c(15079) : warning C4244 : 'initializing' : conversion from 'double' to 'float', possible loss of data
C:\Users\hp\.pyxbld\temp.win32-2.7\Release\pyrex\gensim\models\word2vec_inner.c(15085) : warning C4244 : 'initializing' : conversion from 'double' to 'float', possible loss of data
LINK : fatal error LNK1104: cannot open file 'C:\Users\hp\.pyxbld\lib.win32-2.7\gensim\models\word2vec_inner.pyd'
Training the word vector...
D:\pycharm\TWE\TWE1\gensim\models\word2vec.py:410: UserWarning: Cython compilation failed, training will be slow. Do you have Cython installed? `pip install cython`
warnings.warn("Cython compilation failed, training will be slow. Do you have Cython installed? `pip install cython`")
PROGRESS: at 100.00% words, alpha 0.02500, 2556 words/s
Training the topic vector...
D:\pycharm\TWE\TWE1\gensim\models\word2vec.py:882: UserWarning: Cython compilation failed, training will be slow. Do you have Cython installed? `pip install cython`
warnings.warn("Cython compilation failed, training will be slow. Do you have Cython installed? `pip install cython`")
Exception in thread Thread-23:
Traceback (most recent call last):
  File "E:\Python27\lib\threading.py", line 801, in __bootstrap_inner
      self.run()
  File "E:\Python27\lib\threading.py", line 754, in run
      self.__target(*self.__args, **self.__kwargs)
  File "D:\pycharm\TWE\TWE1\gensim\models\word2vec.py", line 909, in worker_train
     job_words = sum(train_sentence_topic(self, sentence, alpha, work) for sentence in job)
  File "D:\pycharm\TWE\TWE1\gensim\models\word2vec.py", line 909, in <genexpr>
     job_words = sum(train_sentence_topic(self, sentence, alpha, work) for sentence in job)
NameError: global name 'train_sentence_topic' is not defined

Saving the topic vectors...
Saving the word vectors...

Process finished with exit code 0

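I checked that the .pyx file was compiled correctly and that Cython is installed. In short, train_sentence_sg, train_sentence_cbow, FAST_VERSION, and train_sentence_topic cannot be imported from either gensim/models/word2vec_inner or gensim_addons/models/word2vec_inner, and that is what causes these problems. But why? I compiled the .pyx file correctly in both directories. Can anyone help? This problem has bothered me for several days. Thank you!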

1 Answer

I ran into the same problem with PyCharm 2018.1 + Python 3.6.2.

This line is the key:

LINK : fatal error LNK1104: cannot open file 'C:\Users\hp\.pyxbld\lib.win32-2.7\gensim\models\word2vec_inner.pyd'

This error message is misleading. For Python, this error actually means:

  • cannot open file to write to it.

The file may have been locked for writing by some process, so the linker cannot finish its work.
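One quick way to test this theory is to try opening the .pyd file for appending from a fresh Python process. The helper below is only a sketch: the name can_open_for_writing is made up for illustration, and it assumes that Windows refuses write access to a .pyd that another process has loaded, which is the usual behaviour but is not confirmed here for every setup. The path is the one from the LNK1104 message above.

```python
import os


def can_open_for_writing(path):
    """Return True if `path` is missing or can be opened for appending.

    Hypothetical helper: a False result suggests (but does not prove)
    that another process is holding the file.
    """
    if not os.path.exists(path):
        # Nothing to lock yet, so the linker should be able to create it.
        return True
    try:
        # Append mode asks for write access without truncating the file.
        with open(path, "ab"):
            pass
        return True
    except (IOError, OSError):
        return False


# Path copied from the linker error message.
pyd_path = r"C:\Users\hp\.pyxbld\lib.win32-2.7\gensim\models\word2vec_inner.pyd"
print(can_open_for_writing(pyd_path))
```

If this prints False while PyCharm or a Python console is open, and True after closing them, a lock is the likely culprit.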

Solution 1

An earlier Python statement, import word2vec_inner, locked the file against writing. Reset the Python console to release the lock.

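Solution 2

Use Process Explorer to find out which program is locking the file. Press Ctrl-F and type the name of the locked file; it will show you the process that holds the lock.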

Solution 3

Exit PyCharm, then precompile the Cython file into a package from the command line. In case this link changes, here is a mirror:

Imagine a simple "hello world" script in a file hello.pyx:

def say_hello_to(name):
    print("Hello %s!" % name)

The following could be a corresponding setup.py script:

from distutils.core import setup
from Cython.Build import cythonize

setup(
  name = 'Hello world app',
  ext_modules = cythonize("hello.pyx"),
)

To build, run python setup.py build_ext --inplace. Then simply start a Python session, run from hello import say_hello_to, and use the imported function as you see fit.
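After building, it can help to confirm which file Python actually loads, since a stale or missing .pyd is easy to overlook. The sketch below uses only the standard library; describe_module is a hypothetical helper name, and "hello" is the example module from the text above.

```python
import importlib


def describe_module(name):
    """Try to import `name` and report where it was loaded from.

    Hypothetical helper for checking a freshly built extension module.
    """
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        # Surface the real reason instead of hiding it in a bare except.
        return "import failed: %s" % exc
    return "loaded from %s" % getattr(module, "__file__", "<built-in>")


print(describe_module("hello"))
```

Unlike the bare except in the question's word2vec.py, this reports the actual ImportError message, which makes a failed build easier to diagnose.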

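One caveat if you use setuptools instead of distutils: the default action when running python setup.py install is to create a zipped egg file, which will not work with cimport for pxd files when you try to use them from a dependent package. To prevent this, include zip_safe=False in the arguments to setup().

Answered 2018-05-18T07:28:22.033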