I am packaging a project that uses nltk. When you install nltk with pip, you get the core functionality, but not all of the extra modules that go with it (data packages such as punkt). To get those, you call nltk's download method.

I tried the following, but it fails with ImportError: No module named nltk. I assume this happens because import nltk runs before nltk has actually been installed by the call to setup(...).

Is there a clean way of having a post-install step with distribute that executes one of the following?

$ python -m nltk.downloader punkt
>>> import nltk; nltk.download('punkt')

Here's my failed attempt at setup.py:

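# distribute installs itself as the setuptools package
from setuptools import setup
from setuptools.command.install import install

class my_install(install):
    def run(self):
        install.run(self)
        # This is where it fails: ImportError: No module named nltk
        import nltk
        nltk.download('punkt')

setup(
    ...
    install_requires = [..., 'nltk==2.0.4'],
    cmdclass={'install': my_install},
)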
2 Answers

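pip does not handle dependencies like these, so you need to either write a README that tells your users what they have to install, or provide a script that runs pip install on everything you need.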

I think the second way is the way to go, together with a README explaining what is going on.

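As a Debian maintainer, I can tell you that install commands that download things are considered unacceptable there. A package must list its dependencies on other packages, and it gets installed once those dependencies are met. I think that is the sane way to do it in general. http://wiki.debian.org/UpstreamGuide#No_Downloads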
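
A minimal sketch of what such a script could look like, assuming it lives next to the README in a hypothetical bootstrap.py. The helper names build_commands and run_all are made up for illustration; the only things taken from the question are the nltk==2.0.4 pin, the punkt package, and the python -m nltk.downloader command line.

```python
import subprocess
import sys


def build_commands(requirements, nltk_packages):
    """Return the commands a bootstrap script would run, in order.

    Both arguments are plain lists of strings. The function only builds
    the command lines; nothing is executed here.
    """
    commands = []
    if requirements:
        # Install the Python packages first, using the pip that belongs
        # to the interpreter running this script.
        commands.append(
            [sys.executable, "-m", "pip", "install"] + list(requirements)
        )
    for name in nltk_packages:
        # Fetch each NLTK data package in a fresh process, so nltk is
        # only imported after pip has finished installing it.
        commands.append([sys.executable, "-m", "nltk.downloader", name])
    return commands


def run_all(commands):
    """Run each command in turn, stopping at the first failure."""
    for cmd in commands:
        # check_call raises CalledProcessError on a non-zero exit code.
        subprocess.check_call(cmd)
```

Users would then run something like run_all(build_commands(["nltk==2.0.4"], ["punkt"])) once, after reading the README. Keeping the download out of setup.py means the package itself installs without network access beyond what pip already needs.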

answered 2013-03-11T20:14:51.387

I used the command-line installation method and it worked. Like this...

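import subprocess
import sys

from setuptools.command.install import install

class my_install(install):
    def run(self):
        install.run(self)
        # Run the downloader in a child process, with the same interpreter
        # that is running setup.py, so nltk is imported only after install.
        cmd = [sys.executable, "-m", "nltk.downloader", "punkt"]
        subprocess.check_call(cmd)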
answered 2016-06-25T06:56:15.323