
I am using AI Platform on Google Cloud Platform to train a Random Forest Classifier with scikit-learn, based on this template from the Google Cloud Platform GitHub.

I have adjusted the code in some places to fit my own problem. The code is written in Python 3.5, using PyCharm on an Ubuntu machine. Training the model in the cloud works perfectly fine using the following terminal command (excluding the additional arguments):

gcloud ai-platform jobs submit training

But when I try to use the local training functionality of ai-platform inside my virtual environment (Python 3.5):

gcloud ai-platform local train

(again excluding the additional arguments), it returns the following error:

Traceback (most recent call last):
  File "/snap/google-cloud-sdk/99/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/snap/google-cloud-sdk/99/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/merijn/PycharmProjects/user-matching/trainer/task.py", line 28, in <module>
    from trainer import model
  File "trainer/model.py", line 28, in <module>
    from trainer import utils
  File "trainer/utils.py", line 23, in <module>
    from tensorflow import gfile
ImportError: No module named tensorflow

All the dependencies are properly installed within my virtual environment, including TensorFlow. Before the TensorFlow import error, I got an sklearn import error, which I solved by installing the sklearn module in my normal (system) environment. This supports my guess that the problem comes from the Google Cloud SDK running on Python 2.7 in my normal environment. So when I run the gcloud command within my venv, it most likely runs my whole program in my normal environment instead of my venv, and so far I have been unable to force it to run in my venv. Note that I have already tried many different values for the arguments --job-dir and --package-path.

After days of searching the internet I still can't find a way to train locally with AI Platform in a virtual environment with Python 3.5 installed. Hopefully you can help me out.


1 Answer


You are right. This is related to gcloud being unable to execute Python 3 programs locally.

There is a very simple workaround: don't use gcloud ai-platform local train. Instead, just invoke the Python interpreter directly:

export PYTHONPATH=${PYTHONPATH}:/some/dir/package/path
python3 -m trainer.task --job-dir /tmp ...
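To confirm that the trainer really runs under the virtual environment's interpreter, a small check along these lines may help. This is only a sketch: the script name `check_env.py` and the helper names are hypothetical, the module names `tensorflow` and `sklearn` are taken from the import errors in the question, and the virtual-environment test assumes a `venv`-style environment where `sys.base_prefix` differs from `sys.prefix`.

```python
# check_env.py (hypothetical helper, not part of the GCP template)
import importlib.util
import sys


def describe_interpreter():
    """Report which Python interpreter is executing this script."""
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return {
        "executable": sys.executable,
        "version": "%d.%d" % sys.version_info[:2],
        # Assumption: a venv-style environment changes sys.prefix.
        "in_venv": sys.prefix != base_prefix,
    }


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print(describe_interpreter())
    # Module names taken from the import errors in the question.
    print("missing:", missing_modules(["tensorflow", "sklearn"]))
```

Running it with the same `python3` used for `python3 -m trainer.task` shows the interpreter path and any packages that interpreter cannot see; if the path points outside the venv, the venv is probably not activated in that shell.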
Answered 2019-09-25T07:33:13.260