Using: Watson Studio Python 3.5 with Spark Python notebook: https://gist.github.com/anonymous/ea77f500b4fd80feb69fadb470fca235

This part raises an error:

from IPython.display import Image
import pydotplus
dot_data = tree.export_graphviz(regr, out_file=None, feature_names=X_train.columns.values, filled=True)
graph = pydotplus.graph_from_dot_data(dot_data)

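The error is: ImportError: No module named 'pydotplus'

Is there another environment that actually has this module installed? Or is there a way to install or add this Python module to the existing runtime?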

1 Answer

Found the answer in the IBM Cloud documentation.

https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/importing-libraries.html

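Installing custom libraries and packages on Apache Spark
Last updated: March 1, 2019

When you associate Apache Spark with a notebook in Watson Studio, many preinstalled libraries are included. Before you install a library, check the list of preinstalled libraries. Run the appropriate command from a notebook cell: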

Python: !pip list --isolated
R: installed.packages()
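
As a programmatic alternative to reading the `pip list` output, the following minimal sketch checks from Python whether specific modules can be imported in the current kernel. The helper name `missing_modules` is hypothetical and not part of the IBM documentation; it relies only on the standard-library function `importlib.util.find_spec`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be imported here."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example: check the modules used in the question before running the cell.
print(missing_modules(["pydotplus", "IPython"]))
```

An empty list means every module is available; any name that is printed still needs to be installed.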

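If the library you want is not listed, or you want to use a Scala library in a notebook, use the steps in the following sections to install it. The format of the library package depends on the programming language.

Using Scala libraries

Libraries for Scala notebooks are typically packaged as Java™ archive (JAR) files.

Temporarily caching libraries

Libraries for Scala notebooks are not installed to the Spark service. Instead, they are cached when they are downloaded and are available only while the notebook runs.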

To use a single library without dependencies, from a public web server:
    Locate the publicly available URL to the library that you want to install. If you create a custom library, you can post it to any publicly available repository, such as GitHub.

    Download the library you want to use in your notebook by running the following command in a code cell:

     %AddJar URL_to_jar_file  

To use a library with dependencies, from a public Maven repository:

    Add and import a library with all its dependencies by running the following command. You need the groupId, artifactId, and version of the dependency. For example:

     %AddDeps org.apache.spark spark-streaming-kafka_2.10 1.1.0 --transitive

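Permanently installing libraries

If you want files to be available to spark-submit jobs and Scala kernels, or want to access files from other kernels through the Java bridge, for example to use JDBC drivers, you can install libraries permanently to ~/data/libs/ from Python or R.

The file path for libraries installed to ~/data/libs/ depends on the Scala version that the libraries require: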

Use ~/data/libs/ for libraries that work with any Scala version.
Use ~/data/libs/scala-2.11/ for libraries that require Scala 2.11. The Scala kernel for Spark 2.1 uses Scala 2.11.

To install a library:

Locate the publicly available URL to the library that you want to install.

Download the library you want to install permanently into ~/data/libs/ by running the following command in a Python notebook:

 !(cd ~/data/libs/ ; wget URL_to_jar_file)
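
The same download can be sketched in plain Python instead of shelling out to wget. This is only an illustration under stated assumptions: the helper names `jar_target_path` and `download_jar` are hypothetical, and the `scala_version` argument simply selects the `scala-<version>` subdirectory described above.

```python
import os
import posixpath
from urllib.parse import urlparse
from urllib.request import urlretrieve


def jar_target_path(url, scala_version=None, base="~/data/libs"):
    """Compute where a JAR from `url` would be stored under ~/data/libs/."""
    target_dir = os.path.expanduser(base)
    if scala_version is not None:
        # Version-specific libraries go into e.g. ~/data/libs/scala-2.11/
        target_dir = os.path.join(target_dir, "scala-" + scala_version)
    filename = posixpath.basename(urlparse(url).path)
    return os.path.join(target_dir, filename)


def download_jar(url, scala_version=None):
    """Download the JAR into the matching library directory."""
    target = jar_target_path(url, scala_version)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    urlretrieve(url, target)
    return target


# Usage (URL_to_jar_file stands for the real URL, as in the command above):
# download_jar("URL_to_jar_file", scala_version="2.11")
```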

Installing Python libraries

Use the Python pip package installer command to install Python libraries to your notebook. For example, run the following command in a code cell to install the prettyplotlib library:

 !pip install --user prettyplotlib

The --user flag installs the library for personal usage rather than the global default. The installed packages can be used by all notebooks that use the same Python version in the Spark service.
Use the Python import command to import the library components. For example, run the following command in a code cell:

 import prettyplotlib as ppl

Restart the kernel.
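
Applied to the module from the question, the steps above amount to running `!pip install --user pydotplus` in a cell and then restarting the kernel. The sketch below does the same from Python code; the helper names `pip_install_command` and `install` are hypothetical. It invokes pip through `sys.executable` so the package is installed for the same interpreter that the notebook kernel is running.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build the equivalent of `pip install --user <package>`."""
    # Running pip as a module of the current interpreter avoids
    # installing into a different Python than the kernel uses.
    return [sys.executable, "-m", "pip", "install", "--user", package]


def install(package):
    """Run the pip command and raise CalledProcessError if it fails."""
    subprocess.check_call(pip_install_command(package))


# In the notebook from the question:
# install("pydotplus")
# Then restart the kernel and run `import pydotplus` again.
```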

Loading R packages

Use the R install.packages() function to install new R packages. For example, run the following command in a code cell to install the ggplot2 package for plotting functions:

 install.packages("ggplot2")

The imported package can be used by all R notebooks running in the Spark service.

Use the R library() function to load the installed package. For example, run the following command in a code cell:

 library("ggplot2")

You can now call plotting functions from the ggplot2 package in your notebook.
Answered 2019-04-05T12:06:19.010