python - SystemML: cannot import submodule mllearn (and therefore the Keras2DML function)

Question

I am using IBM Watson Studio (default Spark Python environment) and trying to convert a Keras model to SystemML DML and train it on Spark.

!pip install systemml 
import systemml

This executes fine. But this:

from systemml import mllearn

throws SyntaxError: import * only allowed at module level

dir(systemml)

does not show mllearn.

I tried installing it from http://www.romeokienzler.com/systemml-1.0.0-SNAPSHOT-python.tar.gz and https://sparktc.ibmcloud.com/repo/latest/systemml-1.0.0-SNAPSHOT-python.tar.gz, and from a git clone, but without success. What am I doing wrong?

score 2 · Accepted Answer

You need to run dir(systemml.mllearn) to see the mllearn functions.

>>> dir(systemml.mllearn)
['Caffe2DML', 'Keras2DML', 'LinearRegression', 'LogisticRegression', 
'NaiveBayes', 'SVM', '__all__', '__builtins__', '__doc__', '__file__', 
'__name__', '__package__', '__path__', 'estimators']
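One likely reason dir(systemml) does not list mllearn: Python only exposes a submodule as an attribute of its package once that submodule has been imported, so a submodule that fails to import never appears there. A minimal sketch for surfacing the underlying error instead; try_import is a hypothetical helper written for this illustration, not part of SystemML:

```python
import importlib


def try_import(name):
    """Return (module, None) on success, or (None, 'ErrorType: message') on failure."""
    try:
        return importlib.import_module(name), None
    except (ImportError, SyntaxError) as exc:
        # A missing package raises ImportError; a file that does not
        # parse under the running interpreter raises SyntaxError.
        return None, "%s: %s" % (type(exc).__name__, exc)


module, error = try_import("systemml.mllearn")
if error:
    print(error)
else:
    # List the public names, comparable to the dir() output above.
    print(sorted(n for n in dir(module) if not n.startswith("_")))
```

If the import fails, the printed error type tells you whether the package is missing entirely or whether one of its files is rejected by the interpreter you are running.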

Please install SystemML 1.2 from pypi.org. Version 1.2 is the latest release as of August 2018. Version 1.0 had only experimental support.
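To see which version pip actually installed without starting Spark, something like the following may help. It is only a sketch: it assumes Python 3.8 or newer (for importlib.metadata), and installed_version is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints None when no distribution named "systemml" is installed.
print("systemml:", installed_version("systemml"))
```

This reads the package metadata only, so it works even when importing the package itself fails.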

Can you try importing just MLContext, to see whether loading the main SystemML jar file works and which version your installation uses?

>>> from systemml import MLContext
>>> ml = MLContext(sc)

Welcome to Apache SystemML!
Version 1.2.0

>>> print (ml.buildTime())
2018-08-17 05:58:31 UTC

>>> from sklearn import datasets, neighbors
>>> from systemml.mllearn import LogisticRegression

>>> digits = datasets.load_digits()
>>> X_digits = digits.data
>>> y_digits = digits.target
>>> n_samples = len(X_digits) 
>>> X_train = X_digits[:int(.9 * n_samples)] 
>>> y_train = y_digits[:int(.9 * n_samples)] 
>>> X_test = X_digits[int(.9 * n_samples):] 
>>> y_test = y_digits[int(.9 * n_samples):] 
>>> 
>>> logistic = LogisticRegression(spark)
>>> 
>>> print('LogisticRegression score: %f' % logistic.fit(X_train, y_train).score(X_test, y_test))
18/10/20 00:15:52 WARN BaseSystemMLEstimatorOrModel: SystemML local memory     budget:5097 mb. Approximate free memory available on the driver JVM:416 mb.
18/10/20 00:15:52 WARN StatementBlock: WARNING: [line 81:0] -> maxinneriter --     Variable maxinneriter defined with different value type in if and else clause.
18/10/20 00:15:53 WARN SparkExecutionContext: Configuration parameter     spark.driver.maxResultSize set to 1 GB. You can set it through Spark default configuration setting either to 0 (unlimited) or to available memory budget of size 4 GB.
BEGIN MULTINOMIAL LOGISTIC REGRESSION SCRIPT
...

score 1 · Accepted Answer

Finally, if you are using IBM Cloud notebooks, this works perfectly:

1)

! pip install --upgrade https://github.com/niketanpansare/future_of_data/raw/master/systemml-1.3.0-SNAPSHOT-python.tar.gz

2)

!ln -s -f /home/spark/shared/user-libs/python3/systemml/systemml-java/systemml-1.3.0-SNAPSHOT-extra.jar ~/user-libs/spark2/systemml-1.3.0-SNAPSHOT-extra.jar


!ln -s -f /home/spark/shared/user-libs/python3/systemml/systemml-java/systemml-1.3.0-SNAPSHOT.jar ~/user-libs/spark2/systemml-1.3.0-SNAPSHOT.jar
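If you prefer to do step 2 from Python rather than with shell escapes, the two ln -s -f commands can be approximated as below. This is a sketch under assumptions: force_symlink and link_jars are hypothetical helper names, the destination directory must already exist, and the directories in the commented call are simply the ones used in the commands above.

```python
import os


def force_symlink(src, dest):
    """Create dest as a symlink to src, replacing an existing file or link."""
    if os.path.lexists(dest):
        os.remove(dest)
    os.symlink(src, dest)


def link_jars(src_dir, dest_dir, jar_names):
    """Link every named jar from src_dir into dest_dir and return the link paths."""
    links = []
    for name in jar_names:
        dest = os.path.join(dest_dir, name)
        force_symlink(os.path.join(src_dir, name), dest)
        links.append(dest)
    return links


# Example call mirroring the commands above (left commented out because the
# paths only exist inside the notebook environment):
# link_jars(
#     "/home/spark/shared/user-libs/python3/systemml/systemml-java",
#     os.path.expanduser("~/user-libs/spark2"),
#     ["systemml-1.3.0-SNAPSHOT-extra.jar", "systemml-1.3.0-SNAPSHOT.jar"],
# )
```

Removing an existing link before recreating it is what makes the call safe to re-run, which is the role of the -f flag in the shell version.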


score 1 · Accepted Answer

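This code works with the Python 2.7 kernel, but not with the Python 3.5 kernel. The commit https://github.com/apache/systemml/commit/9e7ee19a45102f7cbb37507da25b1ba0641868fd fixes the issue for Python 3.5. If you want to fix an older release in your local environment, follow these two steps: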

A. Fix the indentation requirements for Python 3.5:

pip install autopep8
find /<location>/systemml/ -name '*.py' | xargs autopep8 --in-place --aggressive
find /<location>/systemml/mllearn/ -name '*.py' | xargs autopep8 --in-place --aggressive

You can find <location> using pip show systemml.
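As an alternative to reading the pip show output, the package directory can be looked up from Python without importing the package (which matters here, since the import is what fails). A minimal sketch; package_dir is a hypothetical helper name:

```python
import importlib.util
import os


def package_dir(name):
    """Return the directory of a top-level package without importing it, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        # Not installed, or a plain module rather than a package.
        return None
    return os.path.abspath(list(spec.submodule_search_locations)[0])


# Prints the package directory, or None if systemml is not installed.
print(package_dir("systemml"))
```

Note that this returns the package directory itself, which corresponds to /<location>/systemml/ in the commands above, whereas the Location field of pip show is its parent directory.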

B. Fix the stricter Python 3.5 syntax: in mllearn/estimator.py, replace the line

from .keras2caffe import *

with

import keras
from .keras2caffe import convertKerasToCaffeNetwork, convertKerasToCaffeSolver, convertKerasToSystemMLModel
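The error text in the question suggests that the wildcard import sat inside a function or class body, which Python 3 rejects at compile time. The sketch below reproduces that rule with the standard library only; compiles is a hypothetical helper, and os.path stands in for the keras2caffe module:

```python
def compiles(source):
    """Return None if source compiles, else the SyntaxError message."""
    try:
        compile(source, "<example>", "exec")
        return None
    except SyntaxError as exc:
        return exc.msg


# A wildcard import nested in a function is a compile-time error in Python 3.
wildcard_in_function = "def load():\n    from os.path import *\n"
# Naming the imported objects explicitly is allowed at any nesting level.
explicit_in_function = "def load():\n    from os.path import join, basename\n"

print(compiles(wildcard_in_function))  # import * only allowed at module level
print(compiles(explicit_in_function))  # None
```

Because the check happens when the file is compiled, the whole module fails to load, which is why even from systemml import mllearn raised the error.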

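Since the fix has already been delivered, you will have to wait for the next release, i.e. 1.3.0. Alternatively, you can build and install the latest version: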

git clone https://github.com/apache/systemml.git
cd systemml
mvn package -P distribution
pip install target/systemml-1.3.0-SNAPSHOT-python.tar.gz

Thanks,

Niketan.
