python-3.x - Using Google ML Engine prediction for a scikit-learn model that requires additional modules

Question

I defined my pipeline in a separate file, model.py:

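from lightgbm import LGBMClassifier
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline

class TextSelector(BaseEstimator, TransformerMixin):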
    def __init__(self, field):
        self.field = field
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        return X[self.field]

class NumberSelector(BaseEstimator, TransformerMixin):
    def __init__(self, field):
        self.field = field
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        return X[[self.field]]

text_features = Pipeline([
    ('selector', TextSelector(field='text')),
    ('vectorizer', TfidfVectorizer(min_df=5, max_df=0.25, ngram_range=(1, 1))),
    ('decomposer', TruncatedSVD(n_components=300))
])

features = FeatureUnion([
    ('text_features', text_features),
    ('other_feature', NumberSelector(field='other')),
])

pipeline = Pipeline([
    ('features', features),
    ('lgbm', LGBMClassifier(max_depth=-1, n_estimators=300,
                            learning_rate=0.1, n_jobs=2,
                            class_weight='balanced'))
])

Training and dumping the model:

import joblib

from model import pipeline

clf = pipeline.fit(X, y)
joblib.dump(clf, 'model.joblib')

To load the model, the script needs access to model.py. Where should I put this file when using Google ML Engine?

I tried

gcloud ml-engine local predict --model-dir=/path/to/models  --json-instances=input.json --framework=SCIKIT_LEARN

with model.py inside the path/to/models directory.

Error:

cloud.ml.prediction.prediction_utils.PredictionError: Failed to load model: Could not load the model: /path/to/the/model/model.joblib. No module named 'model'. (Error code: 0)
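
For context, an error of this kind is what pickle-based serializers (joblib builds on pickle) typically raise: the dump stores only a reference to each custom class by module and name, not the class body, so the loading process must be able to import that module itself. The sketch below reproduces the effect with the standard library alone. The module name fake_model is a hypothetical stand-in for model.py, created in memory purely for illustration.

```python
import pickle
import sys
import types

MODULE_NAME = "fake_model"  # hypothetical stand-in for model.py

# Create an in-memory module that defines a class, as model.py does.
source = (
    "class TextSelector:\n"
    "    def __init__(self, field):\n"
    "        self.field = field\n"
)
module = types.ModuleType(MODULE_NAME)
exec(source, module.__dict__)
sys.modules[MODULE_NAME] = module

# Pickling records the module and class name, not the class definition.
payload = pickle.dumps(module.TextSelector("text"))
stores_reference = MODULE_NAME.encode() in payload

# Simulate a loader that cannot import the module.
del sys.modules[MODULE_NAME]
try:
    pickle.loads(payload)
    load_error = None
except ModuleNotFoundError as exc:
    load_error = str(exc)

# Once the module is importable again, loading succeeds.
sys.modules[MODULE_NAME] = module
restored = pickle.loads(payload)
```

Placing model.py next to model.joblib does not help by itself if that directory is not on the import path of the process that unpickles the file.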

Another question: is it possible to use lightgbm in ML Engine prediction?

score 0 · Accepted Answer

Google recently launched "custom prediction routines", which let you load additional packages or call your own methods at prediction time: https://cloud.google.com/ml-engine/docs/tensorflow/custom-prediction-routines
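
As a rough illustration, a predictor for the pipeline in the question might look like the sketch below. It assumes the interface described in the linked documentation (a class with a predict method and a from_path class method). The class name PipelinePredictor and the file name predictor.py are hypothetical, and the input format (a list of dicts with 'text' and 'other' keys) is an assumption about input.json.

```python
# predictor.py (hypothetical file name)
import os


class PipelinePredictor(object):
    """Wraps the scikit-learn pipeline saved as model.joblib."""

    def __init__(self, model):
        self._model = model

    def predict(self, instances, **kwargs):
        # Assumption: each instance is a dict with 'text' and 'other' keys,
        # matching the columns the selectors in model.py expect.
        import pandas as pd  # imported lazily; needed only at prediction time
        frame = pd.DataFrame(instances)
        return self._model.predict(frame).tolist()

    @classmethod
    def from_path(cls, model_dir):
        # Unpickling imports the 'model' module by name, so model.py has to
        # be packaged together with this file to be importable here.
        import joblib
        return cls(joblib.load(os.path.join(model_dir, "model.joblib")))
```

The idea would be to package predictor.py and model.py together and declare lightgbm as a dependency of that package, so both the custom transformers and the classifier are importable when the model is loaded. The exact packaging and deployment steps are covered in the linked documentation.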
