17

Is it possible to tune the parameters of nested pipelines in scikit-learn? For example:

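from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

svm = Pipeline([
    ('chi2', SelectKBest(chi2)),
    # 'auto' was the spelling when this was asked; newer releases use 'balanced'
    ('cls', LinearSVC(class_weight='balanced'))
])

classifier = Pipeline([
    ('vectorizer', TfidfVectorizer()),
    ('ova_svm', OneVsRestClassifier(svm))
])

parameters = ?

GridSearchCV(classifier, parameters)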

If this cannot be done directly, what is a workaround?

2 Answers

28

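scikit-learn has a double-underscore notation for this. It works recursively and extends to OneVsRestClassifier, with the caveat that the wrapped estimator has to be addressed explicitly as __estimator: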

parameters = {'ova_svm__estimator__cls__C': [1, 10, 100],
              'ova_svm__estimator__chi2__k': [200, 500, 1000]}
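
As a minimal end-to-end sketch of these keys in use: the toy corpus, the labels, and the small grid values below are made up for illustration, and the import paths assume a recent scikit-learn release where GridSearchCV lives in sklearn.model_selection.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus: three topics, four short documents each.
docs = [
    "the cat chased the mouse", "a cat sat on the mat",
    "my cat likes warm milk", "the kitten is a young cat",
    "the dog barked at the mailman", "a dog fetched the ball",
    "my dog likes long walks", "the puppy is a young dog",
    "the bird sang in the tree", "a bird built a nest",
    "my bird likes sunflower seeds", "the chick is a young bird",
]
labels = ["cat"] * 4 + ["dog"] * 4 + ["bird"] * 4

# Inner pipeline: feature selection followed by a linear SVM.
svm = Pipeline([
    ("chi2", SelectKBest(chi2)),
    ("cls", LinearSVC()),
])

# Outer pipeline: vectorizer, then one-vs-rest around the inner pipeline.
classifier = Pipeline([
    ("vectorizer", TfidfVectorizer()),
    ("ova_svm", OneVsRestClassifier(svm)),
])

# Each key walks the nesting: outer step -> estimator -> inner step -> parameter.
# The values are kept tiny so they fit the toy corpus.
parameters = {
    "ova_svm__estimator__cls__C": [1, 10],
    "ova_svm__estimator__chi2__k": [2, "all"],
}

search = GridSearchCV(classifier, parameters, cv=2)
search.fit(docs, labels)
print(search.best_params_)
```

On real data you would swap in larger values for k, such as the 200, 500, 1000 above, since k should not exceed the vocabulary size.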
answered 2013-05-08T09:38:45.893
21

For any estimator you have built, you can list its parameter names as follows.

import pprint as pp

pp.pprint(sorted(classifier.get_params().keys()))

['ova_svm',
 'ova_svm__estimator',
 'ova_svm__estimator__chi2',
 'ova_svm__estimator__chi2__k',
 'ova_svm__estimator__chi2__score_func',
 'ova_svm__estimator__cls',
 'ova_svm__estimator__cls__C',
 'ova_svm__estimator__cls__class_weight',
 'ova_svm__estimator__cls__dual',
 'ova_svm__estimator__cls__fit_intercept',
 'ova_svm__estimator__cls__intercept_scaling',
 'ova_svm__estimator__cls__loss',
 'ova_svm__estimator__cls__max_iter',
 'ova_svm__estimator__cls__multi_class',
 'ova_svm__estimator__cls__penalty',
 'ova_svm__estimator__cls__random_state',
 'ova_svm__estimator__cls__tol',
 'ova_svm__estimator__cls__verbose',
 'ova_svm__estimator__steps',
 'steps',
 'vectorizer',
 'vectorizer__analyzer',
 'vectorizer__binary',
 'vectorizer__decode_error',
 'vectorizer__dtype',
 'vectorizer__encoding',
 'vectorizer__input',
 'vectorizer__lowercase',
 'vectorizer__max_df',
 'vectorizer__max_features',
 'vectorizer__min_df',
 'vectorizer__ngram_range',
 'vectorizer__norm',
 'vectorizer__preprocessor',
 'vectorizer__smooth_idf',
 'vectorizer__stop_words',
 'vectorizer__strip_accents',
 'vectorizer__sublinear_tf',
 'vectorizer__token_pattern',
 'vectorizer__tokenizer',
 'vectorizer__use_idf',
 'vectorizer__vocabulary']

You can then pick from this list the parameters you want GridSearchCV to search over.
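
The same list can also be used to sanity-check a grid before running the search. A rough sketch, assuming the nested pipeline from the question; the candidate grid and its deliberately misspelled key are made up for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

svm = Pipeline([
    ("chi2", SelectKBest(chi2)),
    ("cls", LinearSVC()),
])
classifier = Pipeline([
    ("vectorizer", TfidfVectorizer()),
    ("ova_svm", OneVsRestClassifier(svm)),
])

# Every name that set_params (and therefore GridSearchCV) will accept.
valid = set(classifier.get_params().keys())

# Hypothetical grid; the second key uses a single underscore before "k" on purpose.
candidate = {
    "ova_svm__estimator__cls__C": [1, 10, 100],
    "ova_svm__estimator__chi2_k": [200, 500, 1000],
}

# Keys that do not correspond to any parameter of the nested estimator.
unknown = sorted(key for key in candidate if key not in valid)
print(unknown)  # ['ova_svm__estimator__chi2_k']
```

Checking up front like this is optional, but it catches a mistyped separator without having to wait for the search to start fitting.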

answered 2016-04-23T02:03:58.357