I get ValueError: Invalid parameter... for every line in the grid.

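I tried removing the grid options one line at a time until the grid was empty. I also copied and pasted the parameter names from pipeline.get_params() to make sure they were not misspelled.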

from sklearn.model_selection import train_test_split
x_in, x_out, y_in, y_out = train_test_split(X, Y, test_size=0.2, stratify=Y)

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, chi2, f_classif
from sklearn.svm import LinearSVC
from sklearn.model_selection import GridSearchCV

grid = {
    'TF-IDF__ngram_range':[(1,2),(2,3)],
    'TF-IDF__stop_words': [None, 'english'],
    'SelectKBest__k': [10000, 15000],
    'SelectKBest__score_func': [f_classif, chi2],
    'linearSVC__penalty': ['l1', 'l2']
}

pipeline = Pipeline([('tfidf', TfidfVectorizer(sublinear_tf=True)),
                     ('selectkbest', SelectKBest()),
                     ('linearscv', LinearSVC(max_iter=10000, dual=False))])

grid_search = GridSearchCV(pipeline, param_grid=grid, scoring='accuracy', n_jobs=-1, cv=5)
grid_search.fit(X=x_in, y=y_in)

1 Answer

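I think the grid is not referring to the pipeline stages by their correct names. Each grid key has the form <step name>__<parameter name>, so the prefix must exactly match the name you assigned to that stage in the pipeline (tfidf, selectkbest, linearscv), including case. I would do: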

pipeline = Pipeline([('tfidf', TfidfVectorizer(sublinear_tf=True)),
                     ('selectkbest', SelectKBest()),
                     ('linearscv', LinearSVC(max_iter=10000, dual=False))]) 
grid = {
    'tfidf__ngram_range':[(1,2),(2,3)],
    'tfidf__stop_words': [None, 'english'],
    'selectkbest__k': [10000, 15000],
    'selectkbest__score_func': [f_classif, chi2],
    'linearscv__penalty': ['l1', 'l2']
}
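
To catch this kind of mismatch before running the search, you can compare the grid keys against the names the pipeline actually exposes. The following is only a minimal sketch: invalid_grid_keys is a hypothetical helper written for illustration, not part of scikit-learn, and bad_grid and good_grid are just the two grids from this thread.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2, f_classif
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

pipeline = Pipeline([('tfidf', TfidfVectorizer(sublinear_tf=True)),
                     ('selectkbest', SelectKBest()),
                     ('linearscv', LinearSVC(max_iter=10000, dual=False))])


def invalid_grid_keys(estimator, grid):
    """Hypothetical helper: return the grid keys the estimator does not accept."""
    # get_params(deep=True) includes nested names such as 'tfidf__ngram_range'.
    valid = set(estimator.get_params(deep=True))
    return sorted(key for key in grid if key not in valid)


# Grid from the question: the prefixes do not match the step names.
bad_grid = {
    'TF-IDF__ngram_range': [(1, 2), (2, 3)],
    'TF-IDF__stop_words': [None, 'english'],
    'SelectKBest__k': [10000, 15000],
    'SelectKBest__score_func': [f_classif, chi2],
    'linearSVC__penalty': ['l1', 'l2'],
}

# Grid from this answer: the prefixes reuse the step names exactly.
good_grid = {
    'tfidf__ngram_range': [(1, 2), (2, 3)],
    'tfidf__stop_words': [None, 'english'],
    'selectkbest__k': [10000, 15000],
    'selectkbest__score_func': [f_classif, chi2],
    'linearscv__penalty': ['l1', 'l2'],
}

print("Rejected keys (question grid):", invalid_grid_keys(pipeline, bad_grid))
print("Rejected keys (answer grid):", invalid_grid_keys(pipeline, good_grid))
```

Any key reported for the first grid is one that GridSearchCV would reject with ValueError: Invalid parameter; the second grid should report none.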
answered 2019-09-15T19:38:47.510