Hi, I want to use a simple AdaBoostClassifier on the mushroom dataset, which looks like this:

target  cap-shape  cap-surface  cap-color  bruises  odor  \
3059       0          2            3          2        1     5   
1953       0          5            0          3        1     5   
1246       0          2            2          3        0     5   
5373       1          5            2          8        1     2   
413        0          5            3          9        1     3   

...

I am using:

from sklearn.ensemble import AdaBoostClassifier
from sklearn.preprocessing import LabelEncoder
import pandas as pd

dataset = pd.read_csv('data\mushroom.csv',header=None)
dataset = dataset.sample(frac=1)
dataset.columns = ['target','cap-shape','cap-surface','cap-color','bruises','odor','gill-attachment','gill-spacing',
             'gill-size','gill-color','stalk-shape','stalk-root','stalk-surface-above-ring','stalk-surface-below-ring','stalk-color-above-ring',
             'stalk-color-below-ring','veil-type','veil-color','ring-number','ring-type','spore-print-color','population',
             'habitat']

for label in dataset.columns:
    dataset[label] = LabelEncoder().fit(dataset[label]).transform(dataset[label])


X = dataset.drop(['target'],axis=1)
Y = dataset['target']


AdaBoost = AdaBoostClassifier(base_estimator='DecisionTreeClassifier',n_estimators=400,learning_rate=0.01,algorithm='SAMME')

AdaBoost.fit(X,Y)

prediction = AdaBoost.score(Y)

print(prediction)

But this gives me:

---> 15 AdaBoost.fit(X,Y)

AttributeError: 'str' object has no attribute 'fit'

2 Answers

Following up on my comment on 2Obe's answer, I found the correct way to specify the parameter:

AdaBoostClassifier(base_estimator=DecisionTreeClassifier(),n_estimators=400,learning_rate=0.01,algorithm='SAMME')

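The value should be an estimator instance created by calling the constructor, not a string. DecisionTreeClassifier also has to be imported from sklearn.tree.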
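As a rough illustration of the difference, here is a minimal sketch that tries both forms. It makes a few assumptions that are not in the original post: the data is a synthetic stand-in generated with make_classification rather than the mushroom CSV, the base estimator is passed positionally because the keyword is base_estimator in older scikit-learn releases and estimator in newer ones, and the algorithm argument is left out because newer releases have been deprecating it. The exact exception type for the string case may differ between versions, so several are caught.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the label-encoded mushroom table (assumption, not the real data).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Passing the class name as a string: the failure only shows up when fit() is called.
try:
    AdaBoostClassifier("DecisionTreeClassifier", n_estimators=10).fit(X, y)
    string_failed = False
except (AttributeError, TypeError, ValueError) as exc:
    string_failed = True
    print(type(exc).__name__, exc)

# Passing an estimator instance: AdaBoost clones it for every boosting round.
model = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),
    n_estimators=50,
    learning_rate=0.5,
)
model.fit(X, y)

# score() needs both the features and the labels; this is accuracy on the training data.
train_accuracy = model.score(X, y)
print(train_accuracy)
```

Note that score takes both X and y, so the score(Y) call in the question would still need to become score(X, Y) once the estimator is fixed.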

Answered 2018-07-11T13:53:15.910

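I found the problem. I had set base_estimator to 'DecisionTreeClassifier', which is a string and has no fit() method. AdaBoost expects an estimator object there, not a string. Leaving the parameter out makes AdaBoostClassifier fall back to its default base estimator, a DecisionTreeClassifier with max_depth=1: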

from sklearn.ensemble import AdaBoostClassifier
from sklearn.preprocessing import LabelEncoder

for label in dataset.columns:
    dataset[label] = LabelEncoder().fit(dataset[label]).transform(dataset[label])

X = dataset.drop(['target'],axis=1)
Y = dataset['target']


AdaBoost = AdaBoostClassifier(n_estimators=400,learning_rate=0.01,algorithm='SAMME')

AdaBoost.fit(X,Y)

prediction = AdaBoost.score(X,Y)

print(prediction)

0.9182668636139832

Answered 2018-05-30T13:49:13.080