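I get an input shape error when using train_test_split on the iris dataset, and I don't understand why. I have tested other datasets, and train_test_split should be able to handle this shape. Any suggestions? Thanks.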

# Decision Tree Classifier
from sklearn import datasets
from sklearn import metrics
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
# load the iris datasets
iris = datasets.load_iris()
#print(iris.data)
#print(dataset)
# fit a CART model to the data
model = LogisticRegression()
from sklearn.utils import shuffle
import numpy as np
#print(type(dataset.data))

#Xtrain = dataset.data[:int(0.8*len(dataset.data))]
#Ytrain = dataset.target[:int(0.8*len(dataset.data))]
#Xtest = dataset.data[int(0.8*len(dataset.data)):]
#Ytest = dataset.target[int(0.8*len(dataset.data)):]
Xtrain, Ytrain, Xtest, Ytest = train_test_split(iris.data.astype(np.float64), iris.target.astype(np.float64), test_size=0.4, train_size=0.6)
model.fit(Xtrain,Ytrain)
#print(model)
# make predictions
expected = Ytest
predicted = model.predict(Xtest)
# summarize the fit of the model
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))
1 Answer

Change the following line in your code and it should work:

Xtrain, Ytrain, Xtest, Ytest = train_test_split(iris.data.astype(np.float64), iris.target.astype(np.float64), test_size=0.4, train_size=0.6)

to:

Xtrain, Xtest, Ytrain, Ytest = train_test_split(iris.data.astype(np.float64), iris.target.astype(np.float64), test_size=0.4, train_size=0.6)

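You were unpacking the return values in the wrong order. train_test_split returns the train and test parts of each input array in turn (X train, X test, y train, y test), so the original line assigned the test features to Ytrain and the training labels to Xtest. See: http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html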
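
To see the return order for yourself, here is a minimal sketch that prints the shape of each returned piece. The variable names and random_state=0 are my own choices for illustration (the latter only makes the split reproducible); they are not part of the question's code.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()

# train_test_split returns, for each input array, its train part followed
# by its test part: X_train, X_test, y_train, y_test.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.4, random_state=0
)

# Features stay 2-D (samples x 4 features); labels stay 1-D.
print(X_train.shape, X_test.shape)
print(y_train.shape, y_test.shape)
```

With the original unpacking order, the name Ytrain would be bound to the 2-D test features, so model.fit would receive 90 feature rows alongside a 60-row "label" array, which is why it complains about the input shape.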

Answered 2017-03-04T18:41:43.613