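I have a dataset with 95 rows and 9 columns and want to do 5-fold cross-validation. In training, the first 8 columns (the features) are used to predict the 9th column. My test sets are correct, but my x training set has shape (4,19,9) when it should have only 8 columns, and my y training set has shape (4,9) when it should have 19 rows. Am I indexing the subarrays incorrectly?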
kdata = data[0:95,:] # Need total rows to be divisible by 5, so ignore last 2 rows
np.random.shuffle(kdata) # Shuffle all rows
folds = np.array_split(kdata, k) # each fold is 19 rows x 9 columns
for i in range(k-1):
    xtest = folds[i][:,0:7] # Set ith fold to be test
    ytest = folds[i][:,8]
    new_folds = np.delete(folds,i,0)
    xtrain = new_folds[:][:][0:7] # training set is all folds, all rows x 8 cols
    ytrain = new_folds[:][:][8] # training y is all folds, all rows x 1 col
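
For comparison, here is a minimal sketch of one way to build the training arrays; it is not necessarily the only fix. It assumes `data` is a 2-D NumPy array with 97 rows and 9 columns (a random stand-in is generated here, since the real data is not shown) and that `k = 5`. Two things differ from the code above. First, chained indexing such as `new_folds[:][:][0:7]` is equivalent to `new_folds[0:7]`, because each `[:]` just returns the whole array again, so the final slice lands on the fold axis rather than on the columns. Second, `0:7` selects only columns 0 through 6, so `0:8` is needed to get 8 features. The sketch also loops over `range(k)` so that every fold is used as the test set once, and it stacks the remaining folds with `np.concatenate` instead of `np.delete`.

```python
import numpy as np

# Hypothetical stand-in for the original `data` array (assumed 97 rows x 9 columns).
rng = np.random.default_rng(0)
data = rng.normal(size=(97, 9))

k = 5
kdata = data[0:95, :].copy()      # copy so the shuffle does not reorder `data` in place
np.random.shuffle(kdata)          # shuffle rows
folds = np.array_split(kdata, k)  # list of 5 arrays, each 19 rows x 9 columns

shapes = []
for i in range(k):                # range(k) visits all 5 folds
    test = folds[i]
    xtest, ytest = test[:, 0:8], test[:, 8]

    # Stack the other 4 folds row-wise into one 2-D array (76 rows x 9 columns).
    train = np.concatenate(folds[:i] + folds[i + 1:], axis=0)
    xtrain, ytrain = train[:, 0:8], train[:, 8]

    shapes.append((xtrain.shape, ytrain.shape, xtest.shape, ytest.shape))
```

With four folds of 19 rows stacked together, `xtrain` comes out as (76, 8) and `ytrain` as (76,), while each test fold is (19, 8) and (19,).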