
I searched for this error on Stack Overflow and found several posts, but none of them addressed this specific situation.

I have the following dataframe:

df

The input and output variables are defined in this code:

xcol=["h","o","p","d","ddlt","devdlt","sl","lt"]
ycol=["Q","r"]
x=df[xcol].values
y=df[ycol].values

My goal is to predict the output values Q & r from the inputs (x). I tried two approaches, and both failed. For the first one, I tried a multi-output regressor.

I first split the data into training and test sets:

import numpy as np
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
y_train = y_train.ravel()
y_test = y_test.ravel()

Then I imported the function:

from sklearn.multioutput import MultiOutputRegressor

Then I tried to predict Q & r:

reg= MultiOutputRegressor(estimator=100, n_jobs=None)
reg=reg.predict(X_train, y_train)

This gave me the error:

TypeError: predict() takes 2 positional arguments but 3 were given

What am I doing wrong, and how can I fix it?

The next thing I tried was a neural network. After assigning the x and y columns, I built the neural network:

# neural network class definition
import numpy
import scipy.special

class neuralNetwork:

    # Step 1: initialise the network
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):
        # set number of nodes in each input, hidden, output layer
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes

        # link weight matrices, wih and who (weights into hidden and output layers);
        # we create matrices whose multiplications produce the output.
        # weights inside the arrays (matrices) are w_i_j, where the link goes from node
        # i to node j in the next layer
        # w11 w21
        # w12 w22 etc
        self.wih = numpy.random.normal(0.0, pow(self.inodes, -0.5),
                                       (self.hnodes, self.inodes))
        self.who = numpy.random.normal(0.0, pow(self.hnodes, -0.5),
                                       (self.onodes, self.hnodes))

        # setting the learning rate
        self.lr = learningrate

        # activation function is the sigmoid function
        self.activation_function = lambda x: scipy.special.expit(x)

    # Step 2: train the network on one (inputs, targets) pair
    def train(self, inputs_list, targets_list):
        # convert input lists to 2d arrays (matrices)
        inputs = numpy.array(inputs_list, ndmin=2).T
        targets = numpy.array(targets_list, ndmin=2).T

        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)

        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)

        # output layer error is (target - actual)
        output_errors = targets - final_outputs
        # hidden layer error is the output_errors, split by weights,
        # recombined at hidden nodes
        hidden_errors = numpy.dot(self.who.T, output_errors)

        # update the weights for the links between the hidden and output layers
        self.who += self.lr * numpy.dot((output_errors * final_outputs * (1.0 - final_outputs)),
                                        numpy.transpose(hidden_outputs))

        # update the weights for the links between the input and hidden layers
        self.wih += self.lr * numpy.dot((hidden_errors * hidden_outputs * (1.0 - hidden_outputs)),
                                        numpy.transpose(inputs))

    # Step 3: query the network
    def query(self, inputs_list):
        # convert input list to a 2d array (matrix)
        inputs = numpy.array(inputs_list, ndmin=2).T

        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)

        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)

        return final_outputs

Then I created an instance of the neural network:

# creating an instance of the neural network

# number of input, hidden and output nodes
input_nodes = 8
hidden_nodes = 100
output_nodes = 2

# learning rate is 0.8
learning_rate = 0.8

# create instance of neural network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)

I have 8 inputs and 2 outputs that need to be predicted.

Then I trained the neural network:

# train the neural network
# go through all records in the training data set 
for record in df:
    #scale and shift the inputs
    inputs = x
    #create the target output values 
    targets = y
    n.train(inputs, targets)
    pass

Then I wanted to query the predicted outputs, and this is where it goes wrong:

I want to add 2 extra columns to the dataframe with the guesses for Q (Q*) & r (r*):

df["Q*","r*"] = n.query(x)

I don't really know how to do this correctly. The code above gives me the error:

ValueError: Length of values does not match length of index

Any help is appreciated.

Steven


1 Answer


Regarding the first part of your question (MultiOutputRegressor), there are several issues with your code...

First, the estimator argument of MultiOutputRegressor should not be a number; as the documentation says:

estimator : estimator object

An estimator object implementing fit and predict.

So, to use for example a random forest with default parameters, you should use

from sklearn.ensemble import RandomForestRegressor

reg = MultiOutputRegressor(RandomForestRegressor())

(see this answer for more examples)

Second, in your code you never fit your regressor; you should add

reg.fit(X_train, y_train)

after its definition.

Third, predict does not take the ground truth (y_train here) as an argument, only the features (X_train); again from the docs:

predict(X)

Predict multi-output variable using a model trained for each target variable.

Parameters: X : (sparse) array-like, shape (n_samples, n_features)

Data.

Returns: y : (sparse) array-like, shape (n_samples, n_outputs)

Multi-output targets predicted across multiple predictors. Note: Separate models are generated for each predictor.

Since you additionally pass y_train in your code, you get the expected error about too many arguments; simply change it to reg.predict(X_train) and you will be fine.
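Putting the three points together, a minimal sketch of the corrected workflow could look like the following. It reuses the x / y arrays and the 80/20 split from the question; note that y is kept 2-D (one column each for Q and r), so the ravel() calls are not needed in the multi-output case:

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor

# keep y as shape (n_samples, 2) for the two targets Q and r
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

# 1. pass an estimator object, not a number
reg = MultiOutputRegressor(RandomForestRegressor())

# 2. fit the regressor before predicting
reg.fit(X_train, y_train)

# 3. predict takes only the features; the result has shape (n_samples, 2)
y_pred = reg.predict(X_test)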

Answered 2018-11-27T21:46:15.073