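I have been struggling to build a neural network with PyBrain, and for some reason backpropagation fails to train it. With any dataset that has more than two classes in the output dimension, every observation ends up assigned to a single class. Does anyone know why this happens? The code and some output are below.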

import scipy
import numpy
from pybrain.datasets            import ClassificationDataSet
from pybrain.utilities           import percentError
from pybrain.tools.shortcuts     import buildNetwork
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.structure.modules   import SoftmaxLayer
from sklearn.metrics             import precision_score,recall_score,confusion_matrix
def makeDataset(CSVfile,ClassFile):
    #import the features to data, and their classes to dataClasses
    data=numpy.genfromtxt(CSVfile,delimiter=",")
    classes=numpy.genfromtxt(ClassFile,delimiter=",")
    print("Building the dataset from CSV files")
    #Initialize an empty Pybrain dataset, and populate it
    alldata=ClassificationDataSet(len(data[0]),1,nb_classes=3)
    for count in range(len((classes))):
        alldata.addSample(data[count],[classes[count]])
    return alldata



def makeNeuralNet(alldata,trainingPercent=.3,hiddenNeurons=5,trainingIterations=20):
    #Divide the data set into training and test data; splitWithProportion
    #returns the part holding the given proportion first
    trainData, testData = alldata.splitWithProportion(trainingPercent)
    testData._convertToOneOfMany( )
    trainData._convertToOneOfMany( )
    #Then build the network and train it with backpropagation
    network = buildNetwork( trainData.indim, hiddenNeurons, trainData.outdim, outclass=SoftmaxLayer )
    trainer = BackpropTrainer( network, dataset=trainData, momentum=0.1, verbose=True, weightdecay=0.01)
    for i in range(trainingIterations):
        print("Training Epoch #"+str(i))
        trainer.trainEpochs( 1 )
    return [network,trainer]



def checkNeuralNet(trainer,alldata):
    predictedVals=trainer.testOnClassData(alldata)
    actualVals=list(alldata['target'])
##    for row in alldata['target']:
##        row=list(row)
##        index=row.index(1)
##        actualVals+=[index]
    print("-----------------------------")
    print("-----------------------------")
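    #State the multiclass averaging mode explicitly instead of relying on the default
    print("The precision is "+str(precision_score(actualVals,predictedVals,average="weighted")))
    print("The recall is "+str(recall_score(actualVals,predictedVals,average="weighted")))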
    print("The confusion matrix is as shown below:")
    print(confusion_matrix(actualVals,predictedVals))


CSVfile="/home/ubuntu/test.csv"
ClassFile="/home/ubuntu/test_Classes.csv"
#Build our dataset
alldata=makeDataset(CSVfile,ClassFile)
#Build and train the network
net=makeNeuralNet(alldata,trainingPercent=.7,hiddenNeurons=20,trainingIterations=20)
network=net[0]
trainer=net[1]
#Check its strength
checkNeuralNet(trainer,alldata)

The last training epoch reports an error of about 0.097, as the output below shows:

Training Epoch #19
Total error: 0.0968444196605

However, when I print the confusion matrix, precision, and recall, I get the following output, including this odd warning:

UserWarning: The sum of true positives and false positives are equal to zero for some labels. Precision is ill defined for those labels [1 2]. The precision and recall are equal to zero for some labels. fbeta_score is ill defined for those labels [1 2]. 
  average=average)
The precision is 0.316635552252
UserWarning: The sum of true positives and false positives are equal to zero for some labels. Precision is ill defined for those labels [1 2]. The precision and recall are equal to zero for some labels. fbeta_score is ill defined for those labels [1 2]. 
  average=average)
The recall is 0.562703787309
The confusion matrix is as shown below:
[[4487    0    0]
 [ 987    0    0]
 [2500    0    0]]
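
For readers trying to interpret these numbers, here is a minimal sketch in plain Python (no PyBrain or scikit-learn needed) that recomputes precision and recall from the confusion matrix above. It assumes rows are actual classes and columns are predicted classes, and that the reported scores are support-weighted averages over the three classes; the helper name `safe_div` is made up for illustration.

```python
# Confusion matrix copied from the output above
# (assumed layout: rows = actual class, columns = predicted class).
confusion = [
    [4487, 0, 0],
    [987, 0, 0],
    [2500, 0, 0],
]

n_classes = len(confusion)
total = sum(sum(row) for row in confusion)

# Number of true samples per class (row sums) and
# number of predictions per class (column sums).
support = [sum(row) for row in confusion]
predicted = [sum(confusion[r][c] for r in range(n_classes)) for c in range(n_classes)]


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero (hypothetical helper)."""
    return num / den if den else 0.0


# Per-class precision and recall; classes that are never predicted get 0.0.
precision = [safe_div(confusion[k][k], predicted[k]) for k in range(n_classes)]
recall = [safe_div(confusion[k][k], support[k]) for k in range(n_classes)]

# Support-weighted averages over all classes.
weighted_precision = sum(s * p for s, p in zip(support, precision)) / total
weighted_recall = sum(s * r for s, r in zip(support, recall)) / total

print("predictions per class:", predicted)
print("weighted precision:", weighted_precision)
print("weighted recall:", weighted_recall)
```

The column sums show that classes 1 and 2 are never predicted, which is what the warning about labels [1 2] is complaining about. The weighted averages come out at roughly 0.3166 and 0.5627, in line with the printed precision and recall, so those scores carry no information beyond "everything was assigned to class 0".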

1 Answer

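I had a very similar problem and found that SoftmaxLayer was the cause. Try replacing it with something else, such as SigmoidLayer. If that turns out to be the problem in your case too, the class is very likely buggy.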
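
A minimal sketch of that swap, assuming PyBrain is installed. The function name `build_network` and the `use_softmax` flag are made up for illustration; `buildNetwork`, `SigmoidLayer`, and `SoftmaxLayer` are the same PyBrain names the question already imports or mentions. Whether this fixes the single-class predictions will depend on the data, so treat it as an experiment rather than a guaranteed cure.

```python
def build_network(n_inputs, n_hidden, n_classes, use_softmax=False):
    """Build a feed-forward classifier with a selectable output layer.

    Hypothetical helper: the name and the use_softmax flag are not part
    of PyBrain, only the calls inside are.
    """
    # Imported lazily so this file still loads where PyBrain is not installed.
    from pybrain.tools.shortcuts import buildNetwork
    from pybrain.structure.modules import SigmoidLayer, SoftmaxLayer

    # Pick the output layer class; SigmoidLayer is the suggested replacement.
    out_layer = SoftmaxLayer if use_softmax else SigmoidLayer
    return buildNetwork(n_inputs, n_hidden, n_classes, outclass=out_layer)
```

In the question's makeNeuralNet this corresponds to passing outclass=SigmoidLayer instead of outclass=SoftmaxLayer to buildNetwork, for example `network = build_network(trainData.indim, hiddenNeurons, trainData.outdim)`. The rest of the training code can stay as it is, since the predicted class is still taken as the index of the largest output.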

Answered 2014-08-12T08:13:18.263