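I am trying to understand how to search for hyperparameters, so I am trying to use scikit-optimize to find the best hyperparameters for my network. I am following this link and using this link as a reference to adapt my code. In my case, I have an image classification task with two classes, where the images have shape (128, 160, 3), and I want to find the hyperparameters using sparse categorical cross-entropy.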

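The reason for this choice is that I need to use these parameters for knowledge distillation. However, when I try to use this model, I get the following error: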

InvalidArgumentError: Graph execution error:

Detected at node 'sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits'

Node: 'sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits'
logits and labels must have the same first dimension, got logits shape [2621440,2] and labels shape [128]
     [[{{node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]] [Op:__inference_train_function_2374]

I would like to know what exactly is wrong with my network. Here is a minimal reproducible example:

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import InputLayer, Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import load_model
from tensorflow.keras.callbacks import EarlyStopping
import random 
import skopt
from skopt import gp_minimize, forest_minimize
from skopt.space import Real, Categorical, Integer
X_train = np.array(np.random.randint(0, 1, size=(7000, 128, 160, 3)))
X_test = np.array(np.random.randint(0, 1, size=(1000, 128, 160, 3)))
a = [0] * 3500
b = [1] * 3500
c = [0] * 500
d = [1] * 500
y_train = np.array((a+b))
y_test =  np.array((c+d))


dim_learning_rate = Real(low=1e-6, high=1e-1, prior='log-uniform', name='learning_rate')

dim_num_dense_layers = Integer(low=1, high=10, name='num_dense_layers')

dim_num_dense_nodes = Integer(low=5, high=512, name='num_dense_nodes')

dim_activation = Categorical(categories=['relu', 'sigmoid'], name='activation')

dimensions = [dim_learning_rate, dim_num_dense_layers, dim_num_dense_nodes, dim_activation]

default_parameters = [1e-5, 1, 16, 'relu']


def create_model(learning_rate, num_dense_layers, num_dense_nodes, activation):
    model = Sequential()
    model.add(InputLayer(input_shape=(128, 160, 3)))
    for i in range(num_dense_layers):
        name = 'layer_dense_{0}'.format(i+1)

        model.add(Dense(num_dense_nodes, activation=activation, name=name))

    model.add(Dense(2, activation='softmax'))
    optimizer = Adam(learning_rate=learning_rate)  # 'lr' is a deprecated alias of 'learning_rate'
    model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    
    return model



validation_data = (X_test, y_test)


from skopt.utils import use_named_args
@use_named_args(dimensions=dimensions)
def fitness(learning_rate, num_dense_layers, num_dense_nodes, activation):
    
    model = create_model(learning_rate=learning_rate, num_dense_layers=num_dense_layers, num_dense_nodes=num_dense_nodes, activation=activation)
    
    history = model.fit(x= X_train, y= y_train, epochs=3, batch_size=128, validation_data=validation_data)

    accuracy = history.history['val_accuracy'][-1]

    print()
    print("Accuracy: {0:.2%}".format(accuracy))
    print()
    return -accuracy


fitness(x = default_parameters)

Alternatively, as a test, I tried using categorical_crossentropy as the loss function and got the following:

ValueError: Shapes (None,) and (None, 128, 160, 2) are incompatible

I would like to know why my input is rejected by both loss functions, and what changes I need to make to train with sparse_categorical_crossentropy and with categorical_crossentropy.

1 Answer

You have to reshape your columns and then set up the test split.
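As a possible reading of the reshaping step, and not a confirmed fix from the original poster: a Keras Dense layer only transforms the last axis of its input, so with an input of shape (128, 160, 3) the final Dense(2) layer outputs (batch, 128, 160, 2). With batch_size=128 that gives 128 * 128 * 160 = 2621440 rows of logits against 128 labels, which matches the first error, and the (None, 128, 160, 2) in the second error has the same cause. The sketch below assumes that adding a Flatten layer before the Dense stack is acceptable for this task. The loss argument of create_model, the tiny random dataset, and the fixed hyperparameter values are illustrative assumptions, not part of the question.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical


def create_model(learning_rate, num_dense_layers, num_dense_nodes, activation,
                 loss='sparse_categorical_crossentropy'):
    model = Sequential()
    model.add(keras.Input(shape=(128, 160, 3)))
    # Collapse each image to a vector of 128 * 160 * 3 values so that the
    # Dense layers return one prediction per image, not one per pixel.
    model.add(Flatten())
    for i in range(num_dense_layers):
        name = 'layer_dense_{0}'.format(i + 1)
        model.add(Dense(num_dense_nodes, activation=activation, name=name))
    model.add(Dense(2, activation='softmax'))
    model.compile(optimizer=Adam(learning_rate=learning_rate),
                  loss=loss, metrics=['accuracy'])
    return model


# Tiny random stand-in data (assumption, only to exercise the shapes).
X_small = np.random.rand(8, 128, 160, 3).astype('float32')
y_small = np.array([0, 1] * 4)

# Sparse loss: integer labels of shape (batch,) against output (batch, 2).
model = create_model(1e-3, 1, 16, 'relu')
print(model.output_shape)
model.fit(X_small, y_small, epochs=1, batch_size=4, verbose=0)

# Non-sparse loss: labels must be one-hot encoded to shape (batch, 2).
y_onehot = to_categorical(y_small, num_classes=2)
model_cat = create_model(1e-3, 1, 16, 'relu', loss='categorical_crossentropy')
model_cat.fit(X_small, y_onehot, epochs=1, batch_size=4, verbose=0)
```

The fitness function from the question can call this create_model unchanged, since the added loss parameter has a default. If spatial structure in the images matters, convolutional layers before the Flatten would be a different design choice that this sketch does not cover.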

Answered 2022-03-05T17:34:23.780