
I'm working with deep learning concepts but I'm a beginner. I'm trying to build a feature-fusion setup with 3 deep neural network models: the idea is to take the features from all three models, merge them at a final sigmoid layer, and then get the result. This is the code I ran.

Code:

from keras.layers import Input, Dense
from keras.models import Model
from sklearn.model_selection import train_test_split
import numpy
# random seed for reproducibility
numpy.random.seed(2)
# load pima indians diabetes dataset, past 5 years of medical history
dataset = numpy.loadtxt('https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv', delimiter=",")
# split into input (X) and output (Y) variables, splitting csv data
X = dataset[:, 0:8]
Y = dataset[:, 8]
x_train, x_validation, y_train, y_validation = train_test_split(X, Y, test_size=0.20, random_state=5)
#create the input layer
input_layer = Input(shape=(8,))
A2 = Dense(8, activation='relu')(input_layer)
A3 = Dense(30, activation='relu')(A2)
B2 = Dense(40, activation='relu')(A2)
B3 = Dense(30, activation='relu')(B2)
C2 = Dense(50, activation='relu')(B2)
C3 = Dense(5, activation='relu')(C2)
merged = Model(inputs=[input_layer],outputs=[A3,B3,C3])
final_model = Dense(1, activation='sigmoid')(merged)
final_model.compile(loss="binary_crossentropy",
              optimizer="adam", metrics=['accuracy'])
# call the function to fit to the data (training the network)
final_model.fit(x_train, y_train, epochs=2000, batch_size=50,
          validation_data=(x_validation, y_validation))
# evaluate the model
scores = final_model.evaluate(x_validation,y_validation)
print("\n%s: %.2f%%" % (final_model.metrics_names[1], scores[1] * 100))

This is the error I'm facing:

if x.shape.ndims is None:

AttributeError: 'Functional' object has no attribute 'shape'

Please help me solve this, or if anyone knows what code I should use instead, let me know. I'm willing to change the code, just not the concept. Thanks.


Update

Following @M.Innat's answer, we tried the approach below. The idea is that we first build the 3 models, then build a final/combined model by connecting those models to a single classifier. But I'm seeing a discrepancy: when I train each model individually they reach about 90%, but when I combine them they barely reach 60 or 70.

Code for model 1:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
# input layer requires input_dim param
model.add(Dense(10, input_dim=8, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(5, activation='relu'))
# sigmoid instead of relu for final probability between 0 and 1
model.add(Dense(1, activation='sigmoid'))

# compile the model, adam gradient descent (optimized)
model.compile(loss="binary_crossentropy",
              optimizer="adam", metrics=['accuracy'])

# call the function to fit to the data (training the network)
model.fit(x_train, y_train, epochs=1000, batch_size=50,
          validation_data=(x_validation, y_validation))

# evaluate the model
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1] * 100))
model.save('diabetes_risk_nn.h5')

Model 1 accuracy = 94.14%. Similar for the other 2 models:

Model 2 accuracy = 93.62%; model 3 accuracy = 92.71%.

Next, we merged the models as @M.Innat suggested, using models 1, 2, and 3 above. But the score is nowhere near ~90%. Final combined model:

# Define Model A 
input_layer = Input(shape=(8,))
A2 = Dense(10, activation='relu')(input_layer)
A3 = Dense(50, activation='relu')(A2)
A4 = Dense(50, activation='relu')(A3)
A5 = Dense(50, activation='relu')(A4)
A6 = Dense(50, activation='relu')(A5)
A7 = Dense(50, activation='relu')(A6)
A8 = Dense(5, activation='relu')(A7)
model_a = Model(inputs=input_layer, outputs=A8, name="ModelA")

# Define Model B 
input_layer = Input(shape=(8,))
B2 = Dense(10, activation='relu')(input_layer)
B3 = Dense(50, activation='relu')(B2)
B4 = Dense(40, activation='relu')(B3)
B5 = Dense(60, activation='relu')(B4)
B6 = Dense(30, activation='relu')(B5)
B7 = Dense(50, activation='relu')(B6)
B8 = Dense(50, activation='relu')(B7)
B9 = Dense(5, activation='relu')(B8)
model_b = Model(inputs=input_layer, outputs=B9, name="ModelB")

# Define Model C
input_layer = Input(shape=(8,))
C2 = Dense(10, activation='relu')(input_layer)
C3 = Dense(50, activation='relu')(C2)
C4 = Dense(40, activation='relu')(C3)
C5 = Dense(40, activation='relu')(C4)
C6 = Dense(70, activation='relu')(C5)
C7 = Dense(50, activation='relu')(C6)
C8 = Dense(50, activation='relu')(C7)
C9 = Dense(60, activation='relu')(C8)
C10 = Dense(50, activation='relu')(C9)
C11 = Dense(5, activation='relu')(C10)
model_c = Model(inputs=input_layer, outputs=C11, name="ModelC")
all_three_models = [model_a, model_b, model_c]
all_three_models_input = Input(shape=all_three_models[0].input_shape[1:])

Then combine the three:

models_output = [model(all_three_models_input) for model in all_three_models]
Concat           = tf.keras.layers.concatenate(models_output, name="Concatenate")
final_out     = Dense(1, activation='sigmoid')(Concat)
final_model   = Model(inputs=all_three_models_input, outputs=final_out, name='Ensemble')
#tf.keras.utils.plot_model(final_model, expand_nested=True)
final_model.compile(loss="binary_crossentropy",
              optimizer="adam", metrics=['accuracy'])
# call the function to fit to the data (training the network)
final_model.fit(x_train, y_train, epochs=1000, batch_size=50,
          validation_data=(x_validation, y_validation))

# evaluate the model

scores = final_model.evaluate(x_validation,y_validation)
print("\n%s: %.2f%%" % (final_model.metrics_names[1], scores[1] * 100))
final_model.save('diabetes_risk_nn.h5')

But unlike the individual models, which each give about 90%, this combined final model gives roughly 70%.


2 Answers


I guess the output layer should be Dense(1, activation='sigmoid'). So try something like this:

# ...
merged = tf.keras.layers.concatenate([A3, B3, C3])
out = Dense(1, activation='sigmoid')(merged)
model = Model(input_layer, out)

model.fit(x_train, y_train, ...)
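For completeness, a minimal runnable version of this sketch, using random stand-in data of the same shape as the Pima dataset (the branch widths mirror the question; the data is an assumption for illustration):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, concatenate
from tensorflow.keras.models import Model

# Three parallel branches off one shared 8-feature input
input_layer = Input(shape=(8,))
A3 = Dense(30, activation='relu')(Dense(8, activation='relu')(input_layer))
B3 = Dense(30, activation='relu')(Dense(40, activation='relu')(input_layer))
C3 = Dense(5, activation='relu')(Dense(50, activation='relu')(input_layer))

# Concatenate the branch output tensors (not Model objects) and classify
merged = concatenate([A3, B3, C3])
out = Dense(1, activation='sigmoid')(merged)

model = Model(inputs=input_layer, outputs=out)
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# Random stand-in data: 64 samples, 8 features, binary labels
x = np.random.rand(64, 8).astype('float32')
y = np.random.randint(0, 2, size=(64,)).astype('float32')
model.fit(x, y, epochs=1, batch_size=32, verbose=0)
print(model.predict(x, verbose=0).shape)  # (64, 1)
```

The key point is that `Dense` layers are applied to tensors such as `A3`, `B3`, `C3`; passing a `Model` instance into a layer is what raises the `'Functional' object has no attribute 'shape'` error.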
Answered 2021-04-01T13:30:00.303

According to your code, there is only one model (not three). And judging by the output you tried to build, I think you're looking for something like this:

Dataset

import tensorflow as tf 
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from sklearn.model_selection import train_test_split
import numpy

# random seed for reproducibility
numpy.random.seed(2)
# load pima indians diabetes dataset, past 5 years of medical history
dataset = numpy.loadtxt('https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv', delimiter=",")

# split into input (X) and output (Y) variables, splitting csv data
X = dataset[:, 0:8]
Y = dataset[:, 8]

x_train, x_validation, y_train, y_validation = train_test_split(X, Y, test_size=0.20, random_state=5)

Model

#create the input layer
input_layer = Input(shape=(8,))

A2 = Dense(8, activation='relu')(input_layer)
A3 = Dense(30, activation='relu')(A2)

B2 = Dense(40, activation='relu')(input_layer)
B3 = Dense(30, activation='relu')(B2)

C2 = Dense(50, activation='relu')(input_layer)
C3 = Dense(5, activation='relu')(C2)


merged = tf.keras.layers.concatenate([A3,B3,C3])
final_out = Dense(1, activation='sigmoid')(merged)

final_model = Model(inputs=[input_layer], outputs=final_out)
tf.keras.utils.plot_model(final_model)

(plot of the merged model from plot_model)

Train

final_model.compile(loss="binary_crossentropy",
              optimizer="adam", metrics=['accuracy'])

# call the function to fit to the data (training the network)
final_model.fit(x_train, y_train, epochs=5, batch_size=50,
          validation_data=(x_validation, y_validation))

# evaluate the model
scores = final_model.evaluate(x_validation,y_validation)
print("\n%s: %.2f%%" % (final_model.metrics_names[1], scores[1] * 100))
Epoch 1/5
13/13 [==============================] - 1s 15ms/step - loss: 0.7084 - accuracy: 0.6803 - val_loss: 0.6771 - val_accuracy: 0.6883
Epoch 2/5
13/13 [==============================] - 0s 5ms/step - loss: 0.6491 - accuracy: 0.6600 - val_loss: 0.5985 - val_accuracy: 0.6623
Epoch 3/5
13/13 [==============================] - 0s 5ms/step - loss: 0.6161 - accuracy: 0.6813 - val_loss: 0.6805 - val_accuracy: 0.6883
Epoch 4/5
13/13 [==============================] - 0s 5ms/step - loss: 0.6335 - accuracy: 0.7003 - val_loss: 0.6115 - val_accuracy: 0.6623
Epoch 5/5
13/13 [==============================] - 0s 5ms/step - loss: 0.5684 - accuracy: 0.7285 - val_loss: 0.6150 - val_accuracy: 0.6883
5/5 [==============================] - 0s 2ms/step - loss: 0.6150 - accuracy: 0.6883

accuracy: 68.83%

Update

Based on this comment of yours:

Let me explain what I'm trying to do. First I create 3 DNN models separately, then I combine those models to get the features from all of them, then I want to classify using all the extracted features and evaluate the accuracy. That's what I'm really trying to develop.

  • Create 3 models separately - OK, 3 models
  • Combine them to get the features - OK, a feature extractor
  • Classify - OK, average the models' output feature maps and pass them to a classifier - in other words, ensembling.

Let's do it. First, build the three models separately.

# Define Model A 
input_layer = Input(shape=(8,))
A2 = Dense(8, activation='relu')(input_layer)
A3 = Dense(30, activation='relu')(A2)
C3 = Dense(5, activation='relu')(A3)
model_a = Model(inputs=input_layer, outputs=C3, name="ModelA")

# Define Model B 
input_layer = Input(shape=(8,))
A2 = Dense(8, activation='relu')(input_layer)
A3 = Dense(30, activation='relu')(A2)
C3 = Dense(5, activation='relu')(A3)
model_b = Model(inputs=input_layer, outputs=C3, name="ModelB")

# Define Model C
input_layer = Input(shape=(8,))
A2 = Dense(8, activation='relu')(input_layer)
A3 = Dense(30, activation='relu')(A2)
C3 = Dense(5, activation='relu')(A3)
model_c = Model(inputs=input_layer, outputs=C3, name="ModelC")

I used the same number of parameters for each; change that yourself. Anyway, these three models each act as a feature extractor (not a classifier). Next, we combine their outputs by averaging them and pass the result to a classifier.

all_three_models = [model_a, model_b, model_c]
all_three_models_input = Input(shape=all_three_models[0].input_shape[1:])


models_output = [model(all_three_models_input) for model in all_three_models]
Avg           = tf.keras.layers.average(models_output, name="Average")
final_out     = Dense(1, activation='sigmoid')(Avg)
final_model   = Model(inputs=all_three_models_input, outputs=final_out, name='Ensemble')
tf.keras.utils.plot_model(final_model, expand_nested=True)

(plot of the ensemble model from plot_model)

Now you can train the model and evaluate it on the test set. Hope this helps.
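One thing to note when ensembling separately trained models: the sub-models built above start from freshly initialized weights, so the ensemble learns everything from scratch. If the intent is to reuse the already-trained extractors, their saved weights could be loaded first (e.g. via `load_weights`) and the extractors frozen so only the final classifier trains. A self-contained sketch of the freezing part (the tiny extractor architecture here is illustrative, not the one above):

```python
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

def make_extractor(name):
    # Small stand-in feature extractor: 8 features in, 5 features out
    inp = Input(shape=(8,))
    h = Dense(8, activation='relu')(inp)
    out = Dense(5, activation='relu')(h)
    return Model(inp, out, name=name)

model_a = make_extractor('ModelA')
model_b = make_extractor('ModelB')
model_c = make_extractor('ModelC')

# (In practice each extractor would be trained, saved, and reloaded here
# with load_weights; skipped so the sketch runs on its own.)

# Freeze the extractors so only the final classifier's weights train
for m in (model_a, model_b, model_c):
    m.trainable = False

ens_in = Input(shape=(8,))
avg = tf.keras.layers.average([m(ens_in) for m in (model_a, model_b, model_c)])
out = Dense(1, activation='sigmoid')(avg)
ensemble = Model(ens_in, out, name='Ensemble')
ensemble.compile(loss='binary_crossentropy', optimizer='sgd',
                 metrics=['accuracy'])

# Only the final Dense(1) head is trainable: one kernel and one bias
print(len(ensemble.trainable_weights))  # 2
```

Freezing is optional; leaving the extractors trainable fine-tunes them jointly with the classifier, which can help or hurt depending on the data.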


More info:

(1). You can set seeds.

import os, numpy
import tensorflow as tf
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Dense, Dropout
from sklearn.model_selection import train_test_split

# random seed for reproducibility
numpy.random.seed(101)
tf.random.set_seed(101)
os.environ['TF_CUDNN_DETERMINISTIC'] = '1'

dataset = .. your data 

# split into input (X) and output (Y) variables, splitting csv data
X = dataset[:, 0:8]
Y = dataset[:, 8]
x_train, x_validation, y_train, y_validation = train_test_split(
    X, Y, test_size=0.20, random_state=101)

(2). Try the SGD optimizer. Also, use a ModelCheckpoint callback to save the weights with the highest validation accuracy.

final_model.compile(loss="binary_crossentropy",
              optimizer="sgd", metrics=['accuracy'])

model_save = tf.keras.callbacks.ModelCheckpoint(
                'merge_best.h5',
                monitor="val_accuracy",
                verbose=0,
                save_best_only=True,
                save_weights_only=True,
                mode="max",
                save_freq="epoch"
            )

# call the function to fit to the data (training the network)
final_model.fit(x_train, y_train, epochs=1000, batch_size=256, callbacks=[model_save],
          validation_data=(x_validation, y_validation))

Evaluate on the test set.

# evaluate the model
final_model.load_weights('merge_best.h5')
scores = final_model.evaluate(x_validation,y_validation)
print("\n%s: %.2f%%" % (final_model.metrics_names[1], scores[1] * 100))
5/5 [==============================] - 0s 4ms/step - loss: 0.6543 - accuracy: 0.7662

accuracy: 76.62%
Answered 2021-04-02T06:23:44.897