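I am trying to get feature importances for a classic (fully connected) neural network built with Keras, using the shap library, but I get the following error: ValueError: Layer sequential_1 was called with an input that isn't a symbolic tensor. I searched the forums, but the answers I found only apply to convolutional networks. My code is below.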
import pandas as pd
import pickle
import numpy as np
from sklearn.utils import shuffle
# Train
dataset_train_shuffle = shuffle(list_dataset_train[0], random_state = 24)
dataset_train_shuffle = dataset_train_shuffle.reset_index(drop=True)
X_train = dataset_train_shuffle.iloc[:,1:8]
label_train = dataset_train_shuffle.iloc[:,[-1]]
# Validation
X_validation = list_dataset_validation[0]
X_validation = X_validation.iloc[:,1:8]
label_validation = list_dataset_validation[0]
label_validation = label_validation.iloc[:,[-1]]
# Test
X_test = list_dataset_test[0]
X_test = X_test.iloc[:,1:8]
label_test = list_dataset_test[0]
label_test = label_test.iloc[:,[-1]]
My X is a DataFrame that looks like this:
BookEquityToMarketEquity Market ... EPSGrowth1yrFwd LowVolatility
0 -0.725018 -0.531440 ... 0.551760 -1.111092
1 0.622943 -0.372537 ... -0.036427 -0.391065
2 -1.123209 2.099897 ... 1.885993 -1.762509
3 -3.047993 2.582608 ... 2.272227 -2.906862
4 0.461661 0.562763 ... -0.524000 -0.155260
... ... ... ... ...
3007 -1.466322 -2.234277 ... -0.493226 1.712511
3008 0.061376 0.294030 ... 0.411817 -0.057478
3009 0.807521 0.357246 ... -0.169811 -0.713736
3010 -0.396623 0.320133 ... -0.096492 -0.287331
3011 -1.308371 1.074483 ... 1.447048 -1.062359
My labels are a DataFrame that looks like this:
NYSE:AEE
0 0
1 0
2 0
3 0
4 1
...
3007 0
3008 0
3009 0
3010 0
3011 1
My model is as follows:
from keras.models import Sequential
from keras.layers.core import Dense, Dropout
from keras import optimizers
import tensorflow as tf
model = Sequential()
model.add(Dense(32,input_dim=len(X_train.columns), activation = 'relu',))
model.add(Dropout(0.25))
model.add(Dense(16, activation = 'relu'))
model.add(Dropout(0.25))
model.add(Dense(8, activation ='relu'))
model.add(Dropout(0.25))
model.add(Dense(1,activation ='sigmoid'))
model.compile(loss = 'binary_crossentropy',
              optimizer = 'adam',
              metrics = [tf.keras.metrics.AUC()],
              )
model.fit(X_train,
          label_train,
          validation_data = (X_validation, label_validation),
          epochs = 100,
          batch_size = 50,
          verbose = 1,
          )
When I try to get the feature importances, I run into a problem with DeepExplainer:
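import shap
# Use the first 1000 training rows as the background sample
background = X_train[:1000]
explainer = shap.DeepExplainer(model, background)
shap_values = explainer.shap_values(X_test)
# The SHAP values are computed on X_test, so plot them against the matching X_test row
shap.force_plot(explainer.expected_value, shap_values[0,:], X_test.iloc[0,:])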
ValueError: Layer sequential_1 was called with an input that isn't a symbolic tensor. Received type: <class 'pandas.core.frame.DataFrame'>. Full input: [ BookEquityToMarketEquity Market ... EPSGrowth1yrFwd LowVolatility
0 -0.725018 -0.531440 ... 0.551760 -1.111092
1 0.622943 -0.372537 ... -0.036427 -0.391065
2 -1.123209 2.099897 ... 1.885993 -1.762509
3 -3.047993 2.582608 ... 2.272227 -2.906862
4 0.461661 0.562763 ... -0.524000 -0.155260
.. ... ... ... ... ...
995 -1.552939 -0.102533 ... 0.852491 -0.383818
996 1.311711 1.659371 ... 1.028700 -0.967370
997 1.013556 -1.029374 ... -1.386222 0.319806
998 0.374137 -1.736694 ... -0.433354 -0.220381
999 0.353116 -0.631120 ... -0.227051 0.475108
[1000 rows x 7 columns]]. All inputs to the layer should be tensors.
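The message reports the received type as a pandas DataFrame, so one possible direction is to hand DeepExplainer plain NumPy arrays instead. Below is a minimal sketch of that conversion only. `X_demo` is a hypothetical stand-in with random values and two of the column names shown above, and the commented-out shap calls are an untested assumption that may still depend on the installed Keras/TensorFlow and shap versions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for X_train: random values, two column names from the post.
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(
    rng.normal(size=(5, 2)),
    columns=["BookEquityToMarketEquity", "LowVolatility"],
)

# DataFrame.to_numpy() returns a plain ndarray without the index or column labels.
background = X_demo.iloc[:3].to_numpy(dtype="float32")
X_to_explain = X_demo.to_numpy(dtype="float32")

print(type(background).__name__, background.shape, X_to_explain.dtype)

# With the real data the calls might then look like this (assumption, not verified):
# explainer = shap.DeepExplainer(model, X_train[:1000].to_numpy())
# shap_values = explainer.shap_values(X_test.to_numpy())
```

The column names are lost in the conversion, so they would have to be passed separately (for example from `X_test.columns`) wherever a plot needs feature labels.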
Does anyone have an idea? Thanks in advance for your help.