python - Explaining a model built with BERT using SHAP with RepeatDataset and BatchDataset objects

Question

I built a somewhat complex model on top of pretrained BERT weights. The model structure is as follows:

Model: "model_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_ids (InputLayer)          [(None, 32)]         0                                            
__________________________________________________________________________________________________
attention_mask (InputLayer)     [(None, 32)]         0                                            
__________________________________________________________________________________________________
token_type_ids (InputLayer)     [(None, 32)]         0                                            
__________________________________________________________________________________________________
tf_bert_model_1 (TFBertModel)   ((None, 32, 768), (N 109482240   input_ids[0][0]                  
                                                                 attention_mask[0][0]             
                                                                 token_type_ids[0][0]             
__________________________________________________________________________________________________
dropout_76 (Dropout)            (None, 32, 768)      0           tf_bert_model_1[0][0]            
__________________________________________________________________________________________________
lstm_1 (LSTM)                   (None, 256)          1049600     dropout_76[0][0]                 
__________________________________________________________________________________________________
dense_5 (Dense)                 (None, 128)          32896       lstm_1[0][0]                     
__________________________________________________________________________________________________
dropout_77 (Dropout)            (None, 128)          0           dense_5[0][0]                    
__________________________________________________________________________________________________
dense_6 (Dense)                 (None, 64)           8256        dropout_77[0][0]                 
__________________________________________________________________________________________________
dense_7 (Dense)                 (None, 32)           2080        dense_6[0][0]                    
__________________________________________________________________________________________________
dense_8 (Dense)                 (None, 16)           528         dense_7[0][0]                    
__________________________________________________________________________________________________
dense_9 (Dense)                 (None, 7)            119         dense_8[0][0]                    
==================================================================================================
Total params: 110,575,719
Trainable params: 110,575,719
Non-trainable params: 0
__________________________________________________________________________________________________
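As a sanity check on the summary above, the parameter counts of the LSTM and Dense layers can be reproduced from their input and output sizes. This is only a sketch: the helper names `lstm_params` and `dense_params` are made up for illustration, the formulas are the standard ones for Keras LSTM and Dense layers with biases, and the TFBertModel count is taken from the summary as reported rather than recomputed.

```python
def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights and a bias.
    return 4 * units * (input_dim + units + 1)


def dense_params(input_dim, units):
    # One weight per input/output pair plus one bias per output unit.
    return (input_dim + 1) * units


# Count for tf_bert_model_1, copied from the summary (not recomputed here).
bert_params = 109_482_240

head_params = [
    lstm_params(768, 256),   # lstm_1
    dense_params(256, 128),  # dense_5
    dense_params(128, 64),   # dense_6
    dense_params(64, 32),    # dense_7
    dense_params(32, 16),    # dense_8
    dense_params(16, 7),     # dense_9
]

total = bert_params + sum(head_params)
print(head_params)  # [1049600, 32896, 8256, 2080, 528, 119]
print(total)        # 110575719
```

The numbers agree with the Param # column and with the reported total, so the head on top of BERT is an LSTM over the 32 x 768 token embeddings followed by five Dense layers ending in 7 outputs.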

I feed data to the model through a RepeatDataset object. It is created with the following code:

train_ds = tf.data.Dataset.from_tensor_slices((train_inp,train_mask,train_type_ids,train_out)).map(convert_to_features).shuffle(100).batch(BATCH_SIZE).repeat(5)
type(train_ds)

So the type is tensorflow.python.data.ops.dataset_ops.RepeatDataset. Now I want to add model explanations using SHAP.

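I have tried DeepExplainer. It tries to read the shape of the data, which in my case is train_ds, but a RepeatDataset object has no shape attribute. How can I get around this problem? Or is there another way to use SHAP with RepeatDataset objects?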
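One possible direction for the shape problem, sketched under assumptions: a tf.data dataset is an iterable of batches rather than an array, so a small background sample can be materialized into NumPy arrays (which do have a shape) before it is handed to an explainer. The sketch assumes that convert_to_features yields (features, label) pairs in which features is a dict keyed by the three input layer names; that is not shown in the question. The helpers `collect_background` and `fake_batches` are hypothetical names, and `fake_batches` only stands in for the real dataset so that the example runs without TensorFlow. This addresses only the missing shape attribute; whether DeepExplainer can handle the BERT graph itself is a separate question.

```python
import numpy as np


def collect_background(batches, input_keys, max_samples=100):
    """Stack the first `max_samples` rows of every model input.

    `batches` is any iterable of (features, labels) pairs where `features`
    is a dict of arrays keyed by input name (an assumption about the
    dataset layout). Returns one array per key, in `input_keys` order.
    """
    collected = {key: [] for key in input_keys}
    n_rows = 0
    for features, _labels in batches:
        for key in input_keys:
            collected[key].append(np.asarray(features[key]))
        n_rows += len(features[input_keys[0]])
        if n_rows >= max_samples:
            break  # stop early instead of walking the whole repeated dataset
    return [np.concatenate(collected[key], axis=0)[:max_samples]
            for key in input_keys]


def fake_batches(batch_size=8, seq_len=32, n_batches=1000):
    # Hypothetical stand-in for the real data. With the real dataset one
    # would pass train_ds.as_numpy_iterator() to collect_background instead.
    rng = np.random.default_rng(0)
    for _ in range(n_batches):
        features = {
            "input_ids": rng.integers(0, 30522, size=(batch_size, seq_len)),
            "attention_mask": np.ones((batch_size, seq_len), dtype=np.int64),
            "token_type_ids": np.zeros((batch_size, seq_len), dtype=np.int64),
        }
        labels = rng.integers(0, 7, size=(batch_size,))
        yield features, labels


INPUT_KEYS = ["input_ids", "attention_mask", "token_type_ids"]
background = collect_background(fake_batches(), INPUT_KEYS, max_samples=20)
print([arr.shape for arr in background])  # [(20, 32), (20, 32), (20, 32)]
```

The result is a list of three arrays, one per model input, in the same order as the input layers. Keeping max_samples small matters here because the background set drives the cost of the explanation.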
