
I'm working on a long-document classification task where each document has more than 10,000 words. My plan is to use BERT as a paragraph encoder and then feed the paragraph embeddings, step by step, into a BiLSTM. The network looks like this:

Input: (batch_size, max_paragraph_len, max_tokens_per_para, embedding_size)

BERT layer: (max_paragraph_len, paragraph_embedding_size)

LSTM layer: ???

Output layer: (batch_size, classification_size)

How can I implement this in Keras? I'm loading the BERT model with load_trained_model_from_checkpoint from keras-bert:

from keras_bert import load_trained_model_from_checkpoint

# Load the pre-trained checkpoint; only the adapter and layer-norm weights
# of each encoder block are left trainable (adapter-style fine-tuning).
bert_model = load_trained_model_from_checkpoint(
    config_path,
    model_path,
    training=False,
    use_adapter=True,
    trainable=['Encoder-{}-MultiHeadSelfAttention-Adapter'.format(i + 1) for i in range(layer_num)] +
              ['Encoder-{}-FeedForward-Adapter'.format(i + 1) for i in range(layer_num)] +
              ['Encoder-{}-MultiHeadSelfAttention-Norm'.format(i + 1) for i in range(layer_num)] +
              ['Encoder-{}-FeedForward-Norm'.format(i + 1) for i in range(layer_num)],
)

1 Answer


I believe you can take a look at the article below. The author shows how to load a pre-trained BERT model, wrap it in a Keras layer, and use it in a custom deep neural network. First install bert-for-tf2, a TensorFlow 2.0 Keras implementation of google-research/bert:

pip install bert-for-tf2

Then run:

import bert
import os

# modelBertDir is assumed to point at the directory holding the downloaded
# pre-trained checkpoints (it is not defined in this snippet).

def createBertLayer():
    global bert_layer

    bertDir = os.path.join(modelBertDir, "multi_cased_L-12_H-768_A-12")

    # Read the model hyper-parameters from the pre-trained checkpoint directory
    bert_params = bert.params_from_pretrained_ckpt(bertDir)

    # Build a Keras layer from those parameters
    bert_layer = bert.BertModelLayer.from_params(bert_params, name="bert")

    # Freeze the original BERT weights (adapter-style fine-tuning)
    bert_layer.apply_adapter_freeze()

def loadBertCheckpoint():
    modelsFolder = os.path.join(modelBertDir, "multi_cased_L-12_H-768_A-12")
    checkpointName = os.path.join(modelsFolder, "bert_model.ckpt")

    # Load the pre-trained weights into the layer (call this only after the
    # layer has been built, i.e. after it has been used inside a model)
    bert.load_stock_weights(bert_layer, checkpointName)
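
With the layer in place you can wire up the paragraph-level architecture from the question. The following is only a minimal sketch, not code from the article: the sizes (max_paragraph_len, max_tokens_per_para, num_classes), the LSTM width, and the choice of the [CLS] vector as the paragraph embedding are assumptions you would adjust for your own data.

from tensorflow import keras

# Hypothetical sizes -- adjust to your data
max_paragraph_len = 64      # paragraphs per document
max_tokens_per_para = 128   # word-piece tokens per paragraph
num_classes = 10            # document classes

createBertLayer()

# Paragraph encoder: run BERT over one paragraph of token ids and keep the
# [CLS] (first position) vector as the paragraph embedding
para_tokens = keras.layers.Input(shape=(max_tokens_per_para,), dtype='int32')
para_seq = bert_layer(para_tokens)                              # (batch, tokens, hidden)
para_vec = keras.layers.Lambda(lambda t: t[:, 0, :])(para_seq)  # (batch, hidden)
paragraph_encoder = keras.Model(para_tokens, para_vec)

# Document model: encode every paragraph with the shared BERT encoder,
# run a BiLSTM over the sequence of paragraph embeddings, then classify
doc_tokens = keras.layers.Input(shape=(max_paragraph_len, max_tokens_per_para), dtype='int32')
para_embeddings = keras.layers.TimeDistributed(paragraph_encoder)(doc_tokens)
doc_vector = keras.layers.Bidirectional(keras.layers.LSTM(128))(para_embeddings)
outputs = keras.layers.Dense(num_classes, activation='softmax')(doc_vector)

model = keras.Model(doc_tokens, outputs)

# The checkpoint weights can only be loaded once the layer has been built
loadBertCheckpoint()

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()

Note that TimeDistributed shares the same BERT weights across all paragraphs, so memory grows with max_paragraph_len * max_tokens_per_para; with 10,000-word documents you will likely need to truncate paragraphs or reduce the batch size.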
answered 2020-05-01 21:06