pre-trained-model - AllenNLP: How do I load a pretrained ELMo as the embedding of an allennlp model?

Question

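I'm new to allennlp. I trained an ELMo model and tried to use it as the embedding in other allennlp models, but it failed. It seems my model is not compatible with the interface the config expects. What can I do?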

My ELMo was trained with allennlp using the following command:

allennlp train config/elmo.jsonnet --serialization-dir /xxx

Apart from the dataset and the vocabulary, elmo.jsonnet is almost identical to https://github.com/allenai/allennlp-models/blob/main/training_config/lm/bidirectional_language_model.jsonnet.

After that, I got an ELMo model:

config.json
weights.th
vocabulary/
vocabulary/.lock
vocabulary/non_padded_namespaces.txt
vocabulary/tokens.txt
meta.json

When I tried to load the model into another model, following https://github.com/allenai/allennlp-models/blob/main/training_config/rc/bidaf_elmo.jsonnet, I found that it requires an options file and a weight file:

"elmo": {
    "type": "elmo_token_embedder",
    "do_layer_norm": false,
    "dropout": 0,
    "options_file": "xxx/options.json",
    "weight_file": "xxx/weights.hdf5"
}

My model directory contains neither of these. I tried converting model.state_dict() to weights.hdf5, but got this error:

KeyError: "Unable to open object (object 'char_embed' doesn't exist)"

The char_embed key is required here:

File "/home/xxx/anaconda3/envs/thesis_torch1.8/lib/python3.8/site-packages/allennlp/modules/elmo.py", line 393, in _load_char_embedding
    char_embed_weights = fin["char_embed"][...]

It seems the model I trained with allennlp is not compatible with this interface. How can I use my ELMo as the embedding for other models?

score 0 · Accepted Answer

You're right, those two formats don't align.
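To make the mismatch concrete, here is a minimal sketch that checks a directory against the two sets of file names mentioned in the question: the ones the elmo_token_embedder config points at (options.json, weights.hdf5) and the ones the training run produced (config.json, weights.th). The function name check_format and the shape of the report are made up for illustration; only the file names come from the question.

```python
from pathlib import Path

# File names the elmo_token_embedder config in the question points at.
ELMO_EMBEDDER_FILES = ("options.json", "weights.hdf5")
# File names the training run left in the serialization directory, per the question.
TRAINED_LM_FILES = ("config.json", "weights.th")


def check_format(directory):
    """Report which of the two file sets is fully present in `directory`.

    Hypothetical helper for illustration; it only looks at file names,
    not at what the files contain.
    """
    present = {p.name for p in Path(directory).iterdir()}
    return {
        "elmo_embedder_ready": all(n in present for n in ELMO_EMBEDDER_FILES),
        "allennlp_trained_lm": all(n in present for n in TRAINED_LM_FILES),
        "missing_for_elmo_embedder": [
            n for n in ELMO_EMBEDDER_FILES if n not in present
        ],
    }
```

Run against the serialization directory from the question, this should report a trained language model but both embedder files missing. Note that renaming or re-saving the weights does not close the gap: the traceback shows the loader looking up a char_embed entry inside the HDF5 file, so the internal layout differs as well.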

I'm afraid there is no shortcut. I think you will have to write a TokenEmbedder that can read and apply the output from bidirectional_language_model.jsonnet.
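As a rough idea of the shape such an embedder could take, here is a framework-free sketch of the adapter: it wraps an already loaded language model and hands back its per-token vectors. Everything in it is an assumption for illustration: the class name LanguageModelEmbedder, the output_key default of "lm_embeddings", and the idea that the model returns a dict of outputs. A real version would subclass AllenNLP's TokenEmbedder (a torch module that must implement get_output_dim) and be registered so that a config can refer to it.

```python
class LanguageModelEmbedder:
    """Adapter exposing a trained language model's contextual vectors as embeddings.

    Hypothetical sketch, not AllenNLP code. It is deliberately free of
    torch/allennlp imports so the adapter logic can be read on its own.
    """

    def __init__(self, language_model, output_dim, output_key="lm_embeddings"):
        # `language_model`: any callable that returns a dict of outputs.
        # `output_key`: name of the entry holding per-token vectors. This is an
        # assumption; check what your model's forward pass actually returns.
        self._language_model = language_model
        self._output_dim = output_dim
        self._output_key = output_key

    def get_output_dim(self):
        # Downstream layers size their inputs from this value.
        return self._output_dim

    def __call__(self, tokens):
        outputs = self._language_model(tokens)
        if self._output_key not in outputs:
            raise KeyError(
                f"language model returned no '{self._output_key}' entry; "
                f"got {sorted(outputs)}"
            )
        return outputs[self._output_key]
```

In actual AllenNLP code the wrapped model would likely come from load_archive in allennlp.models.archival, pointed at the trained model, and the downstream model's indexers and vocabulary would have to match the ones the language model was trained with.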

If you do, we would be happy to have it as a contribution to AllenNLP!
