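I want to train the decomposable attention + ELMo SNLI model from the demo, using my own dataset. I am new to NLP. After reading the guide, I still don't know how to get started from my own training set, which consists of plain-text premises, hypotheses, and labels. The data format is shown below.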
From the training command on the demo, I found that its training set is https://allennlp.s3.amazonaws.com/datasets/snli/snli_1.0_train.jsonl. How can I generate a training set like this from my own data?
For reference, my dataset looks like this:
{ "premise":"sentences", "hypothesis":"sentences", "label":"x"}
{ "premise":"sentences", "hypothesis":"sentences", "label":"y"}
...
The input file snli_1.0_train.jsonl looks like this:
{"annotator_labels": ["neutral"], "captionID": "3416050480.jpg#4", "gold_label": "neutral", "pairID": "3416050480.jpg#4r1n", "sentence1": "A person on a horse jumps over a broken down airplane.", "sentence1_binary_parse": "( ( ( A person ) ( on ( a horse ) ) ) ( ( jumps ( over ( a ( broken ( down airplane ) ) ) ) ) . ) )", "sentence1_parse": "(ROOT (S (NP (NP (DT A) (NN person)) (PP (IN on) (NP (DT a) (NN horse)))) (VP (VBZ jumps) (PP (IN over) (NP (DT a) (JJ broken) (JJ down) (NN airplane)))) (. .)))", "sentence2": "A person is training his horse for a competition.", "sentence2_binary_parse": "( ( A person ) ( ( is ( ( training ( his horse ) ) ( for ( a competition ) ) ) ) . ) )", "sentence2_parse": "(ROOT (S (NP (DT A) (NN person)) (VP (VBZ is) (VP (VBG training) (NP (PRP$ his) (NN horse)) (PP (IN for) (NP (DT a) (NN competition))))) (. .)))"}
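One possible conversion, as a minimal sketch rather than a confirmed recipe: it assumes the SNLI dataset reader only looks at the `gold_label`, `sentence1`, and `sentence2` fields, so the parse trees and IDs can be left out, and it assumes my file has one JSON object per line. The function names `to_snli_record` and `convert_file` and the optional `label_map` argument are hypothetical, introduced only for illustration.

```python
import json


def to_snli_record(example, label_map=None):
    """Rename the keys of one {"premise", "hypothesis", "label"} dict
    to the SNLI-style field names (assumed to be the only ones needed)."""
    label = example["label"]
    if label_map is not None:
        # Optionally translate custom labels, e.g. {"x": "entailment"}.
        label = label_map[label]
    return {
        "gold_label": label,
        "sentence1": example["premise"],
        "sentence2": example["hypothesis"],
    }


def convert_file(src_path, dst_path, label_map=None):
    """Read a JSON-lines file in my format and write an SNLI-style
    JSON-lines file. Returns the number of records written."""
    count = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = to_snli_record(json.loads(line), label_map)
            dst.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```

With placeholder file names, this would be called as `convert_file("my_train.jsonl", "my_train_snli.jsonl")`, and the output path would then replace the SNLI URL in the training configuration. Whether my custom labels need to be mapped onto SNLI's label names is something I am not sure about.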
I would really appreciate it if anyone could help. Thanks.