pytorch - BERT error: Cannot squeeze a dimension whose value is not 1

Question

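I have an mlmodel based on pytorch-pretrained-BERT, exported to CoreML via ONNX. That process went very smoothly, so now I'm trying to do some (very) basic testing, i.e. just to make some kind of prediction and get a rough idea of the performance problems we might run into.

However, when I try to run a prediction, I get the following error: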

[espresso] [Espresso::handle_ex_plan] exception=Espresso exception: "Invalid state": Cannot squeeze a dimension whose value is not 1: shape[1]=128 stat
2020-02-16 11:36:05.959261-0800 Xxxxx[6725:2140794] [coreml] Error computing NN outputs -5

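Does this error indicate a problem with the model itself (i.e. from the model conversion), or am I doing something wrong in Swift/CoreML? My prediction function looks like this: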

public func prediction(withInput input: String) -> MLMultiArray? {
    var predictions: MLMultiArray? = nil
    if let bert = bertMLModel {
        // Tokenize, then zero-pad the ids to a fixed length.
        var ids = tokenizer.tokenizeToIds(text: input, includeWordpiece: true)
        print("ids: \(ids)")
        while ids.count < 256 {
            ids.append(0)
        }
        // Wrap the padded ids in a rank-2 MLMultiArray for the model input.
        let inputMLArray = MLMultiArray.from(ids, dims: 2)
        let modelInput = bert_256_FP16Input(input_ids: inputMLArray)
        var modelOutput: bert_256_FP16Output? = nil
        do {
            modelOutput = try bert.prediction(input: modelInput)
        } catch {
            print("Error running prediction: \(error)")
        }
        if let modelOutput = modelOutput {
            predictions = modelOutput.output_weights
        }
    }
    return predictions
}
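One detail that may be worth ruling out: the error reports shape[1]=128, while the function above pads to 256. Whether that mismatch is the actual cause is not established here. As a minimal sketch in Python (pad_or_truncate and its parameters are hypothetical names, not part of any library), padding to whatever sequence length the model was exported with could look like this:

```python
def pad_or_truncate(ids, seq_len, pad_id=0):
    """Return a copy of ids with exactly seq_len entries.

    Hypothetical helper: seq_len is assumed to be the sequence length
    the model was exported with (the error message mentions 128).
    """
    if seq_len <= 0:
        raise ValueError("seq_len must be positive")
    # Drop anything beyond seq_len, then fill the remainder with pad_id.
    clipped = list(ids[:seq_len])
    return clipped + [pad_id] * (seq_len - len(clipped))


# Illustrative token ids only; real ids come from the tokenizer.
tokens = [3, 68, 45, 68, 45, 5, 45, 68, 45, 4]
padded = pad_or_truncate(tokens, 128)
print(len(padded))
```

Keeping the length in a single parameter makes it harder for the padding code and the exported model to drift apart.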

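At this stage I'm not trying to do anything meaningful, just to get it running.

I used the pytorch-pretrained-BERT repository because I was able to find a basic pretraining example for it. Since then I've noticed that HuggingFace has released a "from scratch" training option, but there are still some issues being worked out in the tutorial. So I'd like to at least understand what might be wrong with my current model/approach. However, if the problem is definitely in the PyTorch->ONNX->CoreML conversion, then I really don't want to fight that battle and will just dig into what HuggingFace provides.

Any thoughts appreciated.

UPDATE: Following Matthijs's suggestion, I tried making a prediction from the model in Python: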

from coremltools.models import MLModel
import numpy as np

# Just a "dummy" input, but it is a valid series of tokens for my data.
tokens = [3, 68, 45, 68, 45,  5, 45, 68, 45,  4]
tokens_tensor = np.zeros((1, 128))
for i in range(0, 10):
    tokens_tensor[0, i] = tokens[i]
# I'm doing masked token prediction, so one segment.
segments_tensor = np.zeros((1, 128))

mlmodel = 'bert_fp16.mlmodel'

model = MLModel(mlmodel)
spec = model.get_spec()
print("spec: ", spec.description)

predictions = model.predict({'input.1': np.asarray(tokens_tensor, dtype=np.int32), 'input.3': np.asarray(segments_tensor, dtype=np.int32)})

I admit I haven't run an mlmodel from Python before, but I believe the inputs are correct. The spec shows:

input {
  name: "input.1"
  type {
    multiArrayType {
      shape: 1
      shape: 128
      dataType: INT32
    }
  }
}
input {
  name: "input.3"
  type {
    multiArrayType {
      shape: 1
      shape: 128
      dataType: INT32
    }
  }
}
...

for my inputs.
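To double-check that belief, the shapes passed to predict can be compared against the shapes listed in the spec before the call is made. The following is a minimal sketch in plain Python; shape_mismatches is a hypothetical helper, and the expected shapes are copied by hand from the spec output above rather than read from the model.

```python
def shape_mismatches(expected, provided):
    """Compare expected input shapes with the shapes actually provided.

    Both arguments are dicts mapping input name -> shape tuple.
    Returns a list of human-readable problems (empty if everything matches).
    """
    problems = []
    for name, shape in expected.items():
        if name not in provided:
            problems.append(f"missing input {name!r}")
        elif tuple(provided[name]) != tuple(shape):
            problems.append(
                f"{name!r}: expected {tuple(shape)}, got {tuple(provided[name])}"
            )
    for name in provided:
        if name not in expected:
            problems.append(f"unexpected input {name!r}")
    return problems


# Shapes copied by hand from the spec printed above.
expected = {"input.1": (1, 128), "input.3": (1, 128)}
# With NumPy arrays this would be {name: arr.shape for name, arr in inputs.items()}.
provided = {"input.1": (1, 128), "input.3": (1, 128)}
print(shape_mismatches(expected, provided))
```

An empty list means the shapes agree with the spec, which would point toward the converted graph rather than the inputs.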

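In this case I don't get the "cannot squeeze" message or the error code (-5), but it does fail with "Error computing NN outputs". So something is definitely wrong. I'm just not sure at all how to debug it.

J.

UPDATE: For comparison, I have trained/converted a HuggingFace BERT model (actually DistilBert; I've updated the code above accordingly) and get the same error. Looking at the logs from onnx, I can see that a Squeeze is added (which is also clear from the onnx-coreml logs, of course), but the only squeeze in the PyTorch code is in BertForQuestionAnswering, not BertForMaskedLM. Maybe onnx is building the question-answering model rather than the MLM model (or the question-answering model is what got saved in the checkpoint)?

Looking at the swift-coreml-transformers sample code, I can see that the input for distilBert is just let input_ids = MLMultiArray.from(allTokens, dims: 2), which is exactly how I've defined it. So I think I'm at a dead end. Has anyone managed to run MLM with Bert/DistilBert in CoreML (via onnx)? If so, an example would be very helpful.

By the way, I'm using the latest run_language_modeling.py from HuggingFace for training.

Looking at the mlmodel in Netron, I can see the offending squeeze. I admit I don't know what that branch of the input is for, but I'm guessing it's a mask (a further guess is that it has something to do with question answering). Can I remove it somehow?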
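The same layer can probably also be located from Python, which makes it easier to see which input feeds it. The sketch below assumes the converted model is a plain neural network whose layers are reachable as spec.neuralNetwork.layers, and that each layer reports its type through the protobuf WhichOneof("layer") call. The type name "squeeze" is an assumption about how the converter labeled the layer (Netron shows the actual type), and layers_of_kind is a hypothetical helper name. It only locates layers; it does not remove anything.

```python
def layers_of_kind(layers, kind="squeeze"):
    """Return (index, name, inputs, outputs) for every layer of the given kind.

    `layers` is assumed to be a sequence of Core ML NeuralNetworkLayer
    messages, e.g. spec.neuralNetwork.layers.
    """
    hits = []
    for index, layer in enumerate(layers):
        # WhichOneof names the populated layer-type field, e.g. "squeeze".
        if layer.WhichOneof("layer") == kind:
            hits.append((index, layer.name, list(layer.input), list(layer.output)))
    return hits


# Assumed usage with the model from this question (requires coremltools):
#   from coremltools.models import MLModel
#   spec = MLModel("bert_fp16.mlmodel").get_spec()
#   for hit in layers_of_kind(spec.neuralNetwork.layers):
#       print(hit)
```

Following the reported input names back toward input.1 or input.3 should show which branch of the graph the squeeze belongs to.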
