c++ - Problem with CTC Beam Search and the TensorFlow C++ API

Question

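I have frozen a TensorFlow model whose last node is a CTC beam search decoder. With the Python API I can interpret the output tensor and convert it into the final label sequence. Since I want to use this frozen model from C++, I would like to know how to process this output tensor with the C++ API and obtain the final label sequence. In Python, I call the function "sparse_tensor_to_str" shown below, passing it the tensor I get after running the session. In my case, the final label sequence is a string of characters.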

    import numpy as np
    import tensorflow as tf

    def sparse_tensor_to_str(self, spares_tensor: tf.SparseTensor):
        """
        :param spares_tensor: decoded sparse tensor returned by the session
        :return: a list of str, one per batch entry
        """
        indices = spares_tensor.indices
        values = spares_tensor.values
        # Map the raw decoder labels through the class's lookup table.
        values = np.array([self.__ord_map[str(tmp)] for tmp in values])
        dense_shape = spares_tensor.dense_shape

        # Dense (batch, max_len) matrix; unset positions keep the fill value 1.
        number_lists = np.ones(dense_shape, dtype=values.dtype)
        str_lists = []
        res = []
        for i, index in enumerate(indices):
            number_lists[index[0], index[1]] = values[i]
        for number_list in number_lists:
            str_lists.append([self.int_to_char(val) for val in number_list])
        # Drop the '*' placeholder characters and join the rest.
        for str_list in str_lists:
            res.append(''.join(c for c in str_list if c != '*'))
        return res

In C++, I do the following:

string input_layer = "input:0";
string output_layer = "CTCBeamSearchDecoder:0";
std::vector<Tensor> inputs;

// Convert the candidate image into an input tensor.
Status read_tensor_status = ReadTensorFromMat(candidate_plates_mat[i], input_height, input_width, input_mean, input_std, &inputs);
if (!read_tensor_status.ok()) {
    LOG(ERROR) << read_tensor_status;
    return;
}

Tensor& resized_input_tensor = inputs[0];
std::vector<Tensor> outputs;
// Run the frozen graph and fetch the first output of the decoder node.
Status run_status = session->Run({{input_layer, resized_input_tensor}}, {output_layer}, {}, &outputs);
if (!run_status.ok()) {
    LOG(ERROR) << "Running model failed: " << run_status;
    return;
}
std::cout << outputs[0].tensor<tensorflow::int64, 2>() << std::endl;

The output I get is a 9x2 tensor like this:

[[0, 0],
   [0, 1],
   [0, 2],
   [0, 3],
   [0, 4],
   [0, 5],
   [0, 6],
   [0, 7],
   [0, 8]]

Here 9 is the actual length of the final string. From this tensor I cannot recover the information needed to build the final string, as I can in Python.

score 0 · Accepted Answer

Have you solved the problem? Here is my solution for your reference.

In Python, your code should look like this:

# remember to set seq_len to fit for your case
decoded, log_prob = tf.nn.ctc_beam_search_decoder(y, seq_len)
dense_decoded = tf.sparse_tensor_to_dense(decoded[0], default_value=-1)

In C++, your code should look like this:

// Modify "outputName" if your model adds a variable scope prefix to the node name.
// SparseToDense is the name of the op created by tf.sparse_tensor_to_dense.
std::string outputName = "SparseToDense:0";
outputLayerNames_ = {outputName};


std::vector<std::pair<std::string, tensorflow::Tensor>> inputDict_ = { std::make_pair(DefaultInputLayerName_, inputImageTensor_), 
                std::make_pair(DefaultTrainFlagName_, trainFlagTensor_),
                std::make_pair(DefaultSeqLenName_, inputSeqLenTensor_)};  
std::vector<tensorflow::Tensor> outputs_;
sess_->Run(inputDict_, outputLayerNames_, {}, &outputs_);
std::cout<< outputs_[0].tensor<tensorflow::int64, 2>() << std::endl;
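
Once the dense tensor has been fetched, the remaining step is to turn each row of labels into a string. The following is a minimal sketch of that step in plain C++, with no TensorFlow dependency so that it can be tried in isolation. The helper name DenseRowsToStrings and the charset lookup string are assumptions made for illustration; in the question's code, the equivalent mapping is done by __ord_map and int_to_char. The rows would be copied out of outputs_[0], for example by indexing the result of tensor<tensorflow::int64, 2>() with (row, column) and using dim_size(0) and dim_size(1) for the bounds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of TensorFlow): converts a dense matrix of
// decoded labels into one string per row.
// `charset` is an assumed lookup table: label k maps to charset[k].
// `pad` is the default_value passed to tf.sparse_tensor_to_dense (-1 above).
std::vector<std::string> DenseRowsToStrings(
    const std::vector<std::vector<std::int64_t>>& rows,
    const std::string& charset,
    std::int64_t pad = -1) {
  assert(pad < 0);  // the padding value must not collide with a valid label
  const auto num_labels = static_cast<std::int64_t>(charset.size());
  std::vector<std::string> result;
  result.reserve(rows.size());
  for (const auto& row : rows) {
    std::string text;
    for (std::int64_t label : row) {
      if (label == pad) continue;                          // skip padding
      if (label < 0 || label >= num_labels) continue;      // ignore unknown labels
      text.push_back(charset[static_cast<std::size_t>(label)]);
    }
    result.push_back(text);
  }
  return result;
}
```

For example, with the rows {2, 0, -1} and {1, -1, -1} and the charset "abc", the helper returns the strings "ca" and "b".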

Remember that you must feed inputSeqLenTensor_ as the sequence_length of tf.nn.ctc_beam_search_decoder, otherwise you will get no output.
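
If re-exporting the graph with the SparseToDense node is not an option, another route is to fetch the parts of the sparse tensor separately and rebuild the strings by hand. As far as I know, the CTCBeamSearchDecoder op exposes the decoded indices, values and dense shape as separate outputs, so with a single decoded path "CTCBeamSearchDecoder:0" holds only the indices (which would explain the 9x2 tensor in the question), while ":1" and ":2" would hold the values and the shape. Treat those output numbers as an assumption to verify against your own graph. The sketch below is plain C++ without TensorFlow; SparseToStrings and charset are hypothetical names, and the flat index vector is what you would get by copying the indices tensor in row-major order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of TensorFlow): rebuilds one string per batch
// entry from the parts of a 2-D sparse tensor.
// `flat_indices` holds (row, column) pairs in row-major order, i.e. an N x 2
// indices tensor flattened; `values` holds the N decoded labels.
// Assumption: entries arrive sorted by row and then by column, so appending
// in order reproduces each sequence. `charset[k]` is the character for label k.
std::vector<std::string> SparseToStrings(
    const std::vector<std::int64_t>& flat_indices,
    const std::vector<std::int64_t>& values,
    std::int64_t batch_size,
    const std::string& charset) {
  assert(flat_indices.size() == 2 * values.size());
  assert(batch_size >= 0);
  const auto num_labels = static_cast<std::int64_t>(charset.size());
  std::vector<std::string> result(static_cast<std::size_t>(batch_size));
  for (std::size_t i = 0; i < values.size(); ++i) {
    const std::int64_t row = flat_indices[2 * i];  // batch entry of this label
    const std::int64_t label = values[i];
    if (row < 0 || row >= batch_size) continue;        // ignore malformed rows
    if (label < 0 || label >= num_labels) continue;    // ignore unknown labels
    result[static_cast<std::size_t>(row)].push_back(
        charset[static_cast<std::size_t>(label)]);
  }
  return result;
}
```

The batch size is the first entry of the dense shape. Compared with the SparseToDense approach, this needs no padding value, at the cost of fetching several outputs in the same Run call.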
