performance - TensorFlow Estimator prediction is slow

Question

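I trained a tf.estimator.LinearClassifier. Training and evaluating the model takes a reasonable amount of time for my data size (about 60 seconds), but prediction takes orders of magnitude longer (about 1 hour).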

The prediction code is as follows:

predictionResult = estimator.predict(input_fn=lambda: my_input_fn2(predictionValidationFile, False, 1))
predictionList = [prediction for prediction in  predictionResult]

and:

def my_input_fn2(file_path, perform_shuffle=False, repeat_count=1):
    def _parse_function(example_proto):
        keys_to_features = {"xslm": tf.FixedLenFeature([10000], tf.float32),
                            "xrnn": tf.FixedLenFeature([10000], tf.float32),
                            "target": tf.FixedLenFeature([10000], tf.float32)}
        parsed_features = tf.parse_single_example(example_proto, keys_to_features)
        myfeatures = {'xrnn': parsed_features['xrnn'], 'xslm': parsed_features['xslm']}
        return myfeatures, parsed_features['target']

    dataset = (tf.data.TFRecordDataset(file_path)
               .map(_parse_function))
    dataset = dataset.repeat(repeat_count)
    dataset = dataset.batch(1)
    iterator = dataset.make_one_shot_iterator()
    batch_feature, batch_labels = iterator.get_next()
    xs = tf.reshape(batch_feature['xslm'], [-1, 1])
    xr = tf.reshape(batch_feature['xrnn'], [-1, 1])
    x = {'xrnn': xr, 'xslm': xs}
    y = tf.reshape(batch_labels, [-1, 1])
    return x, y

When run on 10,000 samples (corresponding to one batch), the second line takes 0.8 seconds to execute. With 50,000,000 samples, prediction takes more than an hour.
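As a rough sanity check, the two timings above are consistent with each other if the cost per batch stays roughly constant. The sketch below only redoes that arithmetic; the variable names are made up for illustration, and the assumption of a constant 0.8 seconds per batch is mine, not something that was measured.

```python
# Figures taken from the question; constant per-batch cost is an assumption.
samples_per_batch = 10_000
seconds_per_batch = 0.8
total_samples = 50_000_000

num_batches = total_samples // samples_per_batch   # batches needed for the full set
total_seconds = num_batches * seconds_per_batch    # linear extrapolation
total_minutes = total_seconds / 60

print(f"{num_batches} batches, about {total_minutes:.0f} minutes")
```

That works out to 5,000 batches and a little over an hour, which matches the observed prediction time.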

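My guess at this stage is that the slow performance is simply because the Estimator predict() function returns a Python generator rather than the actual prediction results. For each batch, the generator ends up causing 10,000 function calls to obtain 10,000 predictions. This seems inefficient.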

Is there any way to speed this up?

score 0 · Accepted Answer

You are right about why it is slow. The function is being called for every item, because the batch size in your input function is set to 1.
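To make the effect of the batch size concrete, here is a toy model in plain Python. It is not TensorFlow's actual implementation; count_model_runs is a hypothetical helper that just counts how many batch-level runs are needed to cover a given number of records. The figure of 5,000 records is an assumption derived from the question (50,000,000 samples at 10,000 samples per record).

```python
def count_model_runs(num_records, batch_size):
    """Return (batch-level runs, records covered) for a simulated predict loop."""
    runs = 0
    records_covered = 0
    for start in range(0, num_records, batch_size):
        runs += 1  # stands in for one run of the model on a whole batch
        records_covered += min(batch_size, num_records - start)
    return runs, records_covered


for batch_size in (1, 10, 100):
    runs, covered = count_model_runs(5000, batch_size)
    print(f"batch_size={batch_size}: {runs} runs for {covered} records")
```

If each run carries a fixed overhead, reducing the number of runs is where the saving would come from.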

You should pass the batch size as a parameter to the function and replace

dataset = dataset.batch(1)

with

dataset = dataset.batch(batch_size)
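Putting that change into the question's input function might look like the sketch below. Several things here are assumptions rather than part of the original answer: the default batch_size of 64 is arbitrary, the tf.io.* names assume a TensorFlow version that provides that namespace (such as 1.15 or 2.x), and the function returns the Dataset directly instead of building a one-shot iterator. TensorFlow is imported inside the function only so that the definition loads on machines without it.

```python
def my_input_fn2(file_path, perform_shuffle=False, repeat_count=1, batch_size=64):
    # Assumption: a TensorFlow version that exposes the tf.io namespace.
    import tensorflow as tf

    def _parse_function(example_proto):
        keys_to_features = {
            "xslm": tf.io.FixedLenFeature([10000], tf.float32),
            "xrnn": tf.io.FixedLenFeature([10000], tf.float32),
            "target": tf.io.FixedLenFeature([10000], tf.float32),
        }
        parsed = tf.io.parse_single_example(example_proto, keys_to_features)
        features = {"xrnn": parsed["xrnn"], "xslm": parsed["xslm"]}
        return features, parsed["target"]

    def _flatten(features, labels):
        # Same reshape as the original: one row per sample, one column.
        x = {name: tf.reshape(value, [-1, 1]) for name, value in features.items()}
        return x, tf.reshape(labels, [-1, 1])

    # perform_shuffle is kept for signature compatibility; as in the
    # original function, it is not used.
    dataset = tf.data.TFRecordDataset(file_path).map(_parse_function)
    dataset = dataset.repeat(repeat_count)
    dataset = dataset.batch(batch_size)  # several records per model run
    return dataset.map(_flatten)
```

The call site would then become something like estimator.predict(input_fn=lambda: my_input_fn2(predictionValidationFile, False, 1, batch_size=64)). Each record already holds 10,000 values per feature, so the batch size is best tuned against available memory.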

score 0 · Accepted Answer

I had a similar problem (in a Colab notebook using TensorFlow 1.15). In my case, saving and loading the model (in a new cell) solved it.

import numpy as np

# save the trained weights to disk
model.save_weights("weights.h5", overwrite=True)
# in a new cell: rebuild the model and reload the weights
model = create_model()
model.load_weights("weights.h5")
y_pred = np.array(model.predict(x_test))
