javascript - TensorFlow.js prediction time differs between the first call and subsequent calls

Question

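I am testing a loaded TensorFlow.js model and trying to measure how many milliseconds a prediction takes. The first prediction takes about 300 ms, but from the second one onward the time drops to 13~20 ms. I am not including model loading in the measurement. I only time the prediction itself, after the model has loaded.

Can anyone explain why the prediction time decreases?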

// Calling TensorFlow.js model
// Assumes the npm package; omit this import if tf is loaded via a <script> tag
import * as tf from '@tensorflow/tfjs';

const MODEL_URL = 'https://xxxx-xxxx-xxxx.xxx.xxx-xxxx-x.xxxxxx.com/model.json'
let model;
let prediction;
export async function getModel(input){
  console.log("From helper function: Model is being retrieved from the server...")
  model = await tf.loadLayersModel(MODEL_URL);

  // Measure prediction time only (the model loading above is excluded)
  const startTime = performance.now();
  prediction = model.predict(input); // predict() is synchronous and returns a tensor
  const elapsed = performance.now() - startTime;
  console.log("Prediction time for TensorFlow.js (ms): " + elapsed)

  console.log(prediction.arraySync())
  ...
}

score 0 · Accepted Answer

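The first prediction usually takes longer because the model has to be brought into memory; once that is done, it is cached and the same work does not need to be repeated. Note that this applies even when the model download is excluded from the timing, as in your code: the first predict() call in TensorFlow.js performs one-time setup (with the WebGL backend, it compiles the shader programs and uploads the weights to the GPU), and the results are cached for later calls.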
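One common way to keep this one-time cost out of the measured (or user-facing) latency is to run a throwaway prediction right after loading. The following is a minimal sketch, not part of the original answer: `warmUp` is a hypothetical helper name, it assumes a single-input, single-output model, and it takes the `tf` namespace and the loaded model as arguments so that it does not depend on how TensorFlow.js was imported.

```javascript
// Hypothetical helper: runs one dummy prediction so that one-time setup
// happens before real inputs arrive.
// Assumption: the model has exactly one input and one output tensor.
async function warmUp(tf, model) {
  // The batch dimension of a LayersModel input is null; use 1 for it.
  const shape = model.inputs[0].shape.map((d) => (d === null ? 1 : d));
  const dummy = tf.zeros(shape);

  const out = model.predict(dummy);
  await out.data(); // wait until the result has actually been computed

  tf.dispose([dummy, out]); // free the tensors created for the warm-up
}
```

Calling `await warmUp(tf, model)` once after `tf.loadLayersModel(...)` should make the first real prediction behave like the later ones.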

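If you want to see the actual prediction time, time the prediction repeatedly (perhaps 1000 times) and take the 99th percentile, which tells you the time within which 99% of predictions complete (you can also use a different percentile, such as the 90th or 50th).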
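The repeated-timing approach described above can be sketched as follows. This is an illustrative outline rather than the answerer's own code: `benchmark` and `percentile` are hypothetical helper names, the percentile uses the simple nearest-rank definition, and `runOnce` stands for whatever function performs one prediction.

```javascript
// Nearest-rank percentile of an array of numbers (p between 0 and 100).
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank - 1, 0)];
}

// Times `runs` calls of an async function after `warmupRuns` untimed calls.
// Returns the elapsed time of each timed call in milliseconds.
async function benchmark(runOnce, runs = 1000, warmupRuns = 1) {
  for (let i = 0; i < warmupRuns; i++) {
    await runOnce(); // not timed: absorbs the slow first call
  }
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    await runOnce();
    samples.push(performance.now() - start);
  }
  return samples;
}
```

With TensorFlow.js, `runOnce` could be something like `async () => { const out = model.predict(input); await out.data(); out.dispose(); }`, so that each sample includes the time needed to actually read the result back. Then `percentile(samples, 99)`, `percentile(samples, 90)` and `percentile(samples, 50)` give the figures mentioned above.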
