
I want to use the model validation capability (model diff / model comparison) of the Evaluator component in TFX, so I based my code on the TFX taxi template.

The problem is that when the Evaluator component runs in Kubeflow on GCP, it raises the following error message in the logs:


    ERROR:absl:There are change thresholds, but the baseline is missing. This is allowed only when rubber stamping (first run).
    WARNING:absl:"maybe_add_baseline" and "maybe_remove_baseline" are deprecated,
            please use "has_baseline" instead.
    INFO:absl:Request was made to ignore the baseline ModelSpec and any change thresholds. This is likely because a baseline model was not provided: updated_config=
    model_specs {
      name: "candidate"
      signature_name: "my_model_validation_signature"
      label_key: "n_trips"
    }
    slicing_specs {
    }
    metrics_specs {
      metrics {
        class_name: "MeanSquaredError"
        threshold {
          value_threshold {
            upper_bound {
              value: 10000.0
            }
          }
        }
      }
    }
    INFO:absl:ModelSpec name "candidate" is being ignored and replaced by "" because a single ModelSpec is being used

Looking at the source code of the Evaluator component's executor in the TFX repo, line 138:


    has_baseline = bool(input_dict.get(BASELINE_MODEL_KEY))

Then, at line 141, it calls:


    eval_config = tfma.update_eval_config_with_defaults(eval_config, has_baseline=has_baseline)
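
Here is a minimal sketch of what that call does, assuming the TFMA 0.26/0.27 API and the `eval_config` shown further down in this question: when `has_baseline` is `False`, the returned config has the baseline `ModelSpec` and every change threshold stripped out, which is the `updated_config=` dump shown in the log above.

    import tensorflow_model_analysis as tfma

    # Sketch: reproduce the executor's call when no baseline model is available.
    # The returned config drops the baseline ModelSpec and all change
    # thresholds, matching the "updated_config=" log line above.
    updated_config = tfma.update_eval_config_with_defaults(
        eval_config, has_baseline=False)
    print(updated_config)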

The quoted error message is then raised only when the following condition is met, from the TFX repo:

      if (not has_baseline and has_change_threshold(eval_config) and
          not rubber_stamp):
        # TODO(b/173657964): Raise an error instead of logging an error.
        logging.error('There are change thresholds, but the baseline is missing. '
                      'This is allowed only when rubber stamping (first run).')

And indeed, that is the error I get in the logs: the model is evaluated, but it is not compared against the baseline, even though I provide the baseline exactly the way the example code shows, e.g.:


      eval_config = tfma.EvalConfig(
          model_specs=[
              tfma.ModelSpec(name=tfma.CANDIDATE_KEY, label_key='n_trips',
                             signature_name='my_model_validation_signature'),
              tfma.ModelSpec(name=tfma.BASELINE_KEY, label_key='n_trips',
                             signature_name='my_model_validation_signature',
                             is_baseline=True),
          ],
          slicing_specs=[tfma.SlicingSpec()],
          metrics_specs=[
              tfma.MetricsSpec(metrics=[
                  tfma.MetricConfig(
                      class_name='MeanSquaredError',  # or 'mean_absolute_error'
                      threshold=tfma.MetricThreshold(
                          value_threshold=tfma.GenericValueThreshold(
                              upper_bound={'value': 10000}),
                          change_threshold=tfma.GenericChangeThreshold(
                              direction=tfma.MetricDirection.LOWER_IS_BETTER,
                              relative={'value': 1}))),
              ]),
          ])
    
      evaluator = Evaluator(
          examples=example_gen.outputs['examples'],
          model=trainer.outputs['model'],
          baseline_model=model_resolver.outputs['model'],
          # Change threshold will be ignored if there is no baseline (first run).
          eval_config=eval_config)
    
      # TODO(step 6): Uncomment here to add Evaluator to the pipeline.
      components.append(evaluator)

And so on…
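
For reference, the `model_resolver` used above for `baseline_model` is presumably the latest-blessed-model resolver from the taxi template. A minimal sketch of how it is typically declared with the TFX 0.2x API (names follow the template; adapt to your pipeline):

    from tfx.components import ResolverNode
    from tfx.dsl.experimental import latest_blessed_model_resolver
    from tfx.types import Channel
    from tfx.types.standard_artifacts import Model, ModelBlessing

    # Resolves the latest blessed model (if any) to use as the Evaluator's
    # baseline. On the very first run nothing has been blessed yet, so the
    # baseline channel is empty and change thresholds are ignored.
    model_resolver = ResolverNode(
        instance_name='latest_blessed_model_resolver',
        resolver_class=latest_blessed_model_resolver.LatestBlessedModelResolver,
        model=Channel(type=Model),
        model_blessing=Channel(type=ModelBlessing))
    components.append(model_resolver)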


1 Answer


The issue was solved by upgrading from version 0.26.0 to version 0.27.0.

The problem occurred because the default notebook in Kubeflow Pipelines on Google Cloud Platform comes with version 0.26.0 installed...
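
If you hit the same thing, a quick check from the notebook is to print the installed TFX version and upgrade if it is still 0.26.0 (a sketch; pin whichever release at or above 0.27.0 you target):

    # Check which TFX version the notebook environment actually has:
    from tfx import version
    print(version.__version__)  # 0.26.0 reproduces the issue, 0.27.0+ does not
    # Then upgrade from a notebook cell, e.g.:
    #   !pip install -U "tfx==0.27.0"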

Answered 2021-03-10T00:19:34.900