
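I am training a multi-label classifier with 14 classes using tf.keras and horovod. AUC-ROC is the metric used to evaluate the classifier's performance. I would like to use scikit-learn's AUC-ROC calculator (roc_auc_score), as mentioned here: "How to compute Receiver Operating Characteristic (ROC) and AUC in keras?". If I feed the tensors as-is to the following function: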

def sci_auc_roc(y_true, y_pred):
    return tf.py_func(roc_auc_score(y_true, y_pred), tf.double)

I get the error shown below:

/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/keras_applications/resnet50.py:265: UserWarning: The output shape of `ResNet50(include_top=False)` has been changed since Keras 2.2.0.
  warnings.warn('The output shape of `ResNet50(include_top=False)` '
Traceback (most recent call last):
  File "official_resnet_tf_1.12.0_auc.py", line 531, in <module>
    main()
  File "official_resnet_tf_1.12.0_auc.py", line 420, in main
    model = chexnet_model(FLAGS)
  File "official_resnet_tf_1.12.0_auc.py", line 375, in chexnet_model
    metrics=[tf_auc_roc,sci_auc_roc])
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/training/checkpointable/base.py", line 474, in _method_wrapper
    method(self, *args, **kwargs)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 648, in compile
    sample_weights=self.sample_weights)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 313, in _handle_metrics
    output, output_mask))
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 270, in _handle_per_output_metrics
    y_true, y_pred, weights=weights, mask=mask)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_utils.py", line 598, in weighted
    score_array = fn(y_true, y_pred)
  File "official_resnet_tf_1.12.0_auc.py", line 327, in sci_auc_roc
    return tf.py_func(roc_auc_score(y_true, y_pred), tf.double)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/sklearn/metrics/ranking.py", line 349, in roc_auc_score
    y_type = type_of_target(y_true)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/sklearn/utils/multiclass.py", line 243, in type_of_target
    'got %r' % y)
ValueError: Expected array-like (array or non-string sequence), got <tf.Tensor 'dense_target:0' shape=(?, ?) dtype=float32>

I then tried converting the tf tensors to numpy arrays before feeding them to roc_auc_score, as follows:

def sci_auc_roc(y_true, y_pred):
    with tf.Session() as sess:
        y_true, y_pred = sess.run([y_true, y_pred])
    return tf.py_func(roc_auc_score(y_true, y_pred), tf.double)

I get the following error:

 warnings.warn('The output shape of `ResNet50(include_top=False)` '
Traceback (most recent call last):
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input_1' with dtype float and shape [?,256,256,3]
         [[{{node input_1}} = Placeholder[dtype=DT_FLOAT, shape=[?,256,256,3], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
         [[{{node dense_target/_5}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2237_dense_target", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "official_resnet_tf_1.12.0_auc.py", line 531, in <module>
    main()
  File "official_resnet_tf_1.12.0_auc.py", line 420, in main
    model = chexnet_model(FLAGS)
  File "official_resnet_tf_1.12.0_auc.py", line 375, in chexnet_model
    metrics=[tf_auc_roc,sci_auc_roc])
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/training/checkpointable/base.py", line 474, in _method_wrapper
    method(self, *args, **kwargs)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 648, in compile
    sample_weights=self.sample_weights)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 313, in _handle_metrics
    output, output_mask))
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 270, in _handle_per_output_metrics
    y_true, y_pred, weights=weights, mask=mask)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_utils.py", line 598, in weighted
    score_array = fn(y_true, y_pred)
  File "official_resnet_tf_1.12.0_auc.py", line 324, in sci_auc_roc
    y_true, y_pred = sess.run([y_true, y_pred])
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input_1' with dtype float and shape [?,256,256,3]
         [[node input_1 (defined at /mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/keras_applications/resnet50.py:214)  = Placeholder[dtype=DT_FLOAT, shape=[?,256,256,3], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
         [[{{node dense_target/_5}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2237_dense_target", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'input_1', defined at:
  File "official_resnet_tf_1.12.0_auc.py", line 531, in <module>
    main()
  File "official_resnet_tf_1.12.0_auc.py", line 420, in main
    model = chexnet_model(FLAGS)
  File "official_resnet_tf_1.12.0_auc.py", line 339, in chexnet_model
    input_shape=(FLAGS.image_size, FLAGS.image_size, 3))
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/applications/__init__.py", line 70, in wrapper
    return base_fun(*args, **kwargs)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/applications/resnet50.py", line 32, in ResNet50
    return resnet50.ResNet50(*args, **kwargs)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/keras_applications/resnet50.py", line 214, in ResNet50
    img_input = layers.Input(shape=input_shape)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/input_layer.py", line 229, in Input
    input_tensor=tensor)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/keras/engine/input_layer.py", line 112, in __init__
    name=self.name)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 1747, in placeholder
    return gen_array_ops.placeholder(dtype=dtype, shape=shape, name=name)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 5206, in placeholder
    "Placeholder", dtype=dtype, shape=shape, name=name)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'input_1' with dtype float and shape [?,256,256,3]
         [[node input_1 (defined at /mnt/lustrefs/rakvee/miniconda3/envs/docker_pip2/lib/python3.6/site-packages/keras_applications/resnet50.py:214)  = Placeholder[dtype=DT_FLOAT, shape=[?,256,256,3], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
         [[{{node dense_target/_5}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2237_dense_target", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[52342,1],0]
  Exit code:    1
--------------------------------------------------------------------------

I also tried TensorFlow's tf.metrics.auc (https://www.tensorflow.org/api_docs/python/tf/metrics/auc), as follows:

def tf_auc_roc(y_true, y_pred):
    auc = tf.metrics.auc(y_true, y_pred)[1]
    K.get_session().run(tf.local_variables_initializer())
    return auc

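It works fine. However, it gives me a single AUC-ROC number. I would like to know what this number represents: is it the average AUC-ROC over all 14 classes? The maximum AUC score over all classes? Or how is a single number obtained?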

1216/1216 [==============================] - 413s 340ms/step - loss: 0.1513 - tf_auc_roc: 0.7944 - val_loss: 0.2212 - val_tf_auc_roc: 0.8074
Epoch 2/15
 582/1216 [=============>................] - ETA: 3:16 - loss: 0.1459 - tf_auc_roc: 0.8053

1) How do I fix the error with roc_auc_score?

2) What does that single number represent?


1 Answer


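I think the result of a metric should be a single tensor value that represents the mean of the results, as described in the Keras documentation (which I find to be better documentation than TensorFlow's).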
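To make the "how is it reduced to one number" question concrete, here is a small sketch using scikit-learn only. The synthetic labels and scores are made up for illustration and are not from your model. As far as I understand it, tf.metrics.auc does not track the 14 classes separately: it pools every (sample, class) entry into one set of binary decisions and approximates the area with a fixed number of thresholds (num_thresholds, 200 by default), so the value you see should be closest to the micro average below rather than to a per-class mean or maximum.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Synthetic multi-label data (illustrative assumption, not real model output).
rng = np.random.RandomState(0)
n_samples, n_classes = 200, 14
y_true = rng.randint(0, 2, size=(n_samples, n_classes))
# Noisy scores that are loosely correlated with the labels.
y_score = 0.5 * y_true + rng.rand(n_samples, n_classes)

# One AUC per class (array of length 14).
per_class = roc_auc_score(y_true, y_score, average=None)
# Macro: unweighted mean of the per-class AUCs.
macro = roc_auc_score(y_true, y_score, average="macro")
# Micro: all (sample, class) entries pooled into one binary problem.
micro = roc_auc_score(y_true, y_score, average="micro")

print("per-class:", np.round(per_class, 3))
print("macro: %.4f  micro: %.4f" % (macro, micro))
```

If you need the 14 individual values, average=None is the option to look at; a single Keras metric slot cannot report them separately.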

You could instead use a custom callback to get the result you want; most likely you will want to write the results to disk in on_epoch_end.
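A minimal sketch of such a callback follows. It sidesteps both errors in the question: the first attempt calls roc_auc_score on symbolic tensors while the model is being compiled (tf.py_func expects the Python function and its input tensors as separate arguments), and the second runs a fresh session without feeding the input placeholder. Here scikit-learn only ever sees NumPy arrays returned by predict(). The class name RocAucCallback, the log key val_sci_auc_roc, and the x_val / y_val arguments are hypothetical names chosen for this example, and it assumes the validation set fits in memory.

```python
import numpy as np
import tensorflow as tf
from sklearn.metrics import roc_auc_score


class RocAucCallback(tf.keras.callbacks.Callback):
    """Computes per-class and macro-averaged ROC AUC after each epoch."""

    def __init__(self, x_val, y_val):
        super().__init__()
        self.x_val = x_val
        self.y_val = np.asarray(y_val)
        self.history = []  # one list of per-class AUCs per epoch

    def on_epoch_end(self, epoch, logs=None):
        # predict() returns NumPy arrays, which scikit-learn accepts directly.
        y_pred = np.asarray(self.model.predict(self.x_val))
        per_class = []
        for k in range(self.y_val.shape[1]):
            # AUC is undefined if a class has only positives or only negatives.
            if len(np.unique(self.y_val[:, k])) < 2:
                per_class.append(float("nan"))
            else:
                per_class.append(roc_auc_score(self.y_val[:, k], y_pred[:, k]))
        self.history.append(per_class)
        macro = float(np.nanmean(per_class))
        if logs is not None:
            logs["val_sci_auc_roc"] = macro
        print("epoch %d: macro AUC %.4f" % (epoch + 1, macro))
```

You would pass it as callbacks=[RocAucCallback(x_val, y_val)] to fit(), and write self.history to disk wherever that suits you. With horovod you would probably attach it only on the worker where hvd.rank() == 0, so the evaluation is not repeated on every process.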

answered 2019-10-31T22:19:39.317