tensorflow - Use `tf.to_float()` or `tf.image.convert_image_dtype()` in an image pipeline for a CNN?

Question

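I am modifying an example in tf.slim, using the file vgg_preprocessing.py as a template.

When I read data from a TFRecord file using the snippet from the tf.slim notebook (slim_walkthrough.ipynb), I get an image with distorted colors. This happens when the preprocessing script uses tf.to_float() to change the image tensor from tf.uint8 to tf.float32.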

image = tf.to_float(image)

image = tf.image.convert_image_dtype(image, dtype=tf.float32)

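Do these differences matter once the image has been run through the CNN? If so, which one is better suited to the Vgg16 image processing pipeline? Does it matter if I switch to a different pretrained model such as Inception?

Here is the full method: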

# tf.to_float() and tf.image.convert_image_dtype() give different results
def preprocess_for_train(image,
                         output_height,
                         output_width):
  # randomly crop to output_height x output_width (224x224 for VGG16)
  image = _random_crop([image], output_height, output_width)[0]
  image.set_shape([output_height, output_width, 3])

  image = tf.to_float(image)
  # image = tf.image.convert_image_dtype(image, dtype=tf.float32)

  image = tf.image.random_flip_left_right(image)
  return image

score 3 · Accepted Answer

First, take a look at the code:

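import tensorflow as tf
tf.enable_eager_execution()  # TF 1.x: required so that .numpy() works

# img_raw holds the JPEG-encoded bytes of the image (loading is not shown here)
img_tensor = tf.image.decode_jpeg(img_raw)
print(img_tensor.shape)
print(img_tensor.dtype)
print(img_tensor.numpy().max())
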
a = tf.image.convert_image_dtype(img_tensor, dtype=tf.float32)
print(a.numpy().max())
print(a.shape)
print(a.dtype)

b = tf.to_float(img_tensor)
print(b.numpy().max())
print(b.shape)
print(b.dtype)

c = tf.cast(img_tensor,dtype=tf.float32)
print(c.numpy().max())
print(c.shape)
print(c.dtype)

The results are:

(28, 28, 3)
<dtype: 'uint8'>
149

## for tf.image.convert_image_dtype
0.58431375
(28, 28, 3)
<dtype: 'float32'>

## for tf.to_float
WARNING:tensorflow:From <ipython-input-6-c51a71006d6e>:13: to_float (from 
tensorflow.python.ops.math_ops) is deprecated and will be removed in a future 
version.
Instructions for updating:
Use tf.cast instead.
149.0
(28, 28, 3)
<dtype: 'float32'>

## for tf.cast 
149.0
(28, 28, 3)
<dtype: 'float32'>

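From the code and results above, you can conclude:

tf.to_float is deprecated, so tf.cast is recommended instead;
tf.to_float followed by a multiplication by 1/255.0 is equivalent to tf.image.convert_image_dtype (for a uint8 input);

So, in my opinion, there is not much difference.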
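
The relationship between the two conversions can be checked with plain arithmetic. The following is a minimal NumPy sketch rather than TensorFlow code: it assumes that, for a uint8 input, tf.cast only changes the dtype while tf.image.convert_image_dtype additionally scales by 1/255. The helper names and the sample image are hypothetical.

```python
import numpy as np

def cast_to_float(image_uint8):
    # Mirrors tf.cast(image, tf.float32): values stay in [0, 255].
    return image_uint8.astype(np.float32)

def convert_to_unit_range(image_uint8):
    # Mirrors tf.image.convert_image_dtype(image, tf.float32) for a uint8
    # input: cast, then scale by 1/255 so values land in [0, 1].
    return image_uint8.astype(np.float32) * np.float32(1.0 / 255.0)

# Hypothetical 2x2 RGB image whose maximum is 149, as in the output above.
image = np.array([[[149, 0, 10], [20, 30, 40]],
                  [[50, 60, 70], [80, 90, 100]]], dtype=np.uint8)

casted = cast_to_float(image)
scaled = convert_to_unit_range(image)

print(casted.max())                         # maximum is unchanged by the cast
print(scaled.max())                         # maximum divided by 255
print(np.allclose(casted / 255.0, scaled))  # the two differ only by the 1/255 factor
```

Whether a model expects values in [0, 255] or [0, 1] depends on how it was trained, so the scale factor matters even though the two tensors carry the same information.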

By the way, the TF version is 1.13.1.

score 1 · Accepted Answer

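I realized that my problem was something entirely different.

The answer to the question above is:

tf.to_float([1,2,3]) just produces [1.,2.,3.]
tf.image.convert_image_dtype([image tensor with dtype=tf.uint8], dtype=tf.float32) produces an image tensor whose values have been normalized to [0..1]

But my error occurred because matplotlib.pyplot.imshow(image) does not work with the negative values that mean_image_subtraction for Vgg16 produces when dtype=tf.float32. I found that converting the values back to uint8 seems to fix all my problems with imshow():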

plt.imshow( np_image.astype(np.uint8) )
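
If the array still contains the negative values produced by mean subtraction, one option is to add the means back and clip before casting. The following is a minimal sketch, assuming the image was preprocessed by subtracting the per-channel RGB means defined in vgg_preprocessing.py (123.68, 116.78, 103.94) from values in the 0..255 range. The function name to_displayable and the sample image are hypothetical.

```python
import numpy as np

# Per-channel RGB means as defined in vgg_preprocessing.py.
VGG_RGB_MEANS = np.array([123.68, 116.78, 103.94], dtype=np.float32)

def to_displayable(np_image, means=VGG_RGB_MEANS):
    """Undo mean subtraction and return a uint8 image (hypothetical helper)."""
    restored = np_image + means                    # broadcasts over the channel axis
    restored = np.clip(np.rint(restored), 0, 255)  # round, then stay inside the uint8 range
    return restored.astype(np.uint8)

# Hypothetical 1x2 RGB image in the 0..255 range.
original = np.array([[[0, 128, 255], [10, 20, 30]]], dtype=np.uint8)

# Mean subtraction as done for Vgg16; the result contains negative values.
preprocessed = original.astype(np.float32) - VGG_RGB_MEANS

restored = to_displayable(preprocessed)
print(preprocessed.min() < 0)
print(np.array_equal(restored, original))
```

The returned array can then be passed to plt.imshow() directly.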
