python - Recognizing a number in an image with Python and TensorFlow

Question

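Details: Ubuntu 14.04 (LTS), OpenCV 2.4.13, Spyder 2.3.9 (Python 2.7), TensorFlow r0.10

I want to recognize a number in an image using Python and TensorFlow (optionally OpenCV).

In addition, I want to use a model trained on the MNIST data with TensorFlow,

like this (the code follows the video on this page):

Code: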

import tensorflow as tf
import random

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder("float", [None, 784])
y = tf.placeholder("float", [None, 10])

W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

learning_rate = 0.01
training_epochs = 25
batch_size = 100
display_step = 1

### modeling ###

activation = tf.nn.softmax(tf.matmul(x, W) + b)

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y * tf.log(activation), reduction_indices=1))

optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)

init = tf.initialize_all_variables()

sess = tf.Session()
sess.run(init)

### training ###

for epoch in range(training_epochs):

    avg_cost = 0
    total_batch = int(mnist.train.num_examples/batch_size)

    for i in range(total_batch):

        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        sess.run(optimizer, feed_dict={x: batch_xs, y: batch_ys})
        avg_cost += sess.run(cross_entropy, feed_dict={x: batch_xs, y: batch_ys}) / total_batch

    if epoch % display_step == 0:
        print "Epoch : ", "%04d" % (epoch+1), "cost=", "{:.9f}".format(avg_cost)

print "Optimization Finished"

### predict number ###

r = random.randint(0, mnist.test.num_examples - 1)
print "Prediction: ", sess.run(tf.argmax(activation,1), {x: mnist.test.images[r:r+1]})
print "Correct Answer: ", sess.run(tf.argmax(mnist.test.labels[r:r+1], 1))

However, my question is how I can produce a numpy array like this one:

Additional code:

mnist.test.images[r:r+1]

[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.50196081 0.50196081 0.50196081 0.50196081 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.50196081 1. 1. 1. 1. 1. 1. 0.50196081 0.25098041 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.50196081 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0.25098041 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.74901962 1. 1. 1. 1. 0.50196081 0.50196081 0.50196081 0.74901962 1. 1. 1. 0.74901962 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0.50196081 1. 1. 1. 0.74901962 0. 0. 0. 0. 0. 0. 0.50196081 1. 1. 0.74901962 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 1. 0.50196081 0. 0. 0. 0. 0. 0. 0. 0. 0.25098041 1. 1. 0.74901962 0.25098041 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0.74901962 1. 1. 0.74901962 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.25098041 1. 1. 0.74901962 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0.50196081 1. 1. 0.74901962 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.25098041 1. 1. 0.50196081 0.
   0. 0. 0. 0. 0. 0. 0. 0.50196081 1. 1. 0.25098041 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0.50196081 0.
   0. 0. 0. 0. 0. 0. 0. 1. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.25098041 1. 1. 1. 0.
   0. 0. 0. 0. 0. 0. 0. 1. 1. 0.50196081 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.25098041 1. 1. 1. 1. 0.
   0. 0. 0. 0. 0. 0. 0. 0.74901962 1. 0.50196081 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.74901962 1. 1. 1. 0.25098041 0.
   0. 0. 0. 0. 0. 0. 0. 0.50196081 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.25098041 0.74901962 1. 1. 1. 1. 0.74901962 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0.50196081 1. 1. 0.74901962 0. 0. 0. 0. 0. 0.25098041 0.50196081 1. 1. 1. 1. 1. 1. 0.50196081 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0.74901962 1. 1. 1. 1. 0.50196081 0.50196081 0.74901962 1. 1. 1. 1. 1. 1. 1. 0.50196081 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0.74901962 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0.50196081 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.25098041 1. 1. 1. 1. 1. 1. 1. 0.50196081 0.25098041 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.50196081 0.50196081 0.50196081 0.50196081 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. ]]

When I use OpenCV for this, I can get a numpy array of the image, but it looks a bit odd. (I want to turn the array into a 28x28 vector.)

Additional code:

image = cv2.imread("img_easy.jpg")
resized_image = cv2.resize(image, (28, 28))

[[[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]

[[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]

[[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]

...,

[[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]

[[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]

[[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]]

Then I fed the value ('resized_image') into the TensorFlow code, like this:

Modified code:

### predict number ###

print "Prediction: ", sess.run(tf.argmax(activation,1), {x: resized_image})
print "Correct Answer: 9"

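As a result, this line raises an error:

ValueError: Cannot feed value of shape (28, 28, 3) for Tensor u'Placeholder_2:0', which has shape '(?, 784)'

Finally,

1) I would like to know how to produce data that can be fed into the TensorFlow code (probably a numpy array of shape [784]).

2) Do you know of any number-recognition examples that use TensorFlow?

I am a beginner in machine learning.

Please tell me in detail what to do.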

score 2 · Accepted Answer

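The image you are using appears to be a color image, hence the third dimension in (28, 28, 3).

The original MNIST images are grayscale, 28 pixels wide and 28 pixels high. That is why the x placeholder has shape [None, 784]: 28*28 = 784.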
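
The mismatch can be checked with NumPy alone. This is only a minimal sketch: the zero-filled arrays are stand-ins (an assumption for illustration) for the color image that cv2.resize returned and for a grayscale image of the same size.

```python
import numpy as np

# Stand-in for the color image from cv2.resize: 28 x 28 pixels, 3 channels.
color_image = np.zeros((28, 28, 3), dtype=np.uint8)
# Stand-in for a single-channel (grayscale) image of the same size.
gray_image = np.zeros((28, 28), dtype=np.uint8)

print(color_image.size)  # 28 * 28 * 3 values, too many for one row of x
print(gray_image.size)   # 28 * 28 values, exactly one row of x

# x expects a batch of rows, so add a leading batch dimension of 1.
row = gray_image.reshape(1, 784)
print(row.shape)
```

Only the grayscale array has the 784 values that one row of x holds, and reshape(1, 784) turns it into a batch containing a single image.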

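cv2.imread loads the image with three color channels (OpenCV orders them B, G, R rather than R, G, B), whereas you want a single-channel grayscale image that can be resized to (28, 28). Passing the grayscale flag to imread may help: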

image = cv2.imread("img_easy.jpg", cv2.CV_LOAD_IMAGE_GRAYSCALE)

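With this flag the image has a single channel, so after cv2.resize(image, (28, 28)) it has the shape (28, 28). It then has to be reshaped to (1, 784) to match the x placeholder. In OpenCV 3 and later the flag is named cv2.IMREAD_GRAYSCALE.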
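
If the image has already been loaded in color, it can also be converted afterwards. The sketch below does this in plain NumPy, assuming OpenCV's B, G, R channel order and the usual luma weights (0.299 R + 0.587 G + 0.114 B). The function name bgr_to_gray is hypothetical; in practice cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) does this job.

```python
import numpy as np

def bgr_to_gray(image):
    """Collapse a (H, W, 3) BGR image into a (H, W) grayscale image."""
    # Weights are listed in B, G, R order to match OpenCV's channel layout.
    weights = np.array([0.114, 0.587, 0.299])
    gray = np.dot(image.astype(np.float64), weights)
    # Round and clip back into the 8-bit range.
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

# Stand-in for the all-white 28 x 28 color image shown in the question.
white = np.full((28, 28, 3), 255, dtype=np.uint8)
gray = bgr_to_gray(white)
print(gray.shape)
```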

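Also, the pixel values of the OpenCV image are not in the same range as the MNIST image shown in your question: OpenCV returns integers from 0 to 255, while the MNIST values lie between 0 and 1. You will probably have to normalize the values so that they fall in the 0-1 range.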
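
A minimal sketch of that normalization, assuming `gray` is a 28x28 uint8 array such as a grayscale image after cv2.resize (a synthetic array stands in for it here). The helper name to_mnist_row is hypothetical. MNIST digits are bright on a dark background, so an image with a white background, like the one in the question, probably also needs to be inverted; the `invert` flag is an assumption about your input, not something the steps above require.

```python
import numpy as np

def to_mnist_row(gray, invert=True):
    """Turn a (28, 28) uint8 image into a (1, 784) float32 row in [0, 1]."""
    scaled = gray.astype(np.float32) / 255.0
    if invert:
        # Dark ink on white paper becomes a bright digit on a dark background.
        scaled = 1.0 - scaled
    return scaled.reshape(1, 784)

# Synthetic stand-in: white background with a dark vertical stroke.
gray = np.full((28, 28), 255, dtype=np.uint8)
gray[4:24, 13:15] = 0

row = to_mnist_row(gray)
print(row.shape)
print(row.min())
print(row.max())
```

The resulting row has the shape the x placeholder expects, so it can be passed as {x: row} in the prediction line of the question.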

You may also want to use a CNN for this (slightly more advanced, but it should give better results). For more details, see the tutorials at https://www.tensorflow.org/tutorials/.

score 1 · Accepted Answer

Have you tried this? I had the same problem, and this helped a lot:

resized = cv2.resize(image, dsize = (28,28), interpolation = cv2.INTER_CUBIC)
