python - Converting JPG to txt results in a change of file size in Python

Question

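I have a set of images saved in .jpg format. I am using the following Python code to load them and store them in a txt file as comma-separated values.
The original image set is only 800 MB. However, when I save the images to a txt file, they produce a 40 GB txt document.

Does this make sense?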

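import os
import numpy as np
import scipy.misc  # note: scipy.misc.imread was removed in SciPy 1.2

for filename in os.listdir(imagePath):
    if filename != '.DS_Store':
        # Decode the JPG into an array of pixel values
        b = scipy.misc.imread(os.path.join(imagePath, filename), flatten=0)
        b2 = np.reshape(b, np.size(b))
        # Write one decimal number per pixel value, comma-separated
        var = ','.join(['%d' % num for num in b2])
        with open(savepath + 'trainMatrix.txt', "a") as f:
            f.write(var + '\n')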

score 2 · Accepted Answer

You seem to have a misunderstanding about how image files are handled. Below are two possible scenarios based on your question.

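Reading the JPG file into the TXT file without analysing the image data, i.e. without decompressing it. Use this (although we are not sure what use this would be, by the way!).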

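import os
from scipy.misc import imread  # removed in SciPy 1.2; needs an older SciPy
import numpy as np

imagePath = 'c:/your jpgs/'
savepath = imagePath

# Save as text without decompressing (Python 3)
for filename in os.listdir(imagePath):
    if filename != '.DS_Store' and filename[-3:] == 'jpg':
        # Read the raw, still-compressed bytes of the JPG file
        with open(os.path.join(imagePath, filename), 'rb') as fin:
            b = fin.read()
        # Treat each byte as one Latin-1 character and comma-separate them
        out = ','.join(b.decode('latin-1')) + '\n'
        with open(savepath + 'trainMatrix1.txt', 'a', encoding='latin-1') as fut:
            fut.write(out)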

Output:

ÿ,Ø,ÿ,à, ,,J,F,I,F, ,,,, ,d, ,d, , ,ÿ,á,
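
To get a feel for how this first approach affects file size, here is a minimal sketch that uses a few made-up bytes instead of a real JPG. The helper name `comma_join_bytes` is hypothetical and only serves the illustration. Each byte becomes one character plus a separator, so the text comes out at about twice the size of the input.

```python
def comma_join_bytes(raw: bytes) -> str:
    """Join the bytes as Latin-1 characters separated by commas."""
    return ','.join(raw.decode('latin-1')) + '\n'

# Made-up stand-in for the first few bytes of a JPG file (assumption)
raw = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10]) + b'JFIF'
text = comma_join_bytes(raw)
encoded = text.encode('latin-1')  # what ends up on disk

print(len(raw), len(encoded))  # 10 20
```

So without decompression the txt file only roughly doubles; this approach alone would not explain a jump from 800 MB to 40 GB.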

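Reading the JPG file into the TXT file and analysing the image data, i.e. decompressing it. Use imread to decompress the image data. Keep in mind that JPG is a highly compressed image format, so the decompressed data will produce a huge text file. You are also appending everything to a single file, so the output will be huge!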

# Save the decompressed image as text, one number per pixel value
for filename in os.listdir(imagePath):
    if filename != '.DS_Store' and filename[-3:] == 'jpg':
        b = imread(os.path.join(imagePath, filename), flatten=0).flatten()
        print(b.shape)
        out = ','.join('%d' % i for i in b) + '\n'
        print(len(out))
        with open(savepath + 'trainMatrix2.txt', 'a') as fut:
            fut.write(out)

Output (color data):

255,255,255,245,245,245,125,125,125,72,72,72,17,17,17,2,2,2,15
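
The size growth in this second case can be estimated with a small sketch. It uses a made-up list of pixel values rather than a real image, and the helper name `text_size_bytes` is hypothetical. Uncompressed, each 8-bit value takes one byte; written as decimal text it takes up to three digits plus a comma.

```python
def text_size_bytes(values) -> int:
    """Length of the comma-separated decimal line written for one image."""
    line = ','.join('%d' % v for v in values) + '\n'
    return len(line)

# Hypothetical 100x100 RGB image with every value 255 (worst case: 3 digits)
values = [255] * (100 * 100 * 3)

raw_bytes = len(values)            # one byte per value when uncompressed
txt_bytes = text_size_bytes(values)

print(raw_bytes, txt_bytes)        # 30000 120000
```

The text form is up to four times larger than the uncompressed pixels. Assuming the JPGs were compressed by roughly 10:1 (a common ballpark that varies with image content and quality setting), a growth of about 50x, from 800 MB to 40 GB, is plausible.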
