12

I have a string (it could also be an integer) in Python that contains only ones and zeros, and I want that pattern of ones and zeros written to a file. I want to write the binary directly because I need to store a lot of data, but only certain values. I see no need to use eight bits per value when I only need three.

For instance, say I were to write the binary string "01100010" to a file. If I opened it in a text editor it would show b (01100010 is the ASCII code for b). Do not be confused, though: I do not want to write ASCII codes. The example was just to indicate that I want to write bytes directly to the file.


Clarification:

My string looks something like this:

binary_string = "001011010110000010010"

It is not made up of the binary codes for numbers or characters. It contains data relevant only to my program.

4

5 Answers

12

To write out a string you can use the file's .write method. To write an integer, you will need to use the struct module.

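import struct

# ... (value is assumed to be an int or a str defined earlier)
with open('file.dat', 'wb') as f:
    if isinstance(value, int):
        f.write(struct.pack('i', value))  # write an int (native size and byte order)
    elif isinstance(value, str):
        f.write(value.encode('ascii'))  # a file opened in 'wb' mode needs bytes in Python 3
    else:
        raise TypeError('Can only write str or int')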

But the representations of an int and a string are different. You can use the bin function to turn an int into a string of 0s and 1s:

>>> bin(7)
'0b111'
>>> bin(7)[2:] #cut off the 0b
'111'

But perhaps the best way to handle all these ints is to decide on a fixed width for the binary strings in the file and convert them like so:

>>> x = 7
>>> '{0:032b}'.format(x) #32 character wide binary number with '0' as filler
'00000000000000000000000000000111'
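
As a rough sketch of how the fixed-width idea could round-trip, the snippet below formats a few integers as fixed-width bit strings, joins them, and splits them back. The width of 3 comes from the question; the names WIDTH, encode, and decode are made up for illustration.

```python
WIDTH = 3  # assumed number of bits per value (taken from the question)


def encode(values, width=WIDTH):
    """Concatenate each value as a zero-padded binary string of `width` bits."""
    for v in values:
        if not 0 <= v < (1 << width):
            raise ValueError(f"{v} does not fit in {width} bits")
    return "".join(f"{v:0{width}b}" for v in values)


def decode(bits, width=WIDTH):
    """Split a bit string into `width`-bit chunks and convert each chunk to an int."""
    return [int(bits[i:i + width], 2) for i in range(0, len(bits), width)]


bit_string = encode([1, 3, 2, 6])
print(bit_string)          # 001011010110
print(decode(bit_string))  # [1, 3, 2, 6]
```

Because every value has the same width, no separators are needed to find the boundaries when reading the string back.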
answered 2013-06-02T21:52:03.717
8

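Alright, after quite a bit more searching, I found an answer. I believe the rest of you simply didn't understand (which was probably my fault, as I had to edit twice to make it clear). I found it here.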

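The answer was to split up each piece of data, convert each piece into a binary integer, and put them in a binary array. After that, you can use the array's tofile() method to write to a file.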

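from array import array

bin_array = array('B')  # 'B' = unsigned char, one byte per item

bin_array.append(int('011', 2))
bin_array.append(int('010', 2))
bin_array.append(int('110', 2))

# file() only exists in Python 2; open() works in both
with open('binary.mydata', 'wb') as f:
    bin_array.tofile(f)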
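
For reading the values back, something along these lines should work. It is only a sketch: the temporary directory is an assumption so the example does not touch the working directory, and it also prints the file size, which shows that typecode 'B' stores one full byte per value.

```python
import os
import tempfile
from array import array

values = array('B', [0b011, 0b010, 0b110])

# Hypothetical location: a throwaway temp directory
path = os.path.join(tempfile.mkdtemp(), 'binary.mydata')
with open(path, 'wb') as f:
    values.tofile(f)

size = os.path.getsize(path)  # bytes on disk; one per stored value for typecode 'B'

restored = array('B')
with open(path, 'rb') as f:
    restored.fromfile(f, size)  # read `size` one-byte items

print(size, list(restored))  # 3 [3, 2, 6]
```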
answered 2013-06-02T23:36:22.053
4

I want that pattern of ones and zeros to be written to a file.

If you mean you want to write a bitstream from a string to a file, you'd need something like this...

from io import StringIO  # cStringIO is Python 2 only

s = "001011010110000010010"
sio = StringIO(s)

with open('outfile', 'wb') as f:
    while True:
        # Grab the next 8 bits
        b = sio.read(8)

        # Bail if we hit EOF
        if not b:
            break

        # If we got fewer than 8 bits, pad with zeroes on the right
        if len(b) < 8:
            b = b + '0' * (8 - len(b))

        # Convert to int
        i = int(b, 2)

        # Convert to a single byte (chr() would give a str in Python 3)
        c = bytes([i])

        # Write
        f.write(c)

...for which xxd -b outfile shows...

0000000: 00101101 01100000 10010000                             -`.
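
Going the other way, the bytes can be expanded back into a bit string. This is a minimal sketch that assumes the reader knows how many bits were originally written (21 for the example string), since the file itself does not record where the padding starts. The helper name bytes_to_bits is hypothetical.

```python
def bytes_to_bits(data, bit_count=None):
    """Expand bytes into a string of '0'/'1', optionally trimmed to bit_count bits."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return bits if bit_count is None else bits[:bit_count]


# The three bytes shown in the xxd output above
packed = bytes([0b00101101, 0b01100000, 0b10010000])

print(bytes_to_bits(packed))      # 001011010110000010010000
print(bytes_to_bits(packed, 21))  # 001011010110000010010
```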
answered 2013-06-02T22:16:50.107
2

Brief example:

import struct

my_number = 1234
with open('myfile', 'wb') as file_handle:
    file_handle.write(struct.pack('i', my_number))
...
with open('myfile', 'rb') as file_handle:
    my_number_back = struct.unpack('i', file_handle.read())[0]
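
One thing to keep in mind: a bare 'i' uses the machine's native size and byte order, so a file written on one platform may not read back the same on another. A hedged sketch of pinning the layout with an explicit byte-order prefix (the variable names here are just for illustration):

```python
import struct

my_number = 1234  # 0x000004D2

little = struct.pack('<i', my_number)  # standard 4-byte int, little-endian
big = struct.pack('>i', my_number)     # standard 4-byte int, big-endian

print(little.hex(), big.hex())         # d2040000 000004d2
print(struct.unpack('<i', little)[0])  # 1234
```

Either way this costs four bytes per integer, so it does not help with the three-bits-per-value goal from the question.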
answered 2016-09-20T05:04:02.927
0

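Appending to an array.array 3 bits at a time will still produce 8 bits for each value. Appending 011, 010, and 110 to an array and writing it to disk will produce the following output: 00000011 00000010 00000110. Note all the padding zeros in there.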

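Instead, it seems like you want to "compact" the binary triplets into bytes to save space. Given the example string in your question, you can convert it to a list of integers (8 bits at a time) and then write it to a file directly. This packs all the bits together, using only 3 bits per value rather than 8.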

Python 3.4 example

original_string = '001011010110000010010'

# first split into 8-bit chunks
bit_strings = [original_string[i:i + 8] for i in range(0, len(original_string), 8)]

# then convert to integers
byte_list = [int(b, 2) for b in bit_strings]

with open('byte.dat', 'wb') as f:
    f.write(bytearray(byte_list))  # convert to bytearray before writing

Contents of byte.dat:

  • Hex: 2D 60 12
  • Binary (by 8): 00101101 01100000 00010010
  • Binary (by 3): 001 011 010 110 000 000 010 010
                                        ^^ ^ (note the extra bits)
    

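    Note that this method pads the last value so that it aligns to an 8-bit boundary, and the padding goes into the most significant bits (the left side of the last byte in the output above). So you need to be careful, and possibly add zeros to the end of your original string so that its length is a multiple of 8.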
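
One possible way to handle that padding explicitly, as a sketch rather than the only approach: right-pad the string yourself and keep the original bit count, so the padding can be stripped when reading back. The helper name pack_bits is hypothetical, and where the bit count is stored (a file header, a separate field, and so on) is left open.

```python
def pack_bits(bit_string):
    """Right-pad to a multiple of 8 bits; return (original bit count, packed bytes)."""
    bit_count = len(bit_string)
    padded = bit_string + '0' * (-bit_count % 8)  # zeros go at the end
    n_bytes = len(padded) // 8
    # A fixed byte length keeps any leading zero bits of the first byte
    packed = int(padded, 2).to_bytes(n_bytes, 'big') if n_bytes else b''
    return bit_count, packed


original_string = '001011010110000010010'
bit_count, packed = pack_bits(original_string)

print(bit_count)     # 21
print(packed.hex())  # 2d6090
```

With the padding on the right, the last byte becomes 0x90 (10010000) instead of 0x12, so the first 21 bits of the file are exactly the original string.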

answered 2016-03-30T22:25:30.687