python - Converting a text file of binary digits to raw bytes (viewable as hex) in Python

Question

I have a text file containing a series of bits, written as ASCII characters:

cat myFile.txt
0101111011100011001...

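I want to write this to another file in binary mode so that I can read it in a hex editor. How can I do that? I have already tried converting it with the following code: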

f2=open(fileOut, 'wb')
    with open(fileIn) as f:
      while True:
            c = f.read(1)
            byte = byte+str(c)
            if not c:
                print "End of file"
                break
            if count % 8 is 0:
                count = 0 
                print hex(int(byte,2))
                try:
                    f2.write('\\x'+hex(int(byte,2))[2:]).zfill(2)
                except:
                     pass
                byte = ''
            count += 1

But this does not do what I intended. Do you have any hints?

score 2 · Accepted Answer

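Reading and writing one byte at a time is painfully slow. Simply by handling more data in each call to f.read and f.write, you can speed up your code by a factor of more than 40: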
```
|------------------+--------------------|
| using_loop_20480 | 8.34 msec per loop | 
| using_loop_8     | 354 msec per loop  |
|------------------+--------------------|
```
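using_loop is the code shown at the bottom of this post. using_loop_20480 is that code with chunksize = 1024*20, which means 20480 characters are read from the file at a time. using_loop_8 is the same code with chunksize = 8, i.e. one output byte per read.
Regarding count % 8 is 0: don't use is to compare numerical values; use == instead. is tests object identity, not equality. Here are some examples of why is might give you the wrong result (maybe not in the code you posted, but in general, is is not appropriate here):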
```
In [5]: 1L is 1
Out[5]: False

In [6]: 1L == 1
Out[6]: True

In [7]: 0.0 is 0
Out[7]: False

In [8]: 0.0 == 0
Out[8]: True
```
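The session above is from Python 2 (`1L` is a long literal, which Python 3 no longer has). A minimal sketch of the same point, assuming Python 3 and using made-up variable names:

```python
# Illustrative names only. The values are held in variables because recent
# Python 3 versions emit a warning when `is` is applied directly to a literal.
count = 0.0   # a float that happens to equal zero
zero = 0      # an int

same_value = (count == zero)    # compares values
same_object = (count is zero)   # compares object identity

print(same_value, same_object)  # prints: True False
```

A float and an int are always distinct objects, so the identity test fails even though the values are equal.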
Instead of
```
struct.pack('{n}B'.format(n = len(bytes)), *bytes)
```
you could use
```
bytearray(bytes)
```
Not only is it less typing, it is also slightly faster.
```
|------------------------------+--------------------|
|             using_loop_20480 | 8.34 msec per loop |
| using_loop_with_struct_20480 | 8.59 msec per loop |
|------------------------------+--------------------|
```
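To check that the two spellings really produce the same bytes, here is a small sketch, assuming Python 3; the sample list `values` is made up for illustration:

```python
import struct

values = [97, 98, 99]  # hypothetical sample data, each value in range(256)

# One 'B' (unsigned byte) format code per value, e.g. '3B' for three values.
packed = struct.pack('{n}B'.format(n=len(values)), *values)
as_bytearray = bytearray(values)

# Both hold the same three bytes, so either can be passed to a binary write().
print(packed == as_bytearray)  # prints: True
```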
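The bytearray is a good match for this job because it bridges the gap between treating data as a string and treating it as a sequence of numbers.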
```
In [16]: bytearray([97,98,99])
Out[16]: bytearray(b'abc')

In [17]: print(bytearray([97,98,99]))
abc
```
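As you can see above, bytearray(bytes) allows you to define the bytearray by passing a sequence of ints (each in range(256)), and allows you to write it out as though it were a string: g.write(bytearray(bytes)).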
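The range(256) restriction is enforced at construction time. A minimal sketch, assuming Python 3; `to_bytearray` is a hypothetical helper written only for this illustration:

```python
def to_bytearray(values):
    """Hypothetical helper: return a bytearray, or None if a value does not fit."""
    try:
        return bytearray(values)
    except ValueError:
        # bytearray only accepts integers in range(256)
        return None

print(to_bytearray([0, 127, 255]))  # all values fit in one byte
print(to_bytearray([256]))          # 256 needs nine bits, so this prints None
```

Groups of exactly eight bits always convert to values in range(256), so this only matters if a group is ever longer than eight characters.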

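def using_loop(output, chunksize):
    # `filename` is not defined here; it is assumed to be a module-level
    # variable holding the path of the input text file.
    with open(filename, 'r') as f, open(output, 'wb') as g:
        while True:
            chunk = f.read(chunksize)
            if not chunk:  # an empty string means end of file
                break
            # Convert each group of 8 '0'/'1' characters to an int in range(256).
            bytes = [int(chunk[i:i+8], 2)
                     for i in range(0, len(chunk), 8)]
            g.write(bytearray(bytes))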

Make sure chunksize is a multiple of 8.
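For readers on Python 3, the same idea can be sketched as follows. This is not the code that was timed above; `bits_file_to_binary` and the sample file names are hypothetical, and the input is assumed to contain only '0' and '1' characters apart from trailing whitespace at the very end of the file:

```python
import os
import tempfile

def bits_file_to_binary(src_path, dst_path, chunksize=8 * 1024):
    """Hypothetical helper: turn a text file of '0'/'1' characters into raw bytes."""
    if chunksize % 8:
        raise ValueError("chunksize must be a multiple of 8")
    with open(src_path, 'r') as src, open(dst_path, 'wb') as dst:
        while True:
            chunk = src.read(chunksize)
            if not chunk:          # empty string: end of file
                break
            chunk = chunk.rstrip() # assumption: whitespace only at end of file
            # A final group shorter than 8 characters is read as a small number.
            values = [int(chunk[i:i + 8], 2) for i in range(0, len(chunk), 8)]
            dst.write(bytearray(values))

# Usage with throwaway files: sixteen bits that spell the two bytes b'Hi'.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'bits.txt')
    dst = os.path.join(tmp, 'bits.bin')
    with open(src, 'w') as f:
        f.write('0100100001101001\n')
    bits_file_to_binary(src, dst, chunksize=8)
    with open(dst, 'rb') as f:
        result = f.read()

print(result)  # prints: b'Hi'
```

The explicit check on chunksize turns the multiple-of-8 requirement into an error instead of silently misaligned output.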

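This is the code I used to create the tables. Note that prettytable does something similar, and it may be advisable to use their code rather than my hack: table.py

This is the module I used to time the code: utils_timeit.py. (It uses table.py.)

And this is the code I used to time using_loop (and other variants): timeit_bytearray_vs_struct.py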

score 1 · Accepted Answer

Use struct:

import struct
...
f2.write(struct.pack('b', int(byte,2))) # signed 8 bit int

or

f2.write(struct.pack('B', int(byte,2))) # unsigned 8 bit int
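The choice between the two format codes matters for bit patterns with the top bit set. A small sketch, assuming Python 3, with variable names chosen only for illustration:

```python
import struct

value = int('11111111', 2)          # 255, the largest 8-bit pattern

unsigned = struct.pack('B', value)  # 'B' accepts 0..255

try:
    struct.pack('b', value)         # 'b' accepts only -128..127
    signed_ok = True
except struct.error:
    signed_ok = False

print(unsigned, signed_ok)          # prints: b'\xff' False
```

Since int(byte, 2) never yields a negative number, 'B' is the safer choice for this task.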
