python - Converting binary files to ASCII in Python

Question

I have a bunch of binary files that contain data in the following format:

i\xffhh\xffhh\xffhh\xffih\xffhh\xffhh\xffhh\xffhh\xffhi\xffii\xffjj\xffjj\xffjj\xffjk\xffkk\xffkk\xffkl\xffll\xffmm\xffmn\xffnn\xffon\xffno\xffop\xffop\xffpp\xffqq\xffrq\xffrs\xffst\xfftt\xfftt\xffuv\xffvu\xffuv\xffvv\xffvw\xffwx\xffwx\xffxy\xffyy\xffyz\xffz{\xffz{\xff||\xff}|\xff~}\xff}}\xff~~\xff~~\xff~\x7f\xff\x7f\x7f\xff\x7f\x7f\xff\x7f\x7f\xff\x80\x80\xff\x80\x81\xff\x81\x80\xff\x81\x81\xff\x81\x82\xff\x82\x82\xff\x82\x82\xff\x82\x83\xff\x83\x83\xff\x83\x83\xff\x83\x84\xff\x83\x84\xff\x84\x85\xff\x85\x85\xff\x86\x85\xff\x86\x87\xff\x87\x87\xff\x87\x87\xff\x88\x87\xff\x88\x89\xff\x88\x89\xff\x89\x8a\xff\x89\x8a\xff\x8a\x8b\xff\x8b\x8b\xff\x8b\x8c\xff\x8d\x8d\xff\x8d\x8d\xff\x8e\x8e\xff\x8e\x8f\xff\x8f\x8f

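These are supposed to be pressure sensor readings taken while a person walks, so I assume they are numbers, but I want to convert them to ASCII so that I can see what they are. How do I convert them? What format are they currently in?

Edit: a link to the file is provided here (link)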

score 4 · Accepted Answer

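You can't guess the format just by opening a binary file. You have to find out how the data is stored for this particular pressure sensor.

Of course, once you know the format, it is easy to read the file in binary mode and then extract all the meaningful data from it: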

with open(filename, "rb") as f:   # "rb" = read raw bytes, no text decoding
    chunk = f.read(num_bytes)     # num_bytes: how many bytes to read
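
As a rough sketch of what "extracting the data once you know the format" could look like in Python 3: the helper name `unpack_records` and the default layout of one unsigned byte per reading are assumptions made for illustration only. The real layout has to come from the sensor's documentation.

```python
import struct


def unpack_records(data, record_format="B"):
    """Unpack every fixed-size record found in data.

    record_format is a struct format string. "B" (one unsigned byte per
    reading) is an assumed placeholder, not the sensor's documented layout.
    """
    record = struct.Struct(record_format)
    usable = len(data) - len(data) % record.size  # drop a trailing partial record
    return list(record.iter_unpack(data[:usable]))


# Typical use (filename is whatever file you are inspecting):
#     with open(filename, "rb") as f:
#         readings = unpack_records(f.read())
print(unpack_records(b"xyz"))  # [(120,), (121,), (122,)]
```

Each record comes back as a tuple, so a format with several fields per record (for example `"<HB"`) works without changing the helper.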

score 4 · Accepted Answer

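I am shocked and stunned at all the waffle, such as "you have letters like hh which shouldn't be part of a hex number" and "they seem to start being meaningful from the first \x7f". Has nobody ever seen repr() output? repr() prints every byte that is printable ASCII as the character itself and every other byte as a \xNN escape, so h is simply the byte 104.

The following shows how the data could end up looking like that, ignoring the \xff, which seems to be just noise: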

>>> pressure = [120,121,122,123,124,125,126,127,128,129,130,131]
>>> import struct
>>> some_bytes = struct.pack("12B", *pressure)
>>> print repr(some_bytes)
'xyz{|}~\x7f\x80\x81\x82\x83'
>>>
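
For readers on Python 3, where `print` is a function and byte strings are a separate `bytes` type, a roughly equivalent demonstration might look like this (a sketch, not part of the original answer):

```python
# Build the same twelve readings and pack them one byte each.
pressure = list(range(120, 132))
some_bytes = bytes(pressure)

# Printable bytes show up as characters, the rest as \xNN escapes.
print(repr(some_bytes))  # b'xyz{|}~\x7f\x80\x81\x82\x83'

# Iterating over bytes yields the integers back.
print(list(some_bytes) == pressure)  # True
```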

So let's try to recover the readings from your file:

>>> guff = open('your_file.bin', 'rb').read()
>>> cleaned = guff.replace("\xff", "")
>>> cleaned
'ihhhhhhihhhhhhhhhhiiijjjjjjjkkkkkklllmmmnnnonnoopopppqqrqrsstttttuvvuuvvvvwwxwx
xyyyyzz{z{||}|~}}}~~~~~\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x80\x80\x80\x81\x81\x80\x81\
x81\x81\x82\x82\x82\x82\x82\x82\x83\x83\x83\x83\x83\x83\x84\x83\x84\x84\x85\x85\
x85\x86\x85\x86\x87\x87\x87\x87\x87\x88\x87\x88\x89\x88\x89\x89\x8a\x89\x8a\x8a\
x8b\x8b\x8b\x8b\x8c\x8d\x8d\x8d\x8d\x8e\x8e\x8e\x8f\x8f\x8f'
# Note that lines wrap at column 80 in a Windows "Command Prompt" window ...
>>> pressure = [ord(c) for c in cleaned]
>>> pressure
[105, 104, 104, 104, 104, 104, 104, 105, 104, 104, 104, 104, 104, 104, 104, 104,
 104, 104, 105, 105, 105, 106, 106, 106, 106, 106, 106, 106, 107, 107, 107, 107,
 107, 107, 108, 108, 108, 109, 109, 109, 110, 110, 110, 111, 110, 110, 111, 111,
 112, 111, 112, 112, 112, 113, 113, 114, 113, 114, 115, 115, 116, 116, 116, 116,
 116, 117, 118, 118, 117, 117, 118, 118, 118, 118, 119, 119, 120, 119, 120, 120,
 121, 121, 121, 121, 122, 122, 123, 122, 123, 124, 124, 125, 124, 126, 125, 125,
 125, 126, 126, 126, 126, 126, 127, 127, 127, 127, 127, 127, 127, 128, 128, 128,
 129, 129, 128, 129, 129, 129, 130, 130, 130, 130, 130, 130, 131, 131, 131, 131,
 131, 131, 132, 131, 132, 132, 133, 133, 133, 134, 133, 134, 135, 135, 135, 135,
 135, 136, 135, 136, 137, 136, 137, 137, 138, 137, 138, 138, 139, 139, 139, 139,
 140, 141, 141, 141, 141, 142, 142, 142, 143, 143, 143]
>>>
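
The session above is Python 2. In Python 3 the same recovery could be sketched as follows; `strip_marker` is a hypothetical helper name, and treating 0xFF as discardable noise is still only this answer's guess. Iterating over `bytes` already yields integers, so no `ord()` call is needed.

```python
def strip_marker(raw, marker=0xFF):
    """Return the byte values in raw, skipping every byte equal to marker."""
    return [value for value in raw if value != marker]


# First few bytes of the dump shown in the question.
sample = b"i\xffhh\xffhh\xffhh\xffih"
readings = strip_marker(sample)
print(readings)  # [105, 104, 104, 104, 104, 104, 104, 105, 104]

# For a real file:
#     with open("your_file.bin", "rb") as f:
#         readings = strip_marker(f.read())
```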

You will still need to read the device's documentation to find out what scale factor to multiply these 0-254 values by.

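You will notice that the derived numbers change by +1, 0, or -1 at almost every step (the one exception in this sample is a jump of 2, from 124 to 126). That fits well with the hypothesis that each reading is a single byte, rather than two or more bytes per reading.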
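
One way to check that observation is to compute the differences between consecutive values. The sketch below uses a hypothetical helper `step_sizes` and, for contrast, some made-up 16-bit readings packed little-endian: if the file held multi-byte values, reading it one byte at a time would produce large alternating jumps instead of a smooth sequence.

```python
import struct


def step_sizes(readings):
    """Return the difference between each value and the one before it."""
    return [b - a for a, b in zip(readings, readings[1:])]


# The first eight values recovered from the file: small, smooth steps.
one_byte = [105, 104, 104, 104, 104, 104, 104, 105]
steps = step_sizes(one_byte)
print(steps)  # [-1, 0, 0, 0, 0, 0, 1]

# Invented 16-bit readings, misread one byte at a time: large jumps.
two_byte = list(struct.pack("<4H", 300, 301, 302, 303))
print(two_byte)              # [44, 1, 45, 1, 46, 1, 47, 1]
print(step_sizes(two_byte))  # [-43, 44, -44, 45, -45, 46, -46]
```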

Another thought: perhaps \xff is a start or end marker, and two values are reported per cycle, either (start, stop) or (sensor-A, sensor-B).
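
If the marker interpretation is right, the data can be split into frames on \xff instead of discarding it. This is a minimal Python 3 sketch under that assumption; `split_frames` is a hypothetical name, and what the two values in each frame actually mean is not known from the file alone.

```python
def split_frames(raw, marker=b"\xff"):
    """Split raw on marker and return each non-empty chunk as a tuple of ints."""
    return [tuple(chunk) for chunk in raw.split(marker) if chunk]


sample = b"i\xffhh\xffhh\xffih"
frames = split_frames(sample)
print(frames)  # [(105,), (104, 104), (104, 104), (105, 104)]
```

The first frame here holds a single value because the dump in the question begins with one byte before the first \xff, which may mean it starts partway through a frame.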

score 0 · Accepted Answer

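The first part looks strange. Normally, things like \x8e are just codes for hexadecimal bytes, but in the first part you have letters like hh, which shouldn't be part of a hexadecimal number.

For the second part, though, you can do the following: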

hex_list = r"\x7f\xff\x7f\x7f\xff\x7f\x7f\xff\x7f\x7f\xff\x80\x80\xff\x80\x81\xff\x81\x80\xff\x81\x81\xff\x81\x82\xff\x82\x82\xff\x82\x82\xff\x82\x83\xff\x83\x83\xff\x83\x83\xff\x83\x84\xff\x83\x84\xff\x84\x85\xff\x85\x85\xff\x86\x85\xff\x86\x87\xff\x87\x87\xff\x87\x87\xff\x88\x87\xff\x88\x89\xff\x88\x89\xff\x89\x8a\xff\x89\x8a\xff\x8a\x8b\xff\x8b\x8b\xff\x8b\x8c\xff\x8d\x8d\xff\x8d\x8d\xff\x8e\x8e\xff\x8e\x8f\xff\x8f\x8f"
# Turn each literal "\x7f" into "0x7f", then parse every token as a base-16 integer.
int_list = [int(token, 16) for token in hex_list.replace('\\', ';0').split(';') if token != '']

Note that you always get a number between 127 and 143, apart from 255 (\xff).
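
If all you have is the escaped text rather than the file itself, a Python 3 alternative is to let the `unicode_escape` codec undo the escaping. This is a sketch with a hypothetical helper name, `escaped_text_to_bytes`; it assumes the text contains only ASCII characters and \xNN-style escapes. Unlike the split-on-backslash approach, it also copes with plain letters such as hh.

```python
def escaped_text_to_bytes(text):
    """Convert text like r"i\xffhh" back into the bytes it represents."""
    # unicode_escape turns "\xNN" into the code point NN; latin-1 then maps
    # code points 0-255 one-to-one onto byte values.
    return text.encode("ascii").decode("unicode_escape").encode("latin-1")


sample = r"\x7f\xff\x7f\x80"
values = [b for b in escaped_text_to_bytes(sample) if b != 0xFF]
print(values)  # [127, 127, 128]
```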

