python - XORing two hex strings in Python - which method is correct?

Question

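For several days I have been trying to find a way to correctly XOR two hex numbers stored in strings, and I have come across two methods. Both make sense to me, but they produce different results. I am not very proficient in Python (as in, I have 3 days of experience :D), so I cannot tell which method is correct.

Method 1: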

s1 = #hex number stored in a string 1
s2 = #hex number stored in a string 2

#Decoding the hex strings into ASCII symbols
s3 = s1.decode('hex')
s4 = s2.decode('hex')

#strxor - see the next code segment for the code of this function
xor1 = strxor(s3, s4)

#Encode the result back into ASCII
xor2 = xor1.encode('hex')

The strxor function:

#This was given in my assignment and I am not entirely sure what is going on in
#here. I've been told that it takes two ASCII strings as input, converts them to
#numbers, XORs the numbers and converts the result back to ASCII again.

def strxor(a, b):     
    if len(a) > len(b):
        return "".join([chr(ord(x) ^ ord(y)) for (x, y) in zip(a[:len(b)], b)])
    else:
        return "".join([chr(ord(x) ^ ord(y)) for (x, y) in zip(a, b[:len(a)])])

Method 2:

s1 = #ciphertext 1 - hex number in a string
s2 = #ciphertext 2 - hex number in a string

#convert the string to integers, xor them
#and convert back to hex
xor = hex(int(s1, 16) ^ int(s2, 16))

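As I said, to my limited brain the two solutions look equivalent, yet they produce completely different results. What is the problem? I have Python 2.7.3 and 3.3.2 on my system and tried both (except Method 1 on Python 3, because strings no longer have a decode method there).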

score 3 · Accepted Answer

your_string.encode('hex') replaces each character of your_string with its ASCII value written in hexadecimal.

For example, knowing that the letter 'A' is 0x41 in ASCII:

>>> 'AAAA'.encode('hex')
'41414141'

You can go the other way with decode:

>>> '41414141'.decode('hex')
'AAAA'
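The 'hex' codec on str shown above exists only in Python 2. As a rough sketch of the same round trip on Python 3 (assuming 3.5 or later, where bytes.hex() is available), the conversion goes through bytes explicitly:

```python
# Python 3 sketch: str has no 'hex' codec, so go through bytes explicitly.
text = 'AAAA'

# Text -> hex digits (what 'AAAA'.encode('hex') did in Python 2).
hex_digits = text.encode('ascii').hex()      # bytes.hex() needs Python 3.5+

# Hex digits -> text (what '41414141'.decode('hex') did in Python 2).
round_trip = bytes.fromhex(hex_digits).decode('ascii')

print(hex_digits)   # 41414141
print(round_trip)   # AAAA
```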

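But that is not really what you want. What you want is to convert 0x12 to 18 (16 + 2). The right way to do that is int(your_string, 16), which interprets your_string as a number written in base 16.

So the correct solution is the last one.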

xor = hex(int(s1, 16) ^ int(s2, 16))

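s1 and s2 are strings containing the hexadecimal representation of numbers. You convert them to int, telling Python that they are in base 16, then you perform the XOR, and finally you convert the result back to a string in hexadecimal representation (using hex).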
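One detail worth knowing is that hex() adds a '0x' prefix and drops leading zeros. A minimal sketch of a wrapper that keeps the width of the longer input is given here; the name xor_hex and the sample inputs are made up for illustration and are not part of the question:

```python
def xor_hex(s1, s2):
    """XOR two hex strings as integers; xor_hex is a hypothetical helper name."""
    value = int(s1, 16) ^ int(s2, 16)
    # Pad to the longer input so leading zeros are kept, unlike hex().
    width = max(len(s1), len(s2))
    return format(value, '0{}x'.format(width))

print(xor_hex('12', '0f'))      # 1d
print(xor_hex('00a0', 'a000'))  # a0a0
print(xor_hex('a0', 'a0'))      # 00
```

This runs unchanged on Python 2.7 and Python 3, since it relies only on int(), the ^ operator and format().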

score 1 · Accepted Answer

The immediate problem with the first method is that you are applying strxor to s1 and s2:

xor1 = strxor(s1, s2)

whereas you probably meant s3 and s4:

xor1 = strxor(s3, s4)

With that change, I get the same result from both methods (in a simple test case).
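For readers on Python 3, where str.decode('hex') is not available, a comparable check might look like the sketch below. The function name strxor_bytes and the sample values are assumptions made for this example; the inputs have equal length, which is the case where both methods are expected to agree.

```python
def strxor_bytes(a, b):
    """Byte-wise XOR of two bytes objects, truncated to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))

s1, s2 = 'a0b1c2', '0f0f0f'   # made-up sample values

# Method 1: decode the hex digits to raw bytes, XOR byte by byte, re-encode.
via_bytes = strxor_bytes(bytes.fromhex(s1), bytes.fromhex(s2)).hex()

# Method 2: XOR as integers, padded to the input width (6 hex digits).
via_int = format(int(s1, 16) ^ int(s2, 16), '06x')

print(via_bytes)  # afbecd
print(via_int)    # afbecd
```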

score 0 · Accepted Answer

...the two solutions look equivalent, yet they produce completely different results. What is the problem?

For me, the results are the same:

def strxor(a, b):
    len_ = min(len(a), len(b))
    return "".join([chr(ord(x) ^ ord(y)) for (x, y) in zip(a[:len_], b[:len_])])


def work(s1, s2):
    #strxor - see the next code segment for the code of this function
    xor1 = strxor(s1.decode('hex'), s2.decode('hex')).encode('hex')

    #convert the string to integers, xor them
    #and convert back to hex
    xor2 = hex(int(s1, 16) ^ int(s2, 16))[2:]

    print xor1
    print xor2

work('A0', '0A')
work('A0', 'A0')
work('00', 'AA')
work('00A0', 'A000')

This gives:

aa
aa
00
0
aa
aa
a0a0
a0a0
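The second pair of output lines (00 versus 0) already hints at where the two methods can diverge. The Python 3 sketch below, with hypothetical helper names, illustrates two such cases: leading zeros, and inputs of different length. Whether either applies to the asker's data is not known.

```python
def xor_bytewise(h1, h2):
    # zip() stops at the shorter input, so extra trailing bytes are dropped.
    a, b = bytes.fromhex(h1), bytes.fromhex(h2)
    return bytes(x ^ y for x, y in zip(a, b)).hex()

def xor_as_int(h1, h2):
    # hex() drops leading zeros; [2:] strips the '0x' prefix.
    return hex(int(h1, 16) ^ int(h2, 16))[2:]

# Same length, result zero: only the padding differs.
print(xor_bytewise('a0', 'a0'), xor_as_int('a0', 'a0'))      # 00 0

# Different lengths: bytes are aligned on the left, integers on the right.
print(xor_bytewise('a0ff', 'a0'), xor_as_int('a0ff', 'a0'))  # 00 a05f
```

On Python 2, hex() of a value too large for a plain int also appends an L suffix, which is another way the two outputs can look different.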

score 0 · Accepted Answer

It would help if you defined test cases for which you know the answer. For example:

0x0f0f0f ^ 0xf0f0f0  -> 0xffffff
0x101010 ^ 0x000000  -> 0x101010

and so on. Your "Method 2" is valid and correct, and it is legal in both Python 2 and 3 (but you should make sure your test cases confirm this).
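The known-answer cases above can be turned directly into assertions. In this sketch the wrapper name xor_hex_strings is made up for illustration, and the third case (mixed letter case) is an extra check that is not in the list above:

```python
def xor_hex_strings(s1, s2):
    # Method 2 from the question, wrapped in a function for testing.
    return hex(int(s1, 16) ^ int(s2, 16))

# Known-answer tests taken from the examples above.
assert xor_hex_strings('0f0f0f', 'f0f0f0') == '0xffffff'
assert xor_hex_strings('101010', '000000') == '0x101010'

# int(..., 16) accepts both upper- and lower-case digits.
assert xor_hex_strings('FF', 'ff') == '0x0'

print('all known-answer tests passed')
```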

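As your tests show, the strxor function is flawed. It takes two input strings, converts the corresponding characters of each to their ordinal values with ord, XORs them together, converts the result back to an ASCII character with chr, and joins the whole mess together again. A test case is needed to show that it might work for decimal digits but will barf on mixed-case hex: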

strxor('b', 'B')

should not yield ' ' (a space, since ord('b') ^ ord('B') is 0x20).
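To make the difference concrete, the short sketch below compares XORing the character codes of 'b' and 'B' with XORing the digit values they stand for. The variable names are illustrative only.

```python
# Character codes: 'b' is 0x62 and 'B' is 0x42, so they differ by 0x20.
char_xor = ord('b') ^ ord('B')

# Digit values: both spell the hex digit eleven, so the XOR is zero.
digit_xor = int('b', 16) ^ int('B', 16)

print(hex(char_xor), repr(chr(char_xor)))  # 0x20 ' '
print(digit_xor)                           # 0
```

This is why strxor only gives a meaningful answer when it is applied to the decoded bytes rather than to the hex digits themselves.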

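Method 2 is by far the cleanest; using str.decode when hex and int already exist could be considered codec abuse. The instructor may have been more interested in list comprehensions, but a better example could have been chosen.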
