python - 用随机二进制密钥解密二进制序列。我的脚本有什么问题？

Question

（以下更新）

从Philip Klein的线性代数文本Coding the Matrix中尝试一个问题。针对所有可能的密钥强制使用密文二进制序列时遇到问题。问题 1.5.1，第 57 页：

一个 11 符号的消息已被加密如下。每个符号由 0 到 26 之间的数字表示（A 映射到 0，B 到 25，空格到 26。）每个数字由 5 位二进制序列表示（0 映射到 00000，26 到 11010）。最后，生成的 55 位序列使用有缺陷的一次性密码器版本进行加密：密钥不是 55 位，而是相同的 5 位随机位序列的 11 个副本。密文为： '10101'、'00100'、'10101'、'01011'、'11001'、'00011'、'01011'、'10101'、'00100'、'11001'、'11010'

目标是找出明文。我遇到的问题是：一，我的解码器函数产生了几个高于 int 26 的 5 位二进制文件。这个函数应该针对每个可能的 5 位二进制序列（密钥）尝试密文二进制序列，直到 int 26，产生每一个可能的明文序列。二，我应该使用伽罗瓦域来确保每个二进制序列保持二进制吗？（1 + 1 = 0 而不是 2）有什么建议吗？我正在尝试通过使用 Klein 的有趣文本来学习线性代数（并提高我有限的 Python 能力）。这是相当困难的......谢谢！

import string

# The Cipher-text
Y = ['10101', '00100', '10101', '01011', '11001', '00011', '01011', '10101', '00100', '11001', '11010']


def trans(bit): # convert the bits into int
    x = ''.join(map(str, bit))
    return int(x, 2)


def decoder(cipher): # Try the ciphertext against each possible 5 bit sequence (up to 11010)
    pos_seq = ["".join("{0:05b}".format(x)) for x in range(27)]
    attempt = []
    for i in pos_seq:
        for j in cipher: # Add ciphertext binary to every possible binary 'key'.
            temp = [(int(i[0])+int(j[0])),(int(i[1])+int(j[1])),(int(i[2])+int(j[2])),
                    (int(i[3])+int(j[3])), (int(i[4])+int(j[4]))]
            attempt.append(temp)
    for i in range(len(attempt)): # Galois Field, 'two' becomes 'one'
        for k in attempt[i]:
            if k == 2:
                attempt[i][attempt[i].index(k)] = 0
    return attempt


new_list = decoder(Y)

decoded = [trans(i) for i in new_list] # translate each 5 digit sequence into int

alph = list(string.ascii_lowercase) # alphabet plus a space
alph.append(' ')

decoded2 = [alph[x] for x in decoded] # Translate int to letter or space

print(decoded2)

更新

感谢曹大方和jason，我将代码编辑如下，发现明文：eve is evil

解码器功能范围增加到 32
（我仍然需要围绕 GF(2) 和 XOR，包括大方使用x ^ y & mask）
使用解码器将返回的列表切成 11 项列表

chunks = [decoded[x:x+11] for x in range(0, len(decoded), 11)]

*因为密文由 11 个“符号”组成

将上面列表映射到大方使用的 lambda 函数：

def decode(encoded):
    y = []
    for i in encoded:
        if i < 27:
            y.append(i)
    return ''.join(map(lambda x: alph[x], y))

for i in chunks:
    decoded2 = decode(i)
    print(decoded2)

score 3 · Accepted Answer

使用 5 位有 32 个可能的值。知道有效值是 0-26，任何高于 26 的解密值都表明密钥不是用于加密的密钥。暴力破解时，只要遇到这样的值，就可以跳过键。事实上，我认为这就是你应该如何消除不正确的键。

同时，也没有说key不大于26。你应该尝试所有32个组合。

回覆。伽罗瓦域，计算机处理器自然是二进制的。您可以利用 XOR 等按位运算来加快代码速度。实际上，XOR就是GF(2) 中的加法运算。您可以使用 GF(2) 在 GF(2) 中添加 2 个向量x ^ y & mask。当不处理 GF(2) 时，您可以在将“包装”值添加到有效范围后使用模运算符x + y % n。

import string

# The Cipher-text
Y = ['10101', '00100', '10101', '01011', '11001', '00011', '01011', '10101', '00100', '11001', '11010']
# Parse to binary value
y = list(map(lambda v: int(v, 2), Y))

max_val = 32
alphabet_size = 26

# decrypt([0b11111, ...]) = [(key=0b1111, decrpyted=[0b11111, ...]), ...]
def decrypt(encrypted):
    possible_keys = range(max_val)
    attempt = []
    for key in possible_keys:
        decrypted = []
        for symbol in encrypted:
            # XOR is equivalent to Add in GF(2)
            # If not GF(2), add then wrap around using modulo
            v = symbol ^ key & (max_val - 1)
            print('v = %d' % v)
            if v > alphabet_size:
                break
            decrypted.append(v)
        if len(decrypted) < len(encrypted):
            print('bad key %d' % key)
            continue
        print('good key %d' % key)
        attempt.append((key, decrypted))
    return attempt

# alphabet plus a space
alph = list(string.ascii_lowercase)
alph.append(' ')
def decode(encoded):
    return ''.join(map(lambda x: alph[x], encoded))

results = decrypt(y)

for (key, encoded) in results:
    decoded = decode(encoded)
    print(decoded)

score 1 · Accepted Answer

替代方法：

# First establish the letter -> binary code:
digits = set(range(2))
base = 2
alphabet = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H',
'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S',
'T', 'U', 'V', 'W', 'X', 'Y', 'Z', ' ']
binary_code = {a*base**4 + b*base**3 + c*base**2 + d*base**1 + e*base**0:str(a)+str(b)+str(c)+str(d)+str(e) for a in digits
                          for b in digits
                          for c in digits
                          for d in digits
                          for e in digits if a*base**4 + b*base**3 + c*base**2 + d*base**1 + e*base**0 <= 26}
letters_to_binary = {k:v for v,k in zip(alphabet,binary_code.values())}

# Now because this one-time pad is flawed, we can make a list of all possible keys - there are only 32
# These are all the 5 bit binary numbers
keys = {str(a)+str(b)+str(c)+str(d)+str(e) for a in digits for b in digits
                        for c in digits
                        for d in digits
                        for e in digits}

# We can then apply each key to the cyphertext under 
# Galois Field 2 addition - GF(2), and see which of 
# the resulting messages is most sensical

cyphertext = ['10101', '00100', '10101', '01011', '11001', '00011', '01011', '10101', '00100', '11001', '11010']
results = {}
for key in keys:
    message = ''
    for text in cyphertext:
        decoded = ''
        for num, t in zip(key, text):
            # The following if else implements GF(2) 
            # addition
            if (int(num) == 1 and int(t) == 0) or (int(num) == 0 and int(t) == 1):
                decoded = decoded + '1'
            else:
                decoded = decoded + '0'
        if decoded not in letters_to_binary:
            continue
        else:
            message = message + letters_to_binary[decoded]
    results[key] = message

print(results)
# From printing the results, using a key of 10001 results in a message of 'EVE IS EVIL', no other messages make sense.
# Thus the key is 10001

score 1 · Accepted Answer

派对迟到了，但我认为这个可能是直截了当的：

message = [0b10101, 0b00100, 0b10101, 0b01011, 0b11001, 0b00011, 
0b01011, 0b10101, 0b00100, 0b11001, 0b11010]

keys = list(range(0b00000, 0b100000))

letters = ['a', 'b', 'c', 'd', 'e', 
'f', 'g', 'h', 'i', 'j', 'k', 'l',
'm', 'n', 'o', 'p', 'q', 'r', 's', 
't', 'u', 'v', 'w', 'x', 'y', 'z', ' ']

for k in keys:
    decrypted = ""
    possible = True
    for m in message:
        if(m^k in range(len(letters))):
            decrypted += letters[m^k]
        else:
            possible = False
            break
    if(possible):
        print(str(k) + ": " + decrypted)

score 0 · Accepted Answer

看看这个解决方案，正好在代码 17 处。我认为它简短而简单。夏娃是邪恶的！

问候！

message = [0b10101,0b00100,0b10101,
           0b01011,0b11001,0b00011,
           0b01011,0b10101,0b00100,
           0b11001,0b11010]
           
codes = range(0b00000,0b100000)
L = {i-65:chr(i) for i in range(65,91)}
L[26] = ' '

for code in codes:
    print(code, bin(code), ''.join([L[(l^code)%27] for l in message]))

python - 用随机二进制密钥解密二进制序列。我的脚本有什么问题？

4 回答 4

Related

Reference