python - HMAC-SHA256 with AES-256 in CBC mode

Question

I recently came across the following code sample for encrypting a file with AES-256 CBC, with a SHA-256 HMAC for authentication and validation:

aes_key, hmac_key = self.keys
# create a PKCS#7 pad to get us to `len(data) % 16 == 0`
pad_length = 16 - len(data) % 16
data = data + (pad_length * chr(pad_length))
# get IV
iv = os.urandom(16)
# create cipher
cipher = AES.new(aes_key, AES.MODE_CBC, iv)
data = iv + cipher.encrypt(data)
sig = hmac.new(hmac_key, data, hashlib.sha256).digest()
# return the encrypted data (iv, followed by encrypted data, followed by hmac sig):
return data + sig

Since, in my case, I am encrypting much more than a string, rather a fairly large file, I modified the code to do the following:

aes_key, hmac_key = self.keys
iv = os.urandom(16)
cipher = AES.new(aes_key, AES.MODE_CBC, iv)

with open('input.file', 'rb') as infile:
    with open('output.file', 'wb') as outfile:
        # write the iv to the file:
        outfile.write(iv)

        # start the loop
        end_of_line = True

        while True:
            input_chunk = infile.read(64 * 1024)

            if len(input_chunk) == 0:
                # we have reached the end of the input file and it matches `% 16 == 0`
                # so pad it with 16 bytes of PKCS#7 padding:
                end_of_line = True
                input_chunk += 16 * chr(16)
            elif len(input_chunk) % 16 > 0:
                # we have reached the end of the input file and it doesn't match `% 16 == 0`
                # pad it by the remainder of bytes in PKCS#7:
                end_of_line = True
                input_chunk_remainder = 16 - (len(input_chunk) & 16)
                input_chunk += input_chunk_remainder * chr(input_chunk_remainder)

            # write out encrypted data and an HMAC of the block
            outfile.write(cipher.encrypt(input_chunk) + hmac.new(hmac_key, data, 
                    hashlib.sha256).digest())

            if end_of_line:
                break

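In short, this reads the input file in 64 KB blocks and encrypts each block, generates an HMAC-SHA256 of the encrypted data, and appends that HMAC after each block. Decryption will read chunks of 64 KB + 32 bytes, calculate the HMAC of the first 64 KB, and compare it against the 32-byte HMAC-SHA256 value that occupies the end of the chunk.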

Is this the right way to use an HMAC? Does it guarantee that the data was not modified and that it was decrypted with the right key?

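FYI, the AES and HMAC keys are both derived from the same passphrase, which is generated by running the input text through SHA-512, then through bcrypt, then through SHA-512 again. The output of the final SHA-512 is then split into two chunks, one used for the AES key and the other for the HMAC.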

score 6 · Accepted Answer

Yes, there are 2 security problems.

But first, I assume that with this statement at the end:

# write out encrypted data and an HMAC of the block
outfile.write(cipher.encrypt(input_chunk) + hmac.new(hmac_key, data, hashlib.sha256).digest())

you actually meant:

# write out encrypted data and an HMAC of the block
data = cipher.encrypt(input_chunk)
outfile.write(data + hmac.new(hmac_key, data, hashlib.sha256).digest())

because data is not defined anywhere.

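The 1st security problem is that you are authenticating each piece independently of the others, but not the composition. In other words, an attacker can reshuffle, duplicate, or remove any of the chunks and the receiver will not notice.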

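A more secure approach is to have only one HMAC instance, pass all the encrypted data to it via the update method, and output a single digest at the very end.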
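A minimal sketch of that idea, using only the standard library. The names mac_stream and verify_stream are hypothetical, and the chunks are assumed to be the already-encrypted pieces (IV first) in the order they are written to the file:

```python
import hashlib
import hmac


def mac_stream(hmac_key, encrypted_chunks):
    """Return one HMAC-SHA256 tag covering every chunk, in order."""
    mac = hmac.new(hmac_key, digestmod=hashlib.sha256)
    for chunk in encrypted_chunks:
        mac.update(chunk)
    return mac.digest()


def verify_stream(hmac_key, encrypted_chunks, tag):
    """Recompute the tag and compare it in constant time."""
    expected = mac_stream(hmac_key, encrypted_chunks)
    return hmac.compare_digest(expected, tag)
```

Because the tag covers the whole byte stream, reordering, duplicating, or dropping a chunk changes the result. The receiver should not act on any decrypted data until verify_stream has returned True.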

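Alternatively, if you want the receiver to be able to detect tampering before receiving the whole file, you can output the intermediate MAC after each piece. In fact, a call to digest does not change the state of the HMAC; you can keep calling update afterwards.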
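A rough sketch of the intermediate-MAC variant. The generator name running_tags is hypothetical; each yielded tag covers every chunk seen so far, not just the current one:

```python
import hashlib
import hmac


def running_tags(hmac_key, encrypted_chunks):
    """Yield (chunk, tag) pairs; each tag covers all chunks so far."""
    mac = hmac.new(hmac_key, digestmod=hashlib.sha256)
    for chunk in encrypted_chunks:
        mac.update(chunk)
        # digest() does not finalize the object, so update() stays valid
        yield chunk, mac.digest()
```

A receiver can mirror this loop and stop at the first tag that does not match. Note that cutting the file off right after a valid chunk and tag would still verify, so a scheme like this probably also needs some end-of-stream signal, which is not shown here.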

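The 2nd security problem is that you don't use a salt for your key derivation (I say that because you don't send one). Apart from making password cracking easier, if you encrypt 2 or more files with the same password, the attacker will also be able to freely mix chunks taken from any of the encrypted files, because the HMAC key is the same. Solution: use a salt.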
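One way a per-file salt could look, as a sketch only. The choice of PBKDF2-HMAC-SHA512 from hashlib, the helper name derive_keys, the 16-byte salt, and the iteration count are all assumptions made for illustration, not something taken from the question:

```python
import hashlib
import os


def derive_keys(passphrase, salt=None, iterations=210_000):
    """Derive (aes_key, hmac_key, salt) from a passphrase with PBKDF2."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each file
    material = hashlib.pbkdf2_hmac(
        "sha512", passphrase.encode("utf-8"), salt, iterations, dklen=64
    )
    # first 32 bytes for AES-256, last 32 bytes for HMAC-SHA256
    return material[:32], material[32:], salt
```

The salt is not secret, so it can be written at the start of the output file next to the IV. The receiver reads it back and passes it as the salt argument to get the same two keys.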

One last minor thing: infile.read(64 * 1024) may return fewer than 64*1024 bytes, but that does not mean you have reached the end of the file.
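A small helper that works around short reads might look like the sketch below. The name read_full is hypothetical. This mostly matters for unbuffered streams, pipes, and sockets, where a read can legitimately come back short:

```python
import io


def read_full(stream, size):
    """Read until `size` bytes are collected or the stream is exhausted."""
    parts = []
    remaining = size
    while remaining > 0:
        piece = stream.read(remaining)
        if not piece:  # an empty result is the only reliable EOF signal
            break
        parts.append(piece)
        remaining -= len(piece)
    return b"".join(parts)


# usage: an in-memory stream stands in for the input file
chunk = read_full(io.BytesIO(b"x" * 40), 32)
```

With this helper, a result shorter than size really does mean the input has ended, so the padding branch only runs on the final chunk.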

score -2 · Accepted Answer

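I don't think there is a security problem with what you're doing with the HMACs (which doesn't mean that the security is problem-free), but I don't see what HMACing sub-elements of the ciphertext actually gets you. Unless you want to support partial recovery of the plaintext in the event of tampering, there isn't much reason to incur the overhead of HMACing 64 KB blocks rather than the full ciphertext.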

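From a key generation perspective, it might make more sense to use a key generated from the passphrase to encrypt two randomly generated keys, and then use those randomly generated keys to perform the HMAC and AES operations. I know that using the same key for both your block cipher and your HMAC is bad news, but I don't know whether using keys generated in the same manner is similarly bad.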

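At the very least, you should adjust your key derivation mechanism. bcrypt is a password hashing mechanism, not a key derivation function. You should use PBKDF2 for key derivation.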
