c++ - Duplicating Windows Cryptographic Service Provider results in Python with PyCrypto

Question

Edits and updates

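March 24, 2013:
After converting to UTF-16 and stopping before any 'e' or 'm' byte, my Python hash now matches the hash from C++. The decrypted results still do not match. My SHA1 hash is 20 bytes = 160 bits, and RC4 keys can range from 40 to 2048 bits, so WinCrypt may be applying some default salting or key length that I need to mimic. I will check CryptGetKeyParam with KP_LENGTH or KP_SALT.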

March 24, 2013:
CryptGetKeyParam with KP_LENGTH tells me my key length is 128 bits. I am giving it a 160-bit hash. So maybe it just discards the last 32 bits, i.e. 4 bytes. Testing now.

March 24, 2013: Yes, that was it. If I discard the last 4 bytes of my SHA1 hash in Python, I get the same decryption result.
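
The truncation described in this update can be sketched with the standard library alone. The helper name `derive_rc4_key` and the sample input are made up for illustration; the only behavior taken from the question is "keep the first 128 bits of the 160-bit SHA1 digest".

```python
import hashlib


def derive_rc4_key(raw_data: bytes, key_bits: int = 128) -> bytes:
    """Hypothetical helper: SHA-1 the input and keep the first key_bits bits."""
    digest = hashlib.sha1(raw_data).digest()  # SHA-1 digests are always 20 bytes
    return digest[: key_bits // 8]            # 128 bits -> first 16 bytes


# Sample input only: "Monk" laid out as 2-byte wide characters.
key = derive_rc4_key(b"M\x00o\x00n\x00k\x00")
print(len(key) * 8)  # 128
```

Passing `key_bits=160` returns the full digest, which is handy for comparing against the hash dumped from the C++ side.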

Quick info:

I have a C++ program that decrypts a data block. It uses the Windows Cryptographic Service Provider, so it only works on Windows. I would like it to work on other platforms.

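Method overview:

In the Windows Crypto API, an ASCII-encoded byte password is converted to a wide-character representation and then hashed with SHA1 to produce the key for an RC4 stream cipher.

In Python with PyCrypto, an ASCII-encoded byte string is decoded to a Python string. It is truncated at the bytes that were empirically observed to make mbstowcs stop converting in C++. The truncated string is then encoded as UTF-16, which effectively pads it with a 0x00 byte after each character. This truncated, padded byte string is passed to SHA1, and the first 128 bits of the digest are passed to the PyCrypto RC4 object.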

Problem [SOLVED]
I can't seem to get the same results using Python 3.x w/ PyCrypto.

C++ code skeleton:

HCRYPTPROV hProv      = 0x00;
HCRYPTHASH hHash      = 0x00;
HCRYPTKEY  hKey       = 0x00;
wchar_t    sBuf[256]  = {0};

CryptAcquireContextW(&hProv, L"FileContainer", L"Microsoft Enhanced RSA and AES Cryptographic Provider", 0x18u, 0);

CryptCreateHash(hProv, 0x8004u, 0, 0, &hHash);
//0x8004u is SHA1 flag

int len = mbstowcs(sBuf, iRec->desc, sizeof(sBuf));
//iRec is my "Record" class
//iRec->desc is 33 bytes within header of my encrypted file
//this will be used to create the hash key. (So this is the password)

CryptHashData(hHash, (const BYTE*)sBuf, len, 0);

CryptDeriveKey(hProv, 0x6801, hHash, 0, &hKey);

DWORD dataLen = iRec->compLen;  
//iRec->compLen is the length of encrypted datablock
//it's also compressed that's why it's called compLen

CryptDecrypt(hKey, 0, 0, 0, (BYTE*)iRec->decrypt, &dataLen);
// iRec is my record that i'm decrypting
// iRec->decrypt is where I store the decrypted data
//&dataLen is how long the encrypted data block is.
//I get this from file header info

Python code skeleton:

from Crypto.Cipher import ARC4
from Crypto.Hash import SHA

#this is the Decipher method from my record class
def Decipher(self):

    #get string representation of 33byte password
    key_string= self.desc.decode('ASCII')

    #so far, these characters fail, possibly others but
    #for now I will make it a list
    stop_chars = ['e','m']

    #slice off anything beyond where mbstowcs will stop
    for char in stop_chars:
        wc_stop = key_string.find(char)
        if wc_stop != -1:
            #slice operation
            key_string = key_string[:wc_stop]

    #make "wide character"
    #this is equivalent to padding bytes with 0x00

    #Slice off the two byte "Byte Order Mark" 0xff 0xfe 
    wc_byte_string = key_string.encode('utf-16')[2:]

    #slice off the trailing 0x00
    wc_byte_string = wc_byte_string[:len(wc_byte_string)-1] 

    #hash the "wchar" byte string
    #this is the equivalent to sBuf in c++ code above
    #as determined by writing sBuf to file in tests
    my_key = SHA.new(wc_byte_string).digest()

    #create a PyCrypto cipher object
    RC4_Cipher = ARC4.new(my_key[:16])

    #store the decrypted data..these results NOW MATCH
    self.decrypt = RC4_Cipher.decrypt(self.datablock)
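
For checking results on a machine without PyCrypto, RC4 is small enough to sketch in pure Python. This is not the asker's code and not PyCrypto's implementation; the function name `rc4` is an assumption for illustration. It implements the textbook key-scheduling and keystream steps and can stand in for `ARC4.new(key).decrypt(data)` when comparing outputs.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4; encryption and decryption are the same operation."""
    # Key-scheduling algorithm: permute 0..255 under control of the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    # Pseudo-random generation: XOR each input byte with one keystream byte.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


# Widely published test vector: key "Key", plaintext "Plaintext".
print(rc4(b"Key", b"Plaintext").hex())  # bbf316e8d940af0ad3
```

Feeding it the 16-byte truncated SHA1 digest as `key` and the encrypted data block as `data` should give the same bytes as the PyCrypto call above, since both are plain RC4.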

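Suspected [Edit: confirmed] causes
1. The mbstowcs conversion of the password made the "raw data" fed to the SHA1 hash differ between Python and C++. mbstowcs stops converting at the 0x65 and 0x6D bytes, so the raw data ends up being the wide-character encoding of only part of the original 33-byte password.

2. RC4 keys can have variable length. In the Enhanced Win Crypt service provider the default length is 128 bits. When no key length is specified, the key is the first 128 bits of the 160-bit SHA1 digest of the "raw data".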

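How I investigated

Edit: Based on my own experiments and @RolandSmith's suggestions, I now know that one of my problems was that mbstowcs behaved in a way I did not expect. It appears to stop writing to sBuf at 'e' (0x65) and 'm' (0x6d), and possibly at other bytes. So the password "Monkey" in my desc field (ASCII-encoded bytes) looks like "M o n k" in sBuf, because mbstowcs stops at the 'e' and, given the 2-byte wchar_t typedef on my system, puts a 0x00 between the bytes. I found this out by writing the conversion result to a text file.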

BYTE pbHash[256];  //buffer we will store the hash digest in 
DWORD dwHashLen;  //store the length of the hash
DWORD dwCount;
dwCount = sizeof(DWORD);  //how big is a dword on this system?


//see above: "len" is the return value from mbstowcs, which tells how
//many multibyte characters were converted from the original
//iRec->desc and placed into sBuf.  In some cases it's 3, 7, 9
//and it always seems to stop on "e" or "m"

fstream outFile4("C:/desc_mbstowcs.txt", ios::out | ios::trunc | ios::binary);
outFile4.write((const CHAR*)sBuf, int(len));
outFile4.close();

//now get the hash size from CryptGetHashParam
//and get the actual hash from the hash object hHash,
//then write it to a file.
if(CryptGetHashParam(hHash, HP_HASHSIZE, (BYTE *)&dwHashLen, &dwCount, 0)) {
  if(CryptGetHashParam(hHash, 0x0002, pbHash, &dwHashLen,0)){

    fstream outFile3("C:/test_hash.txt", ios::out | ios::trunc | ios::binary);
    outFile3.write((const CHAR*)pbHash, int(dwHashLen));
    outFile3.close();
  }
}
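
The wide-character layout described in the investigation can be reproduced in Python to compare against the dumped file. This is a minimal sketch that assumes a 2-byte little-endian wchar_t, as on the asker's Windows system; `hex_dump` is a hypothetical helper, and "Monk" is just the truncated example from the text.

```python
def hex_dump(data: bytes) -> str:
    """Hypothetical helper: space-separated hex bytes, for eyeballing buffers."""
    return " ".join(f"{b:02x}" for b in data)


text = "Monk"  # "Monkey" cut off before the 'e', as observed in sBuf

# 'utf-16-le' writes each ASCII character followed by 0x00 and adds no
# byte order mark, so nothing has to be sliced off the front.
wide = text.encode("utf-16-le")
print(hex_dump(wide))  # 4d 00 6f 00 6e 00 6b 00

# Plain 'utf-16' prepends a 2-byte byte order mark in native byte order,
# which is why the skeleton above drops the first two bytes.
with_bom = text.encode("utf-16")
print(len(with_bom) - len(wide))  # 2
```

Comparing this output with a hex view of C:/desc_mbstowcs.txt shows directly whether the Python side is hashing the same bytes as the C++ side.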

References:
Wide characters cause problems depending on how the environment defines them
Difference in Windows cryptography services between VC++ 6.0 and VS 2008

Converting a utf-8 string to a utf-16 string
Python - converting wide-character strings from a binary file to Python unicode strings

PyCrypto RC4 example
https://www.dlitz.net/software/pycrypto/api/current/Crypto.Cipher.ARC4-module.html

Hashing a string with Sha256

http://msdn.microsoft.com/en-us/library/windows/desktop/aa379916(v=vs.85).aspx

http://msdn.microsoft.com/en-us/library/windows/desktop/aa375599(v=vs.85).aspx

score 1 · Accepted Answer

You can test the size of wchar_t with a small test program (in C):

#include <stdio.h> /* for printf */
#include <stddef.h> /* for wchar_t */

int main(int argc, char *argv[]) {
    /* sizeof yields a size_t; cast it so it matches the %lu conversion */
    printf("The size of wchar_t is %lu bytes.\n", (unsigned long)sizeof(wchar_t));
    return 0;
}
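
The same check can be made from the Python side, which matters when comparing buffers across platforms. A minimal sketch using ctypes; it reports the wchar_t size of the C runtime that Python itself was built against, which is an assumption about what you want to compare with.

```python
import ctypes

# Size in bytes of the platform's wchar_t as seen by ctypes:
# typically 2 on Windows and 4 on Linux and macOS.
wchar_size = ctypes.sizeof(ctypes.c_wchar)
print(f"The size of wchar_t is {wchar_size} bytes.")
```

If this prints 4 while the Windows program uses 2-byte wide characters, a buffer produced by the local mbstowcs will not be byte-identical to sBuf even when the characters match.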

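If you can run the C++ program from a terminal, you could also use printf() calls in your C++ code to write e.g. iRec->desc, sBuf and the resulting hash to the screen. Otherwise use fprintf() to dump them to a file.

To mimic the behavior of the C++ program more closely, you could even use ctypes to call mbstowcs() from your Python code.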

Edit: You wrote:

mbstowcs is definitely one problem. It seems to be transferring an unpredictable (to me) number of bytes into my buffer to be hashed.

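Keep in mind that mbstowcs returns the number of wide characters converted, not a byte count. In other words, a 33-byte buffer in a multibyte encoding can contain anything from 5 characters (UTF-8 6-byte sequences) up to 33 characters, depending on the encoding used.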
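
The point about byte counts versus character counts can be seen in a few lines of Python. The two sample buffers are made up for illustration; both are exactly 33 bytes long.

```python
# 33 bytes of plain ASCII: one byte per character.
ascii_buf = b"A" * 33

# 33 bytes made of the euro sign, which takes 3 bytes in UTF-8.
euro_buf = "\u20ac".encode("utf-8") * 11

print(len(ascii_buf), len(ascii_buf.decode("utf-8")))  # 33 33
print(len(euro_buf), len(euro_buf.decode("utf-8")))    # 33 11
```

The character count is the kind of number mbstowcs returns, so it only equals the input byte count when every character is a single byte.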

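Edit2: You are passing 0 as the dwFlags argument to CryptDeriveKey. According to its documentation, the upper 16 bits should contain the key length. You should check the return value of CryptDeriveKey to see whether the call succeeded.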
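
To make the "upper 16 bits" layout concrete, here is a small sketch of how such a dwFlags value is packed. The helper name `derive_key_flags` is hypothetical and exists only to show the bit arithmetic; it does not call the Windows API.

```python
def derive_key_flags(key_bits: int, low_flags: int = 0) -> int:
    """Hypothetical helper: key length in the upper 16 bits, flags in the lower 16."""
    return (key_bits << 16) | (low_flags & 0xFFFF)


print(hex(derive_key_flags(128)))  # 0x800000
print(hex(derive_key_flags(40)))   # 0x280000
```

A value of 0, as in the skeleton, leaves the key length field empty, which is consistent with the provider falling back to its default length.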

Edit3: You could test mbstowcs in Python (I'm using IPython here):

In [1]: from ctypes import *

In [2]: libc = CDLL('libc.so.7')

In [3]: monkey = c_char_p(u'Monkey')

In [4]: test = c_char_p(u'This is a test')

In [5]: wo = create_unicode_buffer(256)

In [6]: nref = c_size_t(250)

In [7]: libc.mbstowcs(wo, monkey, nref)
Out[7]: 6

In [8]: print wo.value
Monkey

In [9]: libc.mbstowcs(wo, test, nref)
Out[9]: 14

In [10]: print wo.value
This is a test

Note that on Windows you should probably use libc = cdll.msvcrt instead of libc = CDLL('libc.so.7').
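
The session above is Python 2. A Python 3 version might look like the sketch below, since the question targets Python 3.x. The wrapper name `to_wide` is hypothetical, and locating the C library via `ctypes.util.find_library` is an assumption that may need adjusting on your system.

```python
import ctypes
import ctypes.util
import sys

# Pick the C runtime: msvcrt on Windows, the system libc elsewhere.
if sys.platform == "win32":
    libc = ctypes.cdll.msvcrt
else:
    libc = ctypes.CDLL(ctypes.util.find_library("c"))

# size_t mbstowcs(wchar_t *dest, const char *src, size_t n);
libc.mbstowcs.argtypes = [ctypes.c_wchar_p, ctypes.c_char_p, ctypes.c_size_t]
libc.mbstowcs.restype = ctypes.c_size_t


def to_wide(raw: bytes, capacity: int = 256):
    """Hypothetical wrapper: return (converted count, resulting wide string)."""
    buf = ctypes.create_unicode_buffer(capacity)
    # Leave room for the terminating null by converting at most capacity - 1.
    count = libc.mbstowcs(buf, raw, capacity - 1)
    return count, buf.value


print(to_wide(b"Monkey"))  # (6, 'Monkey')
```

In Python 3 the input must be a bytes object, and the count returned is the number of wide characters, not bytes.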
