415

I get this error:

Traceback (most recent call last):
  File "python_md5_cracker.py", line 27, in <module>
  m.update(line)
TypeError: Unicode-objects must be encoded before hashing

when I try to execute this code in Python 3.2.2:

import hashlib, sys
m = hashlib.md5()
hash = ""
hash_file = input("What is the file name in which the hash resides?  ")
wordlist = input("What is your wordlist?  (Enter the file name)  ")
try:
  hashdocument = open(hash_file, "r")
except IOError:
  print("Invalid file.")
  input()
  sys.exit()
else:
  hash = hashdocument.readline()
  hash = hash.replace("\n", "")

try:
  wordlistfile = open(wordlist, "r")
except IOError:
  print("Invalid file.")
  input()
  sys.exit()
else:
  pass
for line in wordlistfile:
  # Flush the buffer (this caused a massive problem when placed 
  # at the beginning of the script, because the buffer kept getting
  # overwritten, thus comparing incorrect hashes)
  m = hashlib.md5()
  line = line.replace("\n", "")
  m.update(line)
  word_hash = m.hexdigest()
  if word_hash == hash:
    print("Collision! The word corresponding to the given hash is", line)
    input()
    sys.exit()

print("The hash given does not correspond to any supplied word in the wordlist.")
input()
sys.exit()
4

10 Answers

408

It is probably looking for a character encoding for wordlistfile.

wordlistfile = open(wordlist,"r",encoding='utf-8')

Or, if you are working line by line:

line.encode('utf-8')

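Edit

Per the comment below and this answer:

My answer above assumes that the desired output is a str from the wordlist file. If you are comfortable working in bytes, then you are better off using open(wordlist, "rb"). But it is important to remember that your hash file should NOT be opened with rb if you are comparing it to the output of hexdigest. hashlib.md5(value).hexdigest() returns a str, and that cannot be compared directly with a bytes object: 'abc' != b'abc'. (There is a lot more to this topic, but I don't have the time at the moment.)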

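It should also be noted that this line:

line.replace("\n", "")

should be

line.strip()

This works for both bytes and str. However, if you decide to simply convert to bytes, you can change the line to: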

line.replace(b"\n", b"")
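To make the str versus bytes point concrete, here is a minimal sketch. It uses in-memory values in place of real files, and the names `text_line` and `binary_line` are only illustrative: they stand for a line read in text mode and the same line read in binary mode.

```python
import hashlib

# Illustrative stand-ins for one wordlist line read in text mode ("r")
# and the same line read in binary mode ("rb").
text_line = "password\n"
binary_line = b"password\n"

# Text mode: strip the newline, then encode before hashing.
digest_from_str = hashlib.md5(text_line.strip().encode("utf-8")).hexdigest()

# Binary mode: the line is already bytes, so it can be hashed directly.
digest_from_bytes = hashlib.md5(binary_line.strip()).hexdigest()

print(digest_from_str == digest_from_bytes)                # True
print(type(digest_from_str).__name__)                      # str
print(digest_from_str == digest_from_str.encode("ascii"))  # False: str never equals bytes
```

Either route gives the same digest. What matters is that the hash read from the hash file stays a str, because hexdigest() returns a str.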
answered 2011-09-28T15:10:20.107
169

You have to specify an encoding such as utf-8. Try this simple approach.

This example generates a random number and hashes it with the SHA256 algorithm:

>>> import hashlib, random
>>> hashlib.sha256(str(random.getrandbits(256)).encode('utf-8')).hexdigest()
'cd183a211ed2434eac4f31b317c573c50e6c24e3a28b82ddcb0bf8bedf387a9f'
answered 2014-03-19T12:03:59.453
41
import hashlib
string_to_hash = '123'
hash_object = hashlib.sha256(str(string_to_hash).encode('utf-8'))
print('Hash', hash_object.hexdigest())
answered 2018-12-16T14:15:18.730
20

The error already says what you have to do. MD5 operates on bytes, so you have to encode the Unicode string into bytes, e.g. with line.encode('utf-8').
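As a small sketch of that point, the snippet below reproduces the failure with a plain str and then applies the fix. The variable names are illustrative, and the exact wording of the exception message can differ between Python versions, so the code only checks the exception type.

```python
import hashlib

line = "hello"  # a str, as produced by reading a file in text mode

# Passing a str to update() fails, because hash objects accept only bytes-like data.
try:
    hashlib.md5().update(line)
    raised_type_error = False
except TypeError as exc:
    raised_type_error = True
    print("TypeError:", exc)

# Encoding the str first gives the hash object the bytes it expects.
m = hashlib.md5()
m.update(line.encode("utf-8"))
fixed_digest = m.hexdigest()
print(fixed_digest)
```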

answered 2011-09-28T15:09:17.083
19

To store a password (PY3):

import hashlib, os
password_salt = os.urandom(32).hex()
password = '12345'

hash = hashlib.sha512()
hash.update(('%s%s' % (password_salt, password)).encode('utf-8'))
password_hash = hash.hexdigest()
answered 2017-09-11T09:09:18.553
15

Encoding this line fixed it for me.

m.update(line.encode('utf-8'))
answered 2019-01-29T00:38:20.800
14

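Please take a look at that answer first.

Now, the error message is clear: you can only use bytes, not Python strings (what used to be unicode in Python < 3), so you have to encode the strings with your preferred encoding: utf-32, utf-16, utf-8, or even one of the restricted 8-bit encodings (what some might call code pages).

The bytes in your wordlist file are automatically decoded to Unicode by Python 3 as you read from the file. I suggest you do: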

m.update(line.encode(wordlistfile.encoding))

so that the data pushed to the md5 algorithm is encoded exactly as it is in the underlying file.
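The sketch below illustrates why the choice of encoding matters. It assumes a throwaway wordlist written as latin-1 into a temporary directory (the path, file name and sample word are all made up for the illustration) and compares hashing with the file's own encoding against hashing with utf-8.

```python
import hashlib
import os
import tempfile

# Hypothetical one-word wordlist, written as latin-1 for the illustration.
path = os.path.join(tempfile.mkdtemp(), "words.txt")
with open(path, "w", encoding="latin-1", newline="\n") as f:
    f.write("caf\u00e9\n")

# Text mode: Python decodes the bytes; re-encode with the file's own encoding.
with open(path, "r", encoding="latin-1") as f:
    line = f.readline().rstrip("\n")
    same_as_file = hashlib.md5(line.encode(f.encoding)).hexdigest()
    utf8_variant = hashlib.md5(line.encode("utf-8")).hexdigest()

# Binary mode: hash the bytes exactly as they are stored on disk.
with open(path, "rb") as f:
    raw_digest = hashlib.md5(f.readline().rstrip(b"\n")).hexdigest()

print(same_as_file == raw_digest)   # True: same bytes as in the file
print(utf8_variant == raw_digest)   # False: the non-ASCII character encodes differently
```

For pure ASCII words the two encodings give identical bytes, so the difference only shows up once the wordlist contains non-ASCII characters.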

answered 2011-10-15T14:14:05.830
11

You can open the file in binary mode:

import hashlib

with open(hash_file) as file:
    control_hash = file.readline().rstrip("\n")

wordlistfile = open(wordlist, "rb")
# ...
for line in wordlistfile:
    if hashlib.md5(line.rstrip(b'\n\r')).hexdigest() == control_hash:
        # collision: this word hashes to the control hash
        break
answered 2014-03-25T19:36:49.680
5

If it is a single-line string literal, prefix it with b or B. For example:

variable = b"This is a variable"

or

variable2 = B"This is also a variable"
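As a quick check of what the prefix does, this sketch compares a bytes literal with the same text encoded explicitly. The variable names are illustrative. Note that bytes literals may only contain ASCII characters, so this shortcut does not cover non-ASCII text or strings read from a file at runtime.

```python
import hashlib

literal = b"This is a variable"                  # bytes literal, usable directly
encoded = "This is a variable".encode("ascii")   # the same bytes via an explicit encode

print(literal == encoded)  # True

# Both values can be hashed without a TypeError and give the same digest.
same_digest = hashlib.md5(literal).hexdigest() == hashlib.md5(encoded).hexdigest()
print(same_digest)  # True
```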
answered 2020-04-05T07:36:42.930
-4

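This program is a bug-free and enhanced version of the MD5 cracker above. It reads a file containing a list of hashed passwords and checks each one against the hashed words from an English dictionary word list. I hope it is helpful.

I downloaded the English dictionary from the following link: https://github.com/dwyl/english-words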

# md5cracker.py
# English Dictionary https://github.com/dwyl/english-words 

import hashlib, sys

hash_file = r'exercise\hashed.txt'
wordlist = r'data_sets\english_dictionary\words.txt'

try:
    hashdocument = open(hash_file,'r')
except IOError:
    print('Invalid file.')
    sys.exit()
else:
    count = 0
    for hash in hashdocument:
        hash = hash.rstrip('\n')
        print(hash)
        i = 0
        with open(wordlist,'r') as wordlistfile:
            for word in wordlistfile:
                m = hashlib.md5()
                word = word.rstrip('\n')            
                m.update(word.encode('utf-8'))
                word_hash = m.hexdigest()
                if word_hash==hash:
                    print('The word, hash combination is ' + word + ',' + hash)
                    count += 1
                    break
                i += 1
        print('Iteration is ' + str(i))
    if count == 0:
        print('The hash given does not correspond to any supplied word in the wordlist.')
    else:
        print('Total passwords identified is: ' + str(count))
sys.exit()
answered 2018-06-23T18:01:16.477