2 Answers
You want to decode (not encode) to get a Unicode string from a byte string.
>>> s = '\xd0\xbc\xd0\xb0\xd1\x80\xd0\xba\xd0\xb0'
>>> us = s.decode('utf-8')
>>> print us
марка
Note that you may not be able to print it, because it contains characters outside ASCII and your console may not be able to display them. You should still be able to see its value in a Unicode-aware debugger. I ran the above in IDLE.
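For readers on Python 3, here is a minimal sketch of the same decode step. It assumes the data starts out as a bytes object (the Python 3 counterpart of the Python 2 byte string above) and uses the built-in ascii() to inspect the result on a console that cannot display Cyrillic.

```python
# Python 3 sketch: a bytes literal plays the role of the Python 2 byte string.
s = b'\xd0\xbc\xd0\xb0\xd1\x80\xd0\xba\xd0\xb0'

# decode: bytes -> str (Unicode text)
us = s.decode('utf-8')

# ascii() escapes every non-ASCII character, so this prints safely
# even when the console cannot show the characters themselves.
print(ascii(us))

# Ten UTF-8 bytes become five characters.
print(len(s), len(us))
```

The length check is a quick way to confirm that the decode really happened: each Cyrillic letter here takes two bytes in UTF-8.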
Update
It seems what you actually have is this:
>>> s = u'\xd0\xbc\xd0\xb0\xd1\x80\xd0\xba\xd0\xb0'
This is trickier because you first have to get those bytes into a byte string before you call decode. I'm not sure what the "best" way to do that is, but this works:
>>> us = ''.join(chr(ord(c)) for c in s).decode('utf-8')
>>> print us
марка
Note that you should of course be decoding it before you store it in the database as a string.
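An alternative to the chr/ord join, sketched here for Python 3, is to encode the string as Latin-1 first. This relies on Latin-1 mapping code points 0 to 255 onto the identical byte values, so it only applies when every character in the string is in that range, as in the example from the question.

```python
# Text whose code points are really UTF-8 byte values
# (the situation described in the update above).
s = '\xd0\xbc\xd0\xb0\xd1\x80\xd0\xba\xd0\xb0'

# Latin-1 turns each code point 0-255 into the byte with the same value,
# which recovers the original raw bytes.
raw = s.encode('latin-1')

# Now the bytes can be decoded as the UTF-8 they always were.
us = raw.decode('utf-8')

print(raw)        # ASCII-safe repr of the recovered bytes
print(ascii(us))  # escaped form of the decoded text
```

If the string contains any character above U+00FF, the encode step raises UnicodeEncodeError, which is a useful signal that the data is not in the shape this trick assumes.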
Mark is right: you need to decode the string. Byte strings become Unicode strings by decoding them; encoding goes the other way. This and many other details are covered in Pragmatic Unicode, or, How Do I Stop The Pain?.
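The two directions can be sketched as a round trip in Python 3. The last step is an assumption about how a string like the one in the question might arise (UTF-8 bytes decoded with a single-byte codec); the original post does not say how it was produced.

```python
text = '\u043c\u0430\u0440\u043a\u0430'  # the Unicode text "марка"

# encode: text -> bytes
data = text.encode('utf-8')

# decode: bytes -> text
restored = data.decode('utf-8')

print(data)
print(restored == text)

# Decoding with the wrong codec does not raise here; it silently
# produces a string of the same shape as the one in the question.
wrong = data.decode('latin-1')
print(ascii(wrong))
```

Keeping the encode and decode calls paired with the same codec at the boundaries of the program (reading input, writing to the database) is what avoids ending up with the mis-decoded form in the first place.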