I receive a server response, bytes:
\xd0\xa0\xd1\x83\xd0\xb1\xd0\xbb\xd0\xb8 \xd0\xa0\xd0\xa4 \xd0\x9a\xd0\xa6\xd0\x91
This is for sure Cyrillic, but I'm not sure which encoding. Every attempt to decode it in Python fails:
b = b'\xd0\xa0\xd1\x83\xd0\xb1\xd0\xbb\xd0\xb8 \xd0\xa0\xd0\xa4 \xd0\x9a\xd0\xa6\xd0\x91'
>>> b.decode('utf-8')
'\u0420\u0443\u0431\u043b\u0438 \u0420\u0424 \u041a\u0426\u0411'
>>> print(b.decode('utf-8'))
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-4:
character maps to <undefined>
>>> b.decode('cp1251')
'\u0420\xa0\u0421\u0453\u0420±\u0420»\u0420\u0451 \u0420\xa0\u0420¤
\u0420\u0459\u0420¦\u0420\u2018'
>>> print(b.decode('cp1251'))
UnicodeEncodeError: 'charmap' codec can't encode character '\u0420' in
position 0: character maps to <undefined>
Both results somewhat resemble Unicode-escape, but this does not work either:
>>> codecs.decode('\u0420\u0443\u0431\u043b\u0438 \u0420\u0424 \u041a\u0426\u0411',
'unicode-escape')
'Ð\xa0Ñ\x83бли Ð\xa0Ф Ð\x9aЦÐ\x91'
There's a web service for recovering Cyrillic texts, it is able to decode my bytes using Windows-1251:
Output (source encoding : WINDOWS-1251)
Рубли РФ КЦБ
But I don't have any more ideas as for how to approach it.
I think I'm missing something about how encoding works, so if the problem seems trivial to you, I would greatly appreciate a bit of explanation/a link to a tutorial/ some keywords for further googling.
Solution:
Windows PowerShell uses Windows-850 codepage by default, which is incapable of handling some Cyrillic characters. One fix is to change the codepage to Unicode every time starting the shell:
chcp 65001
Here is explained how to make it the new default