According to the Python 2.7.x documentation:
The
unicode()
constructor has the signatureunicode(string[, encoding, errors])
. All of its arguments should be 8-bit strings. The first argument is converted to Unicode using the specified encoding; if you leave off the encoding argument, the ASCII encoding is used for the conversion, so characters greater than 127 will be treated as errors:>>> unicode('abcdef' + chr(255)) Traceback (most recent call last): ... UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 6: ordinal not in range(128)
So why does this work with Japanese characters in it?:
TestStr = "サーバ移設"
print TestStr
サーバ移設
and why does this work too?:
TestStr = unicode("サーバ移設")
print TestStr
サーバ移設
I would have expected an unicode decode error since the Japanese characters are not within the 8bit string range.