In Python, I am trying to use byte strings to handle some 8-bit character strings. I find that a byte string does not necessarily behave in a string-like way. With a subscript, it returns a number instead of a byte string of length 1.

In [243]: s=b'hello'

In [244]: s[1]
Out[244]: 101

In [245]: s[1:2]
Out[245]: b'e'

This makes it really difficult to iterate over a byte string. For example, this code works with a string but fails for a byte string.

In [260]: d = {b'e': b'E', b'h': b'H', b'l': b'L', b'o': b'O'}

In [261]: list(map(d.get, s))
Out[261]: [None, None, None, None, None]

This breaks some code from Python 2. I also find this irregularity really inconvenient. Does anyone have any insight into what's going on with byte strings?

2 Answers

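Byte strings store byte values in the range 0-255. Their repr shows them as ASCII characters only to make them convenient to view; they store data, not text. Observe: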

>>> x=bytes([104,101,108,108,111])
>>> x
b'hello'
>>> x[0]
104
>>> x[1]
101
>>> list(x)
[104, 101, 108, 108, 111]
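
If length-1 bytes objects are wanted while iterating, one option is to slice instead of index, since a slice of a bytes object is itself a bytes object. A minimal sketch, assuming the dictionary from the question; `iter_bytes` is a made-up helper name, not a standard function:

```python
def iter_bytes(data):
    """Yield each byte of `data` as a length-1 bytes object (hypothetical helper)."""
    for i in range(len(data)):
        # data[i] would be an int; the slice data[i:i+1] keeps the bytes type
        yield data[i:i + 1]

s = b'hello'
d = {b'e': b'E', b'h': b'H', b'l': b'L', b'o': b'O'}

single = list(iter_bytes(s))                  # length-1 bytes objects
upper = b''.join(map(d.get, iter_bytes(s)))   # lookup now hits the bytes keys
wrapped = [bytes([n]) for n in s]             # alternative: wrap each int back into bytes
```

Both approaches produce the same sequence of length-1 bytes objects, so the bytes-keyed dictionary can be used unchanged.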

Use strings for text. If you start with bytes, decode them appropriately:

>>> s=b'hello'.decode('ascii')
>>> d = dict(zip('hello','HELLO'))
>>> list(map(d.get,s))
['H', 'E', 'L', 'L', 'O']

But if you want to work with bytes:

>>> d=dict(zip(b'hello',b'HELLO'))
>>> d
{104: 72, 108: 76, 101: 69, 111: 79}
>>> list(map(d.get,b'hello'))
[72, 69, 76, 76, 79]
>>> bytes(map(d.get,b'hello'))
b'HELLO'
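
For a byte-to-byte mapping like this one, the built-in `bytes.maketrans` and `bytes.translate` may be a simpler alternative to a dictionary. This is a sketch of a different technique, not what the code above does; the variable names are illustrative:

```python
s = b'hello'

# Build a 256-entry translation table: each byte in the first argument
# maps to the byte at the same position in the second argument.
table = bytes.maketrans(b'helo', b'HELO')

# Bytes that are not in the table's source set are left unchanged.
result = s.translate(table)
```

Here `result` is a bytes object, so no conversion back from a list of integers is needed.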
answered 2013-10-08T06:22:44.240

You can simply decode to get a string, take the element you need, and encode it back:

s=b'hello'
t = s.decode()
print(t[1])             # a str of length 1 ('e'); Python has no separate char type
print(t[1].encode())    # a bytes object of length 1 (b'e')
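
One caveat for 8-bit data: `decode()` with no argument uses UTF-8, which rejects byte sequences that are not valid UTF-8. A sketch of a workaround, assuming the data is arbitrary 8-bit bytes: the `latin-1` codec maps every byte value 0-255 to the code point with the same number, so the round trip is lossless. The sample bytes below are made up for illustration:

```python
raw = bytes([104, 233, 255])  # arbitrary 8-bit data that is not valid UTF-8

try:
    raw.decode()              # default codec is UTF-8
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

text = raw.decode('latin-1')          # one character per byte
second = text[1].encode('latin-1')    # back to a length-1 bytes object
roundtrip = text.encode('latin-1')    # recovers the original bytes
```

Whether `latin-1` is the right choice depends on what the bytes actually represent; it is only a convenient one-to-one mapping, not a guess at the real encoding.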
answered 2020-07-03T11:21:18.053