python - How do I decode a JSON-like string with Cyrillic characters?

Question

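I'm trying to create a simple spider in Scrapy that fetches all the adverts from a website. The problem is that all the adverts are in Cyrillic, so I get strings like this: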

1-\u043a\u043e\u043c\u043d\u0430\u0442\u043d\u0430\u044f \u043a\u0432\u0430\u0440\u0442\u0438\u0440\u0430

Here is the spider's code:

def parse_advert(self, response):
    x = HtmlXPathSelector(response)

    advert = AdvertItem()

    advert['title'] = x.select("//h1/text()").extract()
    advert['phone'] = "111111111111"
    advert['text'] = "text text text text text text"
    filename = response.url.split("/")[-2]
    open(filename, 'wb').write(str(advert['title']))

Is there a way to "translate" that string on the fly?

Thanks.

score 1 · Accepted Answer

Use str.decode('unicode-escape') (Python 2, where str is a byte string):

>>> print r'1-\u043a\u043e\u043c\u043d\u0430\u0442\u043d\u0430\u044f \u043a\u0432\u0430\u0440\u0442\u0438\u0440\u0430'
1-\u043a\u043e\u043c\u043d\u0430\u0442\u043d\u0430\u044f \u043a\u0432\u0430\u0440\u0442\u0438\u0440\u0430
>>> print r'1-\u043a\u043e\u043c\u043d\u0430\u0442\u043d\u0430\u044f \u043a\u0432\u0430\u0440\u0442\u0438\u0440\u0430'.decode('unicode-escape')
1-комнатная квартира
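
For readers on Python 3, where str has no decode() method, a rough equivalent is sketched below. It assumes the escaped text contains only ASCII characters and \uXXXX sequences, as in the question; the variable names are illustrative only.

```python
# The escaped text as it appears in the output file. A raw string keeps the
# backslashes as literal characters instead of letting Python interpret them.
escaped = r'1-\u043a\u043e\u043c\u043d\u0430\u0442\u043d\u0430\u044f \u043a\u0432\u0430\u0440\u0442\u0438\u0440\u0430'

# Python 3 str has no .decode(), so convert to ASCII bytes first and then
# let the unicode_escape codec turn each \uXXXX sequence into a character.
decoded = escaped.encode('ascii').decode('unicode_escape')

print(decoded)  # 1-комнатная квартира
```

In the question's Python 2 code the escapes most likely come from calling str() on the list that extract() returns, which writes the list's repr rather than the text itself, so writing a single extracted element encoded as UTF-8 would probably avoid the extra decoding step altogether.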

score 0 · Accepted Answer


Just add this line to your project's 'settings.py' file:

FEED_EXPORT_ENCODING = 'utf-8'
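
This setting matters because Scrapy's JSON feed output falls back to \uXXXX escapes when no export encoding is set. The same difference can be illustrated with the standard library's json module; this is only an analogy using json.dumps, not Scrapy's own exporter code, and the item dictionary is a made-up example.

```python
import json

# A made-up item holding the title from the question.
item = {'title': '1-комнатная квартира'}

# Default behaviour: every non-ASCII character is written as a \uXXXX escape.
escaped = json.dumps(item)

# With ensure_ascii=False the characters are kept as-is, which is similar in
# effect to exporting the feed as UTF-8.
readable = json.dumps(item, ensure_ascii=False)

print(escaped)
print(readable)
```

Both strings decode back to the same data with json.loads, so the escaped form is not corrupted; it is just harder to read.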

answered 2019-06-03T12:22:11.353
