
Here I am encoding the data:

post = """
='Brand New News Fr0m The Timber Industry!!'=

========Latest Profile==========
Energy & Asset Technology, Inc. (EGTY)
Current Price $0.15
================================

Recognize this undiscovered gem which is poised to jump!! 

Please read the following Announcement in its Entierty and 
Consider the Possibilities�
Watch this One to Trad,e!

Because, EGTY has secured the global rights to market 
genetically enhanced fast growing, hard-wood trees!

EGTY trading volume is beginning to surge with landslide Announcement. 
The value of this Stoc,k appears poised for growth! This one will not 
remain on the ground floor for long.

KEEP READING!!!!!!!!!!!!!!!

===============
"BREAKING NEWS"
===============

-Energy and Asset Technology, Inc. (EGTY) owns a global license to market
the genetically enhanced Global Cedar growth trees, with plans to 
REVOLUTIONIZE the forest-timber industry. 

These newly enhanced Globa| Cedar trees require only 9-12 years of growth 
before they can be harvested for lumber, whereas worldwide growth time for 
lumber is 30-50 years. 

Other than growing at an astonishing rate, the Global Cedar has a number 
of other benefits. Its natural elements make it resistant to termites, and 
the lack of oils and sap found in the wood make it resistant to forest fire, 
ensuring higher returns on investments.
T
he wood is very lightweight and strong, lighter than Poplar and over twice
as strong as Balsa, which makes it great for construction. It also has 
the unique ability to regrow itself from the stump, minimizing the land and
time to replant and develop new root systems.

Based on current resources and agreements, EGTY projects revenues of $140 
Million with an approximate profit margin of 40% for each 9-year cycle. With 
anticipated growth, EGTY is expected to challenge Deltic Timber Corp. during 
its initial 9-year cycle.

Deltic Timber Corp. currently trades at over $38.00 a share with about $153 
Million in revenues. As the reputation and demand for the Global Cedar tree 
continues to grow around the world EGTY believes additional multi-million 
dollar agreements will be forthcoming. The Global Cedar nursery has produced 
about 100,000 infant plants and is developing a production growth target of 
250,000 infant plants per month.

Energy and Asset Technology is currently in negotiations with land and business 
owners in New Zealand, Greece and Malaysia regarding the purchase of their popular 
and profitable fast growing infant tree plants. Inquiries from the governments of 
Brazil and Ecuador are also being evaluated.

Conclusion:

The examples above show the Awesome, Earning Potential of little
known Companies That Explode onto Investor�s Radar Screens. 
This s-t0ck will not be a Secret for long. Then You May Feel the Desire to Act Right 
Now! And Please Watch This One Trade!!


GO EGTY!


All statements made are our express opinion only and should be treated as such.
We may own, take position and sell any securities mentioned at any time. Any 
statements that express or involve discussions with respect to predictions, 
goals, expectations, beliefs, plans, projections, object'ives, assumptions or 
future events or perfo'rmance are not
statements of historical fact and may be 
"forward,|ooking statements." forward,|ooking statements are based on expectations, 
estimates and projections at the time the statements are made that involve a number 
of risks and uncertainties which could cause actual results or events to differ 
materially from those presently anticipated. This newsletter was paid $3,000 from 
third party (IR Marketing). Forward,|ooking statements in this action may be identified 
through the use of words such as: "pr0jects", "f0resee", "expects". in compliance with 
Se'ction 17. {b), we disclose the holding of EGTY shares prior to the publication of 
this report. Be aware of an inherent conflict of interest resulting from such holdings 
due to our intent to profit from the liquidation of these shares. Shar,es may be sold 
at any time, even after positive statements have been made regarding the above company. 
Since we own shares, there is an inherent conflict of interest in our statements and 
opinions. Readers of this publication are cautioned not 
to place undue reliance on 
forward,|ooking statements, which are based on certain assumptions and expectations 
involving various risks and uncertainties that could cause results to differ materially 
from those set forth in the forward- looking statements. This is not solicitation to 
buy or sell st-0cks, this text is or informational purpose only and you should seek 
professional advice from registered financial advisor before you do anything related 
with buying or selling st0ck-s, penny st'0cks are very high risk and you can lose your 
entire inves,tment.
"""

In [147]: post.encode('utf-8')

and I get this output:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 319: ordinal not in range(128)

2 Answers


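Unicode is a table that tries to contain (all) known letters, characters and symbols, commonly also called glyphs. At the time of writing it holds a little over 110,000 of them. The DECODED state of a character is therefore a (code) point in this table. But because a byte holds only 8 bits = 256 states, you have to encode the Unicode representation into a byte stream. The most commonly used encoding is UTF-8, which is a superset of the old ASCII encoding. UTF-8 encodes each Unicode code point using one to four bytes.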
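As a rough illustration of the "one to four bytes" point, here is a minimal sketch. It assumes Python 3, where text (`str`) and bytes are separate types; the helper name `utf8_lengths` and the sample characters are made up for this example.

```python
# Four sample characters, written as escapes so the source stays pure ASCII:
# "A", e-acute, the euro sign, and an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

def utf8_lengths(chars):
    """Return (code point, UTF-8 byte count) for each single-character string."""
    return [(ord(ch), len(ch.encode("utf-8"))) for ch in chars]

for code_point, n_bytes in utf8_lengths(samples):
    print("U+%04X -> %d byte(s) in UTF-8" % (code_point, n_bytes))
```

Each entry is exactly one code point, yet the encoded length grows with the code point's value.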

So encoding or decoding always goes from Unicode or to Unicode. If you want to convert from one encoding to another, you have to go through Unicode:

    [decode]     [encode]
ASCII ---> UNICODE ---> UTF-8
1 Glyph                 1 Glyph 
  =        1 Glyph        =
1 Byte                  1-4 Bytes

   # Python 2: mystring is a byte string (str)
   unicode_str = mystring.decode('ascii')    # bytes -> unicode
   utf8_str = unicode_str.encode('utf-8')    # unicode -> UTF-8 bytes

(Not the best example, because ASCII always fits into UTF-8.)
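A conversion where the byte sequences really differ may make the round trip clearer. The following is a sketch only, assuming Python 3 and a made-up Latin-1 input; the variable names are illustrative.

```python
# The word "cafe" with an accented final letter, as a Latin-1 source would send it.
# In Latin-1 the accented e is the single byte 0xE9.
latin1_bytes = b"caf\xe9"

text = latin1_bytes.decode("latin-1")   # bytes -> Unicode text
utf8_bytes = text.encode("utf-8")       # Unicode text -> UTF-8 bytes

# Same four characters, but UTF-8 needs two bytes for the accented letter.
print(len(text), len(latin1_bytes), len(utf8_bytes))
```

There is no direct bytes-to-bytes step here: the Latin-1 bytes are decoded to text first, and only then encoded as UTF-8.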

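So if you want to decode your post variable, you have to know which encoding the quoted string is in. In Python 2.x the default encoding is usually ASCII, which is why calling post.encode('utf-8') on a byte string first attempts an implicit ASCII decode and fails on byte 0xef. In Python 3.x the default is UTF-8.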

import sys
# Typically 'ascii' on Python 2 and 'utf-8' on Python 3
print(sys.getdefaultencoding())

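If your post variable is not defined in the source code but is read from an external byte stream, you have to know its encoding, otherwise you are out of luck.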
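To show why the encoding has to be known, here is a small sketch assuming Python 3. The helper `decode_or_none` and the sample bytes are hypothetical stand-ins for data read from a file or socket.

```python
def decode_or_none(raw, encoding):
    """Decode raw bytes with the given codec, or return None if they do not fit."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None

# Stand-in for external input: UTF-8 bytes ending in an ellipsis (U+2026).
raw = "Possibilities\u2026".encode("utf-8")

as_ascii = decode_or_none(raw, "ascii")   # fails: the input has bytes above 0x7F
as_utf8 = decode_or_none(raw, "utf-8")    # succeeds: this is the real encoding

print(as_ascii is None, as_utf8 == "Possibilities\u2026")
```

A strict codec such as ASCII at least fails loudly. A permissive one such as Latin-1 accepts every byte value, so a wrong guess there produces garbled text instead of an error.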

answered 2013-10-09T11:24:20.423

First of all, tell Python which encoding you are using by putting this on the second line of your file (or the first line, if you don't use a shebang):

# coding=utf-8

(see PEP 263)

Then, don't use byte strings; always use unicode literals for text content:

post = u"""
='Brand New News Fr0m The Timber Industry!!'=
etc. etc. etc."""
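With a unicode literal the encode step works as intended. The snippet below is a sketch under assumptions: it uses a shortened, made-up version of the text with a `\u2026` escape in place of the non-ASCII character, and it should behave the same on Python 2 and Python 3.

```python
# coding=utf-8
post = u"Consider the Possibilities\u2026"   # text, not bytes

encoded = post.encode("utf-8")               # no implicit ASCII decode happens here
round_trip = encoded.decode("utf-8")

# The ellipsis is one character in the text but three bytes in UTF-8.
print(len(post), len(encoded), round_trip == post)
```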
answered 2013-10-09T11:49:20.360