
2 Answers

3

ISO-8859-1 is a one-byte-per-character encoding, and the fancy Unicode double quotes are not in its character set. What you are seeing is a single character stored in a multi-byte encoding (most likely UTF-8), with each of its bytes displayed as a separate ISO-8859-1 character.
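To make the byte-level picture concrete, here is a minimal sketch. It assumes the data is actually UTF-8, which the question does not confirm; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# U+201C LEFT DOUBLE QUOTATION MARK as a single Perl character.
my $quote = "\x{201C}";

# Assumption: the source data is UTF-8. Encoding yields three bytes.
my $bytes = encode('UTF-8', $quote);
my $hex   = join ' ', map { sprintf '%02X', ord } split //, $bytes;

# Misreading those bytes as ISO-8859-1 turns them into three
# separate one-byte characters instead of one quotation mark.
my $misread = decode('ISO-8859-1', $bytes);

print "UTF-8 bytes: $hex\n";                                  # E2 80 9C
print "Characters when read as ISO-8859-1: ", length($misread), "\n";  # 3
```

If the bytes in your data differ from E2 80 9C, the source is probably in some other encoding and the decode step would need to change accordingly.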

To match these weird things, see the perlunicode man page, especially the \x{...} and \N{...} escape sequences.

To answer your question, try \x{201C} to match the Unicode LEFT DOUBLE QUOTATION MARK and \x{201D} to match the RIGHT DOUBLE QUOTATION MARK. You missed the latter in your question :-).
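As a rough sketch of how those escapes might be used, the snippet below decodes a hypothetical UTF-8 byte string and then matches on the two quotation marks. The sample text and variable names are made up for illustration, and the UTF-8 assumption may not hold for your input.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical input: raw UTF-8 bytes, as they might arrive from a file.
my $raw = "He said \xE2\x80\x9Chello\xE2\x80\x9D and left.";

# Decode to Perl characters first; \x{201C} matches a character,
# not the three raw bytes that encode it.
my $text = decode('UTF-8', $raw);

# Capture whatever sits between the left and right curly quotes.
my ($quoted) = $text =~ /\x{201C}(.*?)\x{201D}/;

# Replace either curly quote with the plain ASCII QUOTATION MARK.
(my $plain = $text) =~ s/[\x{201C}\x{201D}]/"/g;

print "$quoted\n";   # hello
print "$plain\n";    # He said "hello" and left.
```

Note that running the same pattern against `$raw` would not match, which is a common reason these escapes appear not to work.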

[update]

I should have provided my reference... Some nice gentleman in the UK has a page on ASCII and Unicode quotation marks. The plain vanilla ASCII/ISO-8859-1 double-quote is just called QUOTATION MARK.

Answered 2011-06-11T00:12:32.507
-1

Maybe this old post will help.

Answered 2011-06-14T09:46:06.770