
3 Answers

4

Unicode defines character properties (PHP Docs), for example symbols, letters and the like. You can match any character of a specific class with preg_match (Docs) and the u modifier.

echo preg_match('/\pP$/u', $str);

However, your string needs to be UTF-8 to do that.
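
To illustrate the UTF-8 requirement, here is a minimal sketch. The helper name endsWithPunctuation is hypothetical and not part of the original answer. It relies on the fact that preg_match() with the u modifier returns false, rather than 0, when the subject is not valid UTF-8:

```php
<?php
/**
 * Hypothetical helper: true if the last character of $str is punctuation,
 * false if it is not, null if $str is not valid UTF-8.
 */
function endsWithPunctuation(string $str): ?bool
{
    $result = preg_match('/\pP$/u', $str);
    if ($result === false) {
        // With the u modifier, malformed UTF-8 makes preg_match() fail
        // instead of returning 0; preg_last_error() reports the reason.
        return null;
    }
    return $result === 1;
}

var_dump(endsWithPunctuation('Test.'));
var_dump(endsWithPunctuation('Test'));
var_dump(endsWithPunctuation("Test\xE9."));  // ISO-8859-1 byte, invalid as UTF-8
```

Distinguishing false from 0 matters here: a plain truthiness check would treat a badly encoded string the same as "no punctuation found".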

You can test this on your own; I created a little script that tests for all properties via preg_match:

Looking for properties of last character in "Test.":
Found Punctuation (P).
Found Other punctuation (Po).

Looking for properties of last character in "这是一个在中国的字符串。":
Found Punctuation (P).
Found Other punctuation (Po).
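
A script along these lines could look like the following sketch. It is not the original script: the PROPERTIES list is a small, hypothetical subset of the Unicode general categories, and lastCharProperties is an illustrative name.

```php
<?php
// Hypothetical subset of Unicode general categories to probe; the original
// script's full list is not shown in this answer.
const PROPERTIES = [
    'L'  => 'Letter',
    'N'  => 'Number',
    'P'  => 'Punctuation',
    'Po' => 'Other punctuation',
    'S'  => 'Symbol',
    'Z'  => 'Separator',
];

/**
 * Return the codes of all entries in PROPERTIES that match the last
 * character of $str. Assumes $str is valid UTF-8.
 */
function lastCharProperties(string $str): array
{
    $found = [];
    foreach (PROPERTIES as $code => $name) {
        // \p{..} matches a Unicode property; the u modifier enables UTF-8 mode.
        if (preg_match('/\p{' . $code . '}$/u', $str) === 1) {
            $found[] = $code;
        }
    }
    return $found;
}

foreach (['Test.', '这是一个在中国的字符串。'] as $sample) {
    echo "Looking for properties of last character in \"$sample\":\n";
    foreach (lastCharProperties($sample) as $code) {
        echo 'Found ' . PROPERTIES[$code] . " ($code).\n";
    }
    echo "\n";
}
```

Extending PROPERTIES with further category codes (for example Lu, Nd, Sm) probes more properties without changing the function.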

Related: PHP - Fast way to strip all characters not displayable in browser from utf8 string.

answered 2011-10-05T13:23:05.053
4
answered 2011-10-05T13:25:27.353
3

You are not trying to transliterate, you are trying to translate!

UTF-8 is not a language; it is a Unicode character encoding that supports (virtually) all languages known in the world.

What you are trying to do is something like this:

echo iconv("UTF-8", "ASCII//TRANSLIT//IGNORE", "这是一个在中国的字符串。");
echo iconv("UTF-8", "ASCII//TRANSLIT//IGNORE", "à è ò ù");

That does not work with your Chinese example.
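
Because iconv() can return false or raise a notice when it cannot convert a character, it may help to check the result explicitly. The sketch below uses a hypothetical wrapper named toAscii. What //TRANSLIT and //IGNORE actually produce depends on the iconv implementation and locale of the system, so no output is shown:

```php
<?php
/**
 * Hypothetical wrapper: transliterate UTF-8 text to ASCII, or return null
 * when iconv() reports failure.
 */
function toAscii(string $utf8): ?string
{
    // @ suppresses the notice iconv() may raise on characters it cannot convert.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $utf8);
    return $ascii === false ? null : $ascii;
}

foreach (['à è ò ù', '这是一个在中国的字符串。'] as $sample) {
    $ascii = toAscii($sample);
    echo $sample, ' => ', $ascii === null ? '(conversion failed)' : $ascii, "\n";
}
```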

answered 2011-10-05T13:29:54.987