
When I run the following program:

public static void main(String[] args) throws Exception
{
    // Three bytes that form a single UTF-8 encoded character
    byte[] str = {(byte)0xEC, (byte)0x96, (byte)0xB4};
    String s = new String(str, "UTF-8");
}

Inspecting the value of s in jdb on Linux, I correctly get:

 s = "ì–´"

On Windows, I incorrectly get:

s = "?"

My byte sequence is a valid UTF-8 encoding of a Korean character, so why does it produce two very different results?


4 Answers

answered 2012-10-02T21:22:16.447
answered 2012-10-02T21:20:11.713

You get the correct string; it is the Windows console that does not display it correctly.

Here is a link to an article that discusses a way to make Java console produce correct Unicode output using JNI.
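As a rough illustration of why a correct string can still show up as "?", here is a minimal sketch. It is not the JNI approach from the article; it only assumes that the text passes through an output stream whose charset cannot represent the character. The class name ConsoleEncodingSketch and the helper printedBytes are made up for this example, and US-ASCII stands in for whatever code page the Windows console or jdb actually uses.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.util.Arrays;

class ConsoleEncodingSketch {
    // Hypothetical helper: returns the bytes a PrintStream with the given
    // charset would write for the text.
    static byte[] printedBytes(String text, String charsetName)
            throws UnsupportedEncodingException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(sink, true, charsetName);
        out.print(text);
        out.flush();
        return sink.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] str = {(byte) 0xEC, (byte) 0x96, (byte) 0xB4};
        String s = new String(str, "UTF-8");

        // US-ASCII cannot represent the character, so the stream
        // substitutes a single '?' (byte 63).
        System.out.println(Arrays.toString(printedBytes(s, "US-ASCII")));

        // UTF-8 can represent it, so the original three bytes come back.
        System.out.println(Arrays.toString(printedBytes(s, "UTF-8")));
    }
}
```

In both cases the String itself is identical; only the bytes written on output differ. Wrapping System.out in a PrintStream with an explicit charset controls those bytes, but whether the console then renders them correctly still depends on its code page and font.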

answered 2012-10-02T21:21:20.370

JDB is displaying the data incorrectly. The code works the same on both Windows and Linux. Try running this more definitive test:

public static void main(String[] args) throws Exception {
    byte[] str = {(byte)0xEC, (byte)0x96, (byte)0xB4};
    String s = new String(str, "UTF-8");
    // Print each char as hex; the output is plain ASCII, so the console encoding cannot distort it
    for (int i = 0; i < s.length(); i++) {
        System.out.println(Integer.toHexString(s.charAt(i)));
    }
}

This prints the hex value of every char (UTF-16 code unit) in the string. On both Windows and Linux it prints "c5b4", that is U+C5B4, the single Korean character that the three bytes encode.
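A variation on the same idea, offered as a sketch rather than a definitive tool: escape every non-ASCII char as \uXXXX so the whole string can be inspected on any console, and round-trip it back to bytes to confirm that nothing was lost. The class UnicodeEscapeSketch and the method escapeNonAscii are hypothetical names introduced for this example.

```java
import java.util.Arrays;

class UnicodeEscapeSketch {
    // Hypothetical helper: keeps printable ASCII as is and renders every
    // other char as a backslash-u escape with four hex digits.
    static String escapeNonAscii(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= 0x20 && c < 0x7F) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04X", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] str = {(byte) 0xEC, (byte) 0x96, (byte) 0xB4};
        String s = new String(str, "UTF-8");

        // Number of UTF-16 code units in the decoded string.
        System.out.println(s.length());
        // ASCII-only rendering of the string's contents.
        System.out.println(escapeNonAscii(s));
        // Re-encoding to UTF-8 should give back the original bytes.
        System.out.println(Arrays.equals(str, s.getBytes("UTF-8")));
    }
}
```

Because everything printed is ASCII, the result should look the same in jdb, a Linux terminal, and the Windows console.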

answered 2012-10-02T21:35:51.750