java - What is the difference between getBytes("UTF-8"), getBytes("windows-1252") and getBytes()?

Question

I have the following code, which produces confusing output:

    import java.io.UnsupportedEncodingException;
    import java.nio.charset.Charset;

    public class Main {

        String testString = "Moage test String";

        public static void main(String[] args) {
            new Main();
        }

        public Main(){

            System.out.println("Default charset: "+Charset.defaultCharset());
            System.out.println("Teststring: "+testString);
            System.out.println();
            System.out.println("get the byteStreeam of the test String...");
            System.out.println();
            System.out.println("Bytestream with default encoding: ");
            for(int i = 0; i < testString.getBytes().length; i++){
                System.out.print(testString.getBytes()[i]);
            }
            System.out.println();
            System.out.println();
            System.out.println("Bytestream with encoding UTF-8: ");
            try {
                for(int i = 0; i < testString.getBytes("UTF-8").length; i++){
                    System.out.print(testString.getBytes("UTF-8")[i]);
                }
                System.out.println();
                System.out.println();
                System.out.println("Bytestream with encoding windows-1252 (default): ");

                for(int i = 0; i < testString.getBytes("windows-1252").length; i++){
                    System.out.print(testString.getBytes("windows-1252")[i]);
                }

                System.out.println();
                System.out.println();
                System.out.println("Bytestream with encoding UTF-16: ");

                for(int i = 0; i < testString.getBytes("UTF-16").length; i++){
                    System.out.print(testString.getBytes("UTF-16")[i]);
                }

            } catch (UnsupportedEncodingException e) {
                e.printStackTrace();
            }
        }
    }

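I wanted to see the difference between the UTF-8 encoding and windows-1252. But when I look at the output, there seems to be no difference. Only when I compare UTF-16 with windows-1252 is there a difference.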

Output:

> Default charset: windows-1252
> Teststring: Moage test String
> 
> get the byteStreeam of the test String...
> 
> Bytestream with default encoding: 
> 7711197103101321161011151163283116114105110103
> 
> Bytestream with encoding UTF-8: 
> 7711197103101321161011151163283116114105110103
> 
> Bytestream with encoding windows-1252 (default): 
> 7711197103101321161011151163283116114105110103
> 
> Bytestream with encoding UTF-16: 
> -2-1077011109701030101032011601010115011603208301160114010501100103

Can anyone explain why UTF-8 and windows-1252 look the same?

Cheers, Alex

score 3 · Accepted Answer

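That is because your test String, "Moage test String", contains only ASCII characters. UTF-8 and windows-1252 both encode the ASCII range (code points 0 to 127) as the same single bytes, so the output is identical. Try special characters such as "éèà", for example, and you will see different results.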
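To make that concrete, here is a minimal sketch of the comparison. The class name `EncodingDemo` and the helper `hex` are made up for illustration, and the code assumes your JVM provides windows-1252 (the question's output suggests it does). The bytes are printed as unsigned hex with spaces between them, so they are easier to tell apart than the concatenated decimals in the question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Hypothetical helper: encodes the text and renders each byte as unsigned hex.
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Assumption: windows-1252 is available on this JVM.
        Charset cp1252 = Charset.forName("windows-1252");

        String ascii = "Moage";
        // e-acute, e-grave, a-grave written as Unicode escapes
        String accented = "\u00E9\u00E8\u00E0";

        // ASCII-only text: both encodings give the same bytes.
        System.out.println(hex(ascii, StandardCharsets.UTF_8)); // 4D 6F 61 67 65
        System.out.println(hex(ascii, cp1252));                 // 4D 6F 61 67 65

        // Accented text: UTF-8 needs two bytes per character, windows-1252 one.
        System.out.println(hex(accented, StandardCharsets.UTF_8)); // C3 A9 C3 A8 C3 A0
        System.out.println(hex(accented, cp1252));                 // E9 E8 E0
    }
}
```

Passing a `Charset` object instead of a charset name also avoids the checked `UnsupportedEncodingException` that the question's code has to catch.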

score 0 · Accepted Answer

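Here, you used a string whose characters all fall in the ASCII range. If your string contains special characters, or characters from a language that needs them, the byte output will change.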

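UTF-8 is a universally recognized standard and works everywhere; in Java it is one of the charsets every JVM must support. The windows-* encodings, by contrast, are Windows-specific and are not guaranteed to be available on every machine.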
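If you want to check this on your own machine, a small sketch along these lines may help. `CharsetCheck` and `matchesDefault` are hypothetical names chosen for this example. It prints the platform default charset, reports whether windows-1252 is available, and illustrates that the no-argument `getBytes()` simply uses the default charset, which is why its output matched windows-1252 in the question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetCheck {

    // Hypothetical helper: true if getBytes() yields the same bytes as
    // getBytes(Charset.defaultCharset()) for the given text.
    static boolean matchesDefault(String text) {
        return Arrays.equals(text.getBytes(), text.getBytes(Charset.defaultCharset()));
    }

    public static void main(String[] args) {
        // e-acute, e-grave, a-grave written as Unicode escapes
        String text = "\u00E9\u00E8\u00E0";

        // The default depends on the platform and JVM settings.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // windows-1252 is optional; UTF-8 is always present.
        System.out.println("windows-1252 supported: " + Charset.isSupported("windows-1252"));

        System.out.println("getBytes() matches default: " + matchesDefault(text));
        System.out.println("UTF-8 length: " + text.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

Because the default can differ between machines, naming the charset explicitly (for example `StandardCharsets.UTF_8`) is usually the safer choice when the bytes leave the current process.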
