
Take a look at the code below:

    try {
        String str = "上海上海";
        String gb2312 = new String(str.getBytes("utf-8"), "gb2312");
        String utf8 = new String(gb2312.getBytes("gb2312"), "utf-8");
        System.out.println(str.equals(utf8));
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }

It prints false!

I ran this code under both JDK 7 and JDK 8, and my IDE's file encoding is set to UTF-8.

Can anyone help me?


2 Answers


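What you are looking for is encoding/decoding at output/input time.

As @kalpesh said, internally it is all Unicode. If you want to read a stream in one encoding and then write it to a different stream in another encoding, you must specify the conversion from bytes (in the stream) to strings (in Java) and from strings (in Java) to bytes (in the output stream), like this: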

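        // Open the source and destination files as raw byte streams.
        InputStream is = new FileInputStream("utf8_encoded_text.txt");
        OutputStream os = new FileOutputStream("gb2312_encoded.txt");

        // Decode incoming bytes as UTF-8; encode outgoing characters as GB2312.
        Reader r = new InputStreamReader(is, "utf-8");
        BufferedReader br = new BufferedReader(r);
        Writer w = new OutputStreamWriter(os, "gb2312");
        BufferedWriter bw = new BufferedWriter(w);

        String s = null;
        while ((s = br.readLine()) != null) {
            bw.write(s);
            bw.newLine(); // readLine() strips the line terminator, so write one back
        }
        br.close();
        bw.close(); // close() flushes the writer and closes the underlying stream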

Of course, you still need proper exception handling to make sure everything is closed correctly.
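
One way to get that cleanup is try-with-resources (available since Java 7). The following is only a sketch: the class name `Transcode` and the method `transcode` are made up for illustration, and it assumes the runtime ships the GB2312 charset. It copies characters in blocks rather than line by line, so line terminators are carried over unchanged.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any library.
class Transcode {
    // Reads `in` decoding it as `inCs`, writes `out` encoding it as `outCs`.
    static void transcode(Path in, Charset inCs, Path out, Charset outCs) throws IOException {
        // Both resources are closed automatically, even if an exception is thrown.
        try (BufferedReader reader = Files.newBufferedReader(in, inCs);
             BufferedWriter writer = Files.newBufferedWriter(out, outCs)) {
            char[] buffer = new char[8192];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
    }
}
```

A call might look like `Transcode.transcode(Paths.get("utf8_encoded_text.txt"), StandardCharsets.UTF_8, Paths.get("gb2312_encoded.txt"), Charset.forName("GB2312"))`. Note that, as far as I know, the readers and writers returned by `Files` report malformed or unmappable input with an exception instead of silently substituting a replacement character.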

Answered 2015-11-05T02:43:26.867

        String gb2312 = new String(str.getBytes("utf-8"), "gb2312");

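This statement is incorrect, because the String constructor should be given a byte array and a charset that match. Here you produce UTF-8 bytes but tell the constructor they are gb2312.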
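
To make the difference concrete, here is a small sketch. The class `CharsetRoundTrip` and its two methods are hypothetical names, and it assumes the JVM provides the GB2312 charset. The string is written with Unicode escapes so the result does not depend on the source file's encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
class CharsetRoundTrip {
    // Encode and decode with the SAME charset: lossless for any text the charset can represent.
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    // The pattern from the question: UTF-8 bytes decoded with a different charset,
    // then re-encoded with that charset and decoded as UTF-8 again.
    static String mismatched(String s, Charset wrong) {
        String garbled = new String(s.getBytes(StandardCharsets.UTF_8), wrong);
        return new String(garbled.getBytes(wrong), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Charset gb2312 = Charset.forName("GB2312");
        String str = "\u4e0a\u6d77\u4e0a\u6d77"; // "上海上海"

        System.out.println(str.equals(roundTrip(str, gb2312)));  // matching charsets
        System.out.println(str.equals(mismatched(str, gb2312))); // mismatched charsets
    }
}
```

The second path loses information because some of the UTF-8 byte sequences are not valid GB2312, so the decoder substitutes a replacement character and the original bytes cannot be recovered afterwards.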

Answered 2015-11-05T02:30:31.013