java - Java hex String to ASCII String with accented characters

Question

I have the string String hex = "6174656ec3a7c3a36f"; and I want to get String output = "atenção", but in my test I only get String output = "aten????o";. What am I doing wrong?

String hex = "6174656ec3a7c3a36f";
StringBuilder output = new StringBuilder();
for (int i = 0; i < hex.length(); i+=2) {
  String str = hex.substring(i, i+2);
  output.append((char)Integer.parseInt(str, 16));
} 

System.out.println(output); //here is the output "aten????o"

score 5 · Accepted Answer

Consider:

String hex = "6174656ec3a7c3a36f";                                  // AAA
ByteBuffer buff = ByteBuffer.allocate(hex.length()/2);
for (int i = 0; i < hex.length(); i+=2) {
    buff.put((byte)Integer.parseInt(hex.substring(i, i+2), 16));
}
buff.rewind();
Charset cs = Charset.forName("UTF-8");                              // BBB
CharBuffer cb = cs.decode(buff);                                    // BBB
System.out.println(cb.toString());                                  // CCC

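Which prints: atenção

Basically, your hex string is the hexadecimal encoding of the bytes that represent the characters of the string atenção when it is encoded in UTF-8.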

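To decode:

First you have to go from your hex string to bytes (AAA).
Then from bytes to chars (BBB). This step depends on the encoding, which in your case is UTF-8.
Finally from chars to a String (CCC).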
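
The three steps above can also be sketched as a small self-contained class. This is only an illustration: the class name HexToUtf8 and the helpers hexToBytes and decodeUtf8 are hypothetical names, not part of the original answer, and the String constructor is used here to do steps BBB and CCC in one call.

```java
import java.nio.charset.StandardCharsets;

public class HexToUtf8 {

    // Step AAA: turn each pair of hex digits into one byte.
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < hex.length(); i += 2) {
            bytes[i / 2] = (byte) Integer.parseInt(hex.substring(i, i + 2), 16);
        }
        return bytes;
    }

    // Steps BBB and CCC: decode the bytes as UTF-8 straight into a String.
    static String decodeUtf8(String hex) {
        return new String(hexToBytes(hex), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // What appears on screen also depends on the console's own encoding.
        System.out.println(decodeUtf8("6174656ec3a7c3a36f"));
    }
}
```

Using StandardCharsets.UTF_8 instead of Charset.forName("UTF-8") avoids a lookup by name; both refer to the same charset.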

score 5 · Accepted Answer

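Your hex string appears to represent a UTF-8 string, not an ISO-8859-1 one.

The reason I can tell is that if it were ISO-8859-1, there would be exactly two hex digits per character. Your hex string has 18 hex digits (9 bytes), but your expected output has only 7 characters. So the hex string must use a variable-width encoding, not one byte per character as ISO-8859-1 does.

The following program produces the output: atenção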

    String hex = "6174656ec3a7c3a36f";
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    for (int i = 0; i < hex.length(); i += 2) {
      String str = hex.substring(i, i + 2);
      int byteVal = Integer.parseInt(str, 16);
      baos.write(byteVal);
    } 
    String s = new String(baos.toByteArray(), Charset.forName("UTF-8"));
    System.out.println(s);

If you change UTF-8 to ISO-8859-1, you will see: atenÃ§Ã£o.
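
To make the counting argument concrete, here is a minimal sketch that decodes the same bytes with both charsets and compares the lengths. The class name CharsetComparison and the helper decode are hypothetical names introduced for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetComparison {

    // Hypothetical helper: decodes a hex string using the given charset.
    static String decode(String hex, Charset cs) {
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, cs);
    }

    public static void main(String[] args) {
        String hex = "6174656ec3a7c3a36f";
        String utf8 = decode(hex, StandardCharsets.UTF_8);
        String latin1 = decode(hex, StandardCharsets.ISO_8859_1);

        System.out.println(hex.length() / 2); // 9 bytes in the input
        System.out.println(utf8.length());    // 7 chars: two of them took 2 bytes each
        System.out.println(latin1.length());  // 9 chars: one char per byte
    }
}
```

With ISO-8859-1 every byte becomes its own character, which is why the two-byte UTF-8 sequences c3 a7 and c3 a3 show up as two characters each.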

score 3 · Accepted Answer

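Java strings are Unicode: each char is a 16-bit UTF-16 code unit. Your data is, I suppose, a "C" string, that is, a sequence of bytes. You have to know the name of the character encoding and use a CharsetDecoder.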

import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;

public class Char8859_1Decoder {

   public static void main( String[] args ) throws CharacterCodingException {
      String hex = "6174656ec3a7c3a36f";
      int len = hex.length();
      byte[] cStr = new byte[len/2];
      for( int i = 0; i < len; i+=2 ) {
         cStr[i/2] = (byte)Integer.parseInt( hex.substring( i, i+2 ), 16 );
      }
      CharsetDecoder decoder = Charset.forName( "UTF-8" ).newDecoder();
      CharBuffer cb = decoder.decode( ByteBuffer.wrap( cStr ));
      System.out.println( cb.toString());
   }
}
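
One practical difference between a CharsetDecoder and the String constructor is how they treat invalid input: a decoder obtained from newDecoder() reports malformed bytes by throwing a CharacterCodingException, while new String(bytes, charset) silently substitutes the replacement character U+FFFD. The sketch below illustrates this with a deliberately truncated input; the class name StrictDecodingDemo, its two helpers, and the sample bytes are assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

public class StrictDecodingDemo {

    // Strict: a fresh CharsetDecoder reports malformed input by default.
    static String decodeStrict(byte[] bytes) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    // Lenient: the String constructor substitutes U+FFFD for bad sequences.
    static String decodeLenient(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical truncated input: 'a' plus only the first byte of a 2-byte sequence.
        byte[] truncated = {0x61, (byte) 0xc3};

        System.out.println(decodeLenient(truncated).length()); // 2: 'a' plus U+FFFD
        try {
            decodeStrict(truncated);
        } catch (CharacterCodingException e) {
            System.out.println("malformed input: " + e);
        }
    }
}
```

Which behavior is preferable depends on whether corrupted hex input should fail loudly or be tolerated.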

score 2 · Accepted Answer

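ç and ã do not fit in a single byte here: in UTF-8 each of them is encoded as two bytes (c3 a7 and c3 a3), so they are not represented by one byte as your decoding routine assumes.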
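
To see what the loop from the question actually produces, the sketch below repeats it and prints the code point of every resulting char. The class name PerByteCastDemo and the method castEachByte are hypothetical names; the loop body is the one from the question.

```java
public class PerByteCastDemo {

    // The loop from the question: one char per pair of hex digits.
    static String castEachByte(String hex) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < hex.length(); i += 2) {
            out.append((char) Integer.parseInt(hex.substring(i, i + 2), 16));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String s = castEachByte("6174656ec3a7c3a36f");
        // Print code points rather than the chars, so the console encoding does not matter.
        for (char c : s.toCharArray()) {
            System.out.print(Integer.toHexString(c) + " ");
        }
        System.out.println();
        System.out.println(s.length()); // 9: every byte became its own char
    }
}
```

Each byte value is mapped to the Unicode code point with the same number, which gives the same result as decoding with ISO-8859-1: the four bytes c3 a7 c3 a3 become four separate chars instead of ç and ã. A console that cannot display those chars may well show them as ?, which would match the output in the question.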

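Rather than converting each byte to a char, I would convert the hex digits to Java bytes and then use the String routines to decode the byte array into a string, letting Java do the tedious work of choosing the decoding routine.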

Of course, Java may guess wrong, so you may need to know the encoding in advance, as in the answers given by @Aubin and @Martin Ellis.
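
The "guess" here is the default charset: new String(bytes) without a charset argument decodes with Charset.defaultCharset(), which is UTF-8 on JDK 18 and later but depended on the platform in earlier releases. A minimal sketch of the difference, with the hypothetical class name DefaultCharsetDemo:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {

    // The nine bytes encoded by the hex string from the question.
    static final byte[] BYTES = {
        0x61, 0x74, 0x65, 0x6e, (byte) 0xc3, (byte) 0xa7, (byte) 0xc3, (byte) 0xa3, 0x6f
    };

    // Relies on whatever the JVM's default charset happens to be.
    static String decodeWithDefault() {
        return new String(BYTES);
    }

    // States the encoding explicitly, so the result is the same everywhere.
    static String decodeExplicit() {
        return new String(BYTES, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("same result: " + decodeWithDefault().equals(decodeExplicit()));
    }
}
```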
