java - Hex encoding and decoding of a string

Question

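I am converting a string from UTF-8 to CP1047 and then hex-encoding it, which works fine. Next, I convert it back: I decode the hex string and display it on the console as UTF-8. The problem is that I do not get back the string I passed to the encoding method. Below is the code I wrote: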

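import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Assumption: Hex and DecoderException come from Apache Commons Codec.
import org.apache.commons.codec.DecoderException;
import org.apache.commons.codec.binary.Hex;

public class HexEncodeDecode {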

    public static void main(String[] args) throws UnsupportedEncodingException,
            DecoderException {
        String reqMsg = "ISO0150000150800C220000080000000040000050000000215102190000000014041615141800001427690161 0B0    000123450041234";
        char[] hexed = getHex(reqMsg, "UTF-8", "Cp1047");

        System.out.println(hexed);

        System.out.println(getString(hexed));
    }

    public static char[] getHex(String source, String inputCharacterCoding,
            String outputCharacterCoding) throws UnsupportedEncodingException {
        return Hex.encodeHex(new String(source.getBytes(inputCharacterCoding),
                outputCharacterCoding).getBytes(), false);
    }

    public static String getString(char[] source) throws DecoderException,
            UnsupportedEncodingException {
        return new String(Hex.decodeHex(source), Charset.forName("UTF-8"));
    }
}

The output I get is:

    C3B1C3AB7CC290C291C295C290C290C290C290C291C295C290C298C290C290C3A41616C290C290C290C290C290C298C290C290C290C290C290C290C290C290C294C290C290C290C290C290C295C290C290C290C290C290C290C29016C291C295C291C29016C291C299C290C290C290C290C290C290C290C290C291C294C290C294C291C296C291C295C291C294C291C298C290C290C290C290C291C2941604C296C299C290C291C296C291C280C290C3A2C290C280C280C280C280C290C290C290C29116C293C294C295C290C290C294C29116C293C294
ñë|äâ

So I need help printing the original input string.

The expected output is:

C3B1C3AB7CC290C291C295C290C290C290C290C291C295C290C298C290C290C3A41616C290C290C290C290C290C298C290C290C290C290C290C290C290C290C294C290C290C290C290C290C295C290C290C290C290C290C290C29016C291C295C291C29016C291C299C290C290C290C290C290C290C290C290C291C294C290C294C291C296C291C295C291C294C291C298C290C290C290C290C291C2941604C296C299C290C291C296C291C280C290C3A2C290C280C280C280C280C290C290C290C29116C293C294C295C290C290C294C29116C293C294
ISO0150000150800C220000080000000040000050000000215102190000000014041615141800001427690161 0B0    000123450041234

score 5 · Accepted Answer

new String(source.getBytes(inputCharacterCoding), outputCharacterCoding)
    .getBytes()

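This probably does not do what you think it does.

First things first: a String has no encoding. Repeat after me: a String has no encoding.

A String is just a sequence of tokens meant to represent characters. It so happens that Java uses a sequence of chars for that purpose. They could just as well be carrier pigeons.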

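UTF-8, CP1047 and the like are just character encodings; each one supports two operations:

encoding: turning a stream of carrier pigeons (chars) into a stream of bytes;
decoding: turning a stream of bytes into a stream of carrier pigeons (chars).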
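
The two operations above can be sketched with the JDK alone. This is a minimal illustration, not code from the question: the class name CharsetBasics and its helper methods are hypothetical, and it uses UTF-8 and ISO-8859-1 only because every JVM is required to support them.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetBasics {
    // Encoding: chars -> bytes. The charset decides which bytes come out.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    // Decoding: bytes -> chars. Must use the charset that produced the bytes.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // a single char: e with acute accent

        // The same String yields different byte sequences per charset.
        byte[] utf8 = encode(text, StandardCharsets.UTF_8);
        byte[] latin1 = encode(text, StandardCharsets.ISO_8859_1);
        System.out.println("UTF-8 bytes: " + utf8.length);
        System.out.println("ISO-8859-1 bytes: " + latin1.length);

        // Decoding with the matching charset recovers the same String.
        System.out.println(text.equals(decode(utf8, StandardCharsets.UTF_8)));
        System.out.println(text.equals(decode(latin1, StandardCharsets.ISO_8859_1)));
    }
}
```

The String itself never changes; only its byte representation depends on the charset named at the encode or decode step.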

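Basically, your base assumption is wrong; you cannot associate an encoding with a String. Your real input should be a byte stream (in practice usually a byte array) which you know is the result of a particular encoding (in your case, UTF-8) and which you want to re-encode using another charset (in your case, CP1047).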
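
Re-encoding a byte array in this sense means decoding with the source charset and encoding with the target charset. The sketch below is only an illustration under stated assumptions: Transcode and transcode() are hypothetical names, and UTF-16BE stands in for CP1047 because not every Java runtime ships the extended charsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class Transcode {
    // Re-encode bytes: decode them with the charset that produced them,
    // then encode the resulting String with the target charset.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        String text = new String(input, from); // bytes -> chars
        return text.getBytes(to);              // chars -> bytes
    }

    public static void main(String[] args) {
        byte[] utf8 = "ISO015".getBytes(StandardCharsets.UTF_8);

        // Assumption: UTF-16BE is used here only because it is always available.
        // On a runtime that provides it, Charset.forName("Cp1047") could be
        // passed as the target instead.
        byte[] recoded = transcode(utf8, StandardCharsets.UTF_8, StandardCharsets.UTF_16BE);

        System.out.println(utf8.length + " bytes in, " + recoded.length + " bytes out");
    }
}
```

Note that the String in the middle is only an intermediate step; the charsets are named at the two byte boundaries and nowhere else.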

The "secret" to a real answer here would be the code of your Hex.encodeHex() method, but you do not show it, so this is as good an answer as I can muster.

score 1 · Accepted Answer

A quick fix (albeit a somewhat ugly one) would be to change getString() to:

public static String getString(char[] source) throws DecoderException, UnsupportedEncodingException {
        return new String(new String(Hex.decodeHex(source), Charset.forName("UTF-8")).getBytes("Cp1047"),"UTF-8");
}

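As fge already mentioned, you are mixing up conversions between chars and bytes, which are two different pairs of shoes. So in this quick fix you first decode the hex-decoded bytes into a String assuming UTF-8, then encode that String into a Cp1047 byte array, and finally decode those bytes back into a String using the UTF-8 charset.

As I said, this is just a quick one-line workaround and not the cleanest solution, because the mistake has already been made during hex encoding.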

score 1 · Accepted Answer

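reqMsg no longer has an encoding, so trying to convert it from UTF-8 to "Cp1047" is pointless (and destructive).

If reqMsg later comes from an external source, such as disk or the network, then you will have to decode it; perhaps that is where the confusion comes from. You would probably do: UTF-8 -> Unicode (String) -> CP1047 -> HEX. When you write it to standard output, the HEX will most likely be ASCII-encoded.

The following example creates an ASCII hex string from your original string after converting it to CP1047 (Unicode -> CP1047 -> HEX):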

    String reqMsg = "ISO0150000150800C220000080000000040000050000000215102190000000014041615141800001427690161 0B0    000123450041234";

    // encode to cp1047 represented as Hex
    byte[] reqMsgBytes = reqMsg.getBytes("Cp1047");
    char[] hex = Hex.encodeHex(reqMsgBytes);
    System.out.println(hex);

    // decode the hex back to bytes, then decode those bytes as Cp1047
    String respMsg = new String(Hex.decodeHex(hex), "Cp1047");
    System.out.println(respMsg);
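
For readers without a hex library on the classpath, the same round trip can be sketched with the JDK alone. This is an illustration under stated assumptions, not the code used above: HexRoundTrip, toHex() and fromHex() are hypothetical names, and the example falls back to UTF-8 when the runtime does not provide Cp1047.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class HexRoundTrip {
    private static final char[] DIGITS = "0123456789ABCDEF".toCharArray();

    // String -> bytes in the given charset -> uppercase hex text.
    static String toHex(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(DIGITS[(b >> 4) & 0x0F]).append(DIGITS[b & 0x0F]);
        }
        return sb.toString();
    }

    // Hex text -> bytes -> String, decoded with the same charset used in toHex.
    static String fromHex(String hex, Charset charset) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit near index " + (2 * i));
            }
            bytes[i] = (byte) ((hi << 4) | lo);
        }
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Assumption: Cp1047 may be missing on some runtimes, so fall back to UTF-8.
        Charset cs = Charset.isSupported("Cp1047")
                ? Charset.forName("Cp1047")
                : StandardCharsets.UTF_8;

        String msg = "ISO0150000150800";
        String hex = toHex(msg, cs);
        System.out.println(hex);
        System.out.println(fromHex(hex, cs));
    }
}
```

The key point is that the same charset is named on both sides of the hex step, so the decoded String matches the input.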
