java - Why does String to long conversion differ between C and Java?

Question

I have a string like "01030920316". When I convert this string to a long and then to bytes, Java gives me the output below

output in java : Tag in bytes :  0, 0, 0, 0, 61, 114, -104, 124

When I do the same thing in C, I get this output

output in C : Tag in bytes : 124,152,114,61,0,0,0,0

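I understand the difference between -104 and 152, which is signed versus unsigned, but why are the zeros at the start in Java and at the end in C? Because of this behavior I run into a problem when these bytes reach the C program for validation.

Please explain where the problem occurs.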

Java program:

final byte[] tagBytes = ByteBuffer.allocate(8)
                .putLong(Long.parseLong("01030920316")).array();
System.out.println("Tag in bytes  >> " + Arrays.toString(tagBytes));

C program:

#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>

/** To access long long values as a byte array*/
typedef union uInt64ToByte__
{
    uint64_t m_Value;
    unsigned char m_ByteArray[8];

}uInt64ToByte;

int main()
{
    uInt64ToByte longLongToByteArrayUnion;
    longLongToByteArrayUnion.m_Value = atoll("01030920316");
    printf("%d,%d,%d,%d,%d,%d,%d,%d",
           longLongToByteArrayUnion.m_ByteArray[0], longLongToByteArrayUnion.m_ByteArray[1],
           longLongToByteArrayUnion.m_ByteArray[2], longLongToByteArrayUnion.m_ByteArray[3],
           longLongToByteArrayUnion.m_ByteArray[4], longLongToByteArrayUnion.m_ByteArray[5],
           longLongToByteArrayUnion.m_ByteArray[6], longLongToByteArrayUnion.m_ByteArray[7]);
    return 0;
}

score 14 · Accepted Answer

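output in java : Tag in bytes : 0, 0, 0, 0, 61, 114, -104, 124

Java's ByteBuffer defaults to big endian, and Java bytes are signed, so byte values greater than 127 are shown as negative numbers.

output in C : Tag in bytes : 124,152,114,61,0,0,0,0

The C union exposes the value in the machine's native byte order, which is little endian on x86/x64 systems. An unsigned char has a range of 0 to 255.

To produce the same output in Java as in C, you can do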

final byte[] tagBytes = ByteBuffer.allocate(8).order(ByteOrder.nativeOrder())
        .putLong(Long.parseLong("01030920316")).array();
int[] unsigned = new int[tagBytes.length];
for (int i = 0; i < tagBytes.length; i++)
    unsigned[i] = tagBytes[i] & 0xFF;
System.out.println("Tag in bytes  >> " + Arrays.toString(unsigned));

prints

Tag in bytes  >> [124, 152, 114, 61, 0, 0, 0, 0]

score 4 · Answer

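Strings are stored differently in Java and C. You have to remember that applications written in C are native, whereas Java applications run in the Java Virtual Machine. Java bytecode is platform independent, which is why your Java code behaves the same on all OS/processor architectures. On the other hand, the order in which characters are stored can differ in C (edit: on different architectures).

Edit2: Say we have the number 109, which is 1101101 in binary. Why? 1 * 64 + 1 * 32 + 0 * 16 + 1 * 8 + 1 * 4 + 0 * 2 + 1 * 1 = 109. The leftmost bit is called the "most significant" bit because it has the greatest weight (2^6 = 64), and the rightmost bit is called the "least significant" bit because it has the smallest weight (just 1).

109 is boring because it fits in a single byte. Say we have something larger, for example 1000, which is 00000011 11101000 in binary. It is stored in two bytes (call them X and Y). Now we can save that number as XY (big-endian) or YX (little-endian). In big-endian, the first byte (the one with the lowest address) is the most significant byte. In little-endian, the first byte is the least significant byte. x86 is little-endian, the JVM is big-endian.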
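
The 1000 example can be checked with a short Java sketch. The class and method names (`ByteOrderSketch`, `shortBytes`) are made up for illustration; only `ByteBuffer` and `ByteOrder` come from the JDK. The bytes are widened to `int` so they print as 0..255.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class ByteOrderSketch {
    // Returns the two bytes of a 16-bit value in the requested order,
    // widened to int (0..255) so they print as unsigned values.
    static int[] shortBytes(short value, ByteOrder order) {
        byte[] raw = ByteBuffer.allocate(2).order(order).putShort(value).array();
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = raw[i] & 0xFF;
        }
        return out;
    }

    public static void main(String[] args) {
        short value = 1000; // 00000011 11101000 in binary
        // XY: most significant byte first -> [3, 232]
        System.out.println(Arrays.toString(shortBytes(value, ByteOrder.BIG_ENDIAN)));
        // YX: least significant byte first -> [232, 3]
        System.out.println(Arrays.toString(shortBytes(value, ByteOrder.LITTLE_ENDIAN)));
    }
}
```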

score 2 · Answer

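This is the difference between big endian and little endian.

When you convert a number to a byte array in C++, you see whether the underlying system is big endian (the most significant byte of a multi-byte integer is stored first) or little endian (the least significant byte is stored first).

Java, on the other hand, hides the endianness of the underlying system by always using big endian. This is part of Java's "write once, run anywhere" philosophy.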
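
To see this on a given machine, a small sketch (the class and method names are hypothetical) can print the order a freshly allocated `ByteBuffer` uses next to the order `ByteOrder.nativeOrder()` reports for the hardware. The first is big endian on every platform; the second depends on the machine.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class NativeOrderSketch {
    // The byte order of a newly allocated ByteBuffer, before any call to order(...).
    static ByteOrder defaultBufferOrder() {
        return ByteBuffer.allocate(8).order();
    }

    public static void main(String[] args) {
        System.out.println("ByteBuffer default: " + defaultBufferOrder());
        // Platform dependent: LITTLE_ENDIAN on x86/x64.
        System.out.println("native order:       " + ByteOrder.nativeOrder());
    }
}
```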

score 0 · Answer

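C++ uses the native format for its types. Java uses a format defined by its standard, which corresponds to the native format on a Sparc but differs from the native format on a PC.

In general, for non-character types, there is no reason to assume that a dump of the bytes will be the same on two different platforms, even if they hold the same value. (Depending on the platform, they might not even have the same size. I know of 32, 36, 48 and 64 bit longs in C++; in Java they are always 64 bits.)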

score 0 · Answer

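First, why the order appears reversed: that's because the putLong method of class ByteBuffer puts the bytes into the array in big-endian order. If you want them in little-endian order, set the order on the ByteBuffer: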

final byte[] tagBytes = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN)
        .putLong(Long.parseLong("01030920316")).array();

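Second, why you get -104 in Java and 152 in C: in C you are using unsigned char, while Java's byte type is signed, not unsigned. The content of the byte is actually the same, but it shows up as -104 when you interpret it as a signed integer and as 152 when you interpret it as an unsigned integer.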
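
A minimal sketch of the two readings of that one byte (`SignedByteSketch` and `toUnsigned` are illustrative names; `Byte.toUnsignedInt` is the JDK equivalent since Java 8):

```java
class SignedByteSketch {
    // Reinterprets a Java byte (-128..127) as an unsigned value (0..255).
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0x98;                      // bit pattern 10011000
        System.out.println(b);                     // signed view: -104
        System.out.println(toUnsigned(b));         // unsigned view: 152
        System.out.println(Byte.toUnsignedInt(b)); // same as toUnsigned(b)
    }
}
```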

score 0 · Answer

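Since Java's integer representation does not depend on the platform, I would take it as the reference when comparing, so I'd rather write the C code so that it takes the platform dependence of C's integer representation into account.

Following that, I propose the C code below to produce the byte printout asked for by the OP: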

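#define _DEFAULT_SOURCE /* exposes htobe64() on glibc 2.20 and later */
#define _BSD_SOURCE     /* same, for older glibc */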

#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>

#if defined(__linux__)
#  include <endian.h>
#elif defined(__FreeBSD__) || defined(__NetBSD__)
#  include <sys/endian.h>
#elif defined(__OpenBSD__)
#  include <sys/types.h>
#  define be16toh(x) betoh16(x) /* -+ */
#  define be32toh(x) betoh32(x) /* -+--> not needed in this example */
#  define be64toh(x) betoh64(x) /* -+ */
#endif

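int main(void)
{
  uint64_t uint64 = htobe64(atoll("01030920316")); /* convert to big endian/network byte order */
  /* Read the bytes in memory order; shifting the value would only work on little-endian hosts. */
  const unsigned char *bytes = (const unsigned char *) &uint64;

  for (size_t i = 0; i < sizeof(uint64); ++i)
  {
    printf("%hhd, ", (signed char) bytes[i]); /* signed, to match Java's byte */
  }

  printf("\n");

  return 0;
}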
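
For the opposite direction, here is a hedged Java sketch of decoding eight bytes back into a long once both sides agree on a byte order. The class and method names (`TagDecodeSketch`, `decode`) are assumptions for illustration, and the byte arrays are the two printouts from the question.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class TagDecodeSketch {
    // Decodes 8 bytes into a long, given the byte order the sender used.
    static long decode(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getLong();
    }

    public static void main(String[] args) {
        // Bytes as printed by the Java program in the question (big endian).
        byte[] bigEndian = {0, 0, 0, 0, 61, 114, (byte) 152, 124};
        // Bytes as printed by the C program on x86/x64 (little endian).
        byte[] littleEndian = {124, (byte) 152, 114, 61, 0, 0, 0, 0};

        System.out.println(decode(bigEndian, ByteOrder.BIG_ENDIAN));       // 1030920316
        System.out.println(decode(littleEndian, ByteOrder.LITTLE_ENDIAN)); // 1030920316
    }
}
```

Both calls give back the same number, so the two printouts in the question hold the same value and differ only in byte order.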
