java - Bitwise operations

Question

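I am working on a class project, the Huffman algorithm. After reading a file and generating the Huffman codes (1s and 0s), I have to export them to a new file using bitwise operations. For some reason, when I export using bitwise operations, the file ends up larger than before. Given a string of 1s and 0s representing the preceding characters, I have to pack each 1 and 0 into chains of 8 bits using bitwise operations. This is the code I have: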

byte currentByte = 0;
for (int i = 0, j = 0; i < binaryString.length(); i++, j++) {
    if (binaryString.charAt(i) == '1') {
        currentByte |= (byte) Math.pow(2, 7 - j);
    }
    if (i != 0 && (i % 8 == 0 || i == binaryString.length() - 1)) {
        output.writeObject(currentByte);
        if (i % 8 == 0) {
             currentByte = 0;
             j = 0;
        }
    }
}

Thank you.

score 0 · Accepted Answer

You are using ObjectOutputStream, which is intended for portable serialization of Java objects. If you want to write individual bytes, you should use a FileOutputStream instead.
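
To make the size difference visible, here is a minimal sketch that writes the same byte once with writeObject and once as a raw byte, then compares how many bytes each produced. The class and method names (StreamSizeDemo, sizeWithWriteObject, sizeWithRawWrite) are made up for illustration, and a ByteArrayOutputStream stands in for the file so the example is self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the question's code.
class StreamSizeDemo {
    // Number of bytes produced when one byte value goes through writeObject.
    static int sizeWithWriteObject(byte value) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(sink)) {
            // The byte is autoboxed to java.lang.Byte and serialized as an object,
            // on top of the stream header that ObjectOutputStream always writes.
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.size();
    }

    // Number of bytes produced when the same value is written as a raw byte.
    static int sizeWithRawWrite(byte value) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        sink.write(value); // write(int) stores only the low 8 bits
        return sink.size();
    }

    public static void main(String[] args) {
        System.out.println("writeObject: " + sizeWithWriteObject((byte) 0xCC) + " bytes");
        System.out.println("raw write:   " + sizeWithRawWrite((byte) 0xCC) + " bytes");
    }
}
```

The raw write produces exactly one byte, while the serialized form is considerably larger, which would explain an output file that grows instead of shrinking.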

score 0 · Accepted Answer

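Why are you generating a string of 1s and 0s in the first place? It is a useless extra step that only costs extra time.

The usual approach is to keep a "buffer" of some convenient number of bits (say 32, since that is an int), write a variable number of bits into that buffer for each symbol you encode, and drain whole bytes from the buffer.

For example (not tested, but I have done this before):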

int buffer = 0, bufbits = 0;
for (int i = 0; i < symbols.length; i++)
{
    int s = symbols[i];
    buffer <<= lengths[s];  // make room for the bits
    bufbits += lengths[s];  // buffer got longer
    buffer |= values[s];    // put in the bits corresponding to the symbol

    while (bufbits >= 8)    // as long as there is at least a byte in the buffer
    {
        bufbits -= 8;       // forget it's there
        stream.write((byte)(buffer >>> bufbits)); // and save it
        // note: bits are not removed from the buffer, just forgotten about
        // so it will "overflow", but that is harmless.
        // you will see weird values in the debugger though
    }
}

Don't forget that there may still be something left in the buffer when the loop ends, so write that out separately.
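
As a rough sketch of what that final flush could look like, the following packs codes most-significant-bit first and pads the last byte with zero bits. The names BitBufferDemo, encode, codeValues and codeLengths are assumptions made for this example (the last two play the role of values and lengths above), code lengths are assumed to be at most 25, and a ByteArrayOutputStream is used instead of a file stream.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper illustrating the bit-buffer approach with a final flush.
class BitBufferDemo {
    // codeValues[s] holds the code bits of symbol s, codeLengths[s] its length in bits.
    static byte[] encode(int[] symbols, int[] codeValues, int[] codeLengths) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int buffer = 0, bufbits = 0;
        for (int s : symbols) {
            // Make room for the new code and append it at the low end.
            buffer = (buffer << codeLengths[s]) | codeValues[s];
            bufbits += codeLengths[s];
            // Drain complete bytes; write(int) keeps only the low 8 bits.
            while (bufbits >= 8) {
                bufbits -= 8;
                out.write(buffer >>> bufbits);
            }
        }
        // Final flush: left-align the 1..7 leftover bits and pad with zeros.
        if (bufbits > 0) {
            out.write(buffer << (8 - bufbits));
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        int[] values = {0b0, 0b10, 0b11}; // made-up codes: 0, 10, 11
        int[] lengths = {1, 2, 2};
        byte[] packed = encode(new int[] {0, 1, 2, 0, 1, 2}, values, lengths);
        for (byte b : packed) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

Because the last byte is padded, a real file format would also need to record how many bits or symbols are valid so the decoder can ignore the padding.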

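Some formats require the packing to go the other way around, i.e. the next symbol goes in front of the previous one in the buffer. That is a simple change, though.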
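
One possible reading of that reversed packing is sketched below: each new code is placed above the bits already in the buffer, and bytes are drained from the low end. The names ReversedBitBufferDemo, encode, codeValues and codeLengths are hypothetical, code lengths are again assumed to be at most 25, and whether the bits inside each code also need to be reversed depends on the target format, which this sketch does not address.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper: packs codes starting at the lowest free bit of the buffer.
class ReversedBitBufferDemo {
    static byte[] encode(int[] symbols, int[] codeValues, int[] codeLengths) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int buffer = 0, bufbits = 0;
        for (int s : symbols) {
            // Put the new code in front of (above) the bits already buffered.
            buffer |= codeValues[s] << bufbits;
            bufbits += codeLengths[s];
            // Drain complete bytes from the low end and really remove them.
            while (bufbits >= 8) {
                out.write(buffer & 0xFF);
                buffer >>>= 8;
                bufbits -= 8;
            }
        }
        // Leftover bits already sit at the low end; the unused high bits are zero.
        if (bufbits > 0) {
            out.write(buffer);
        }
        return out.toByteArray();
    }
}
```

Unlike the forward version, the drained bits are shifted out here, so the buffer never carries stale high bits.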

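Using 32 bits means the maximum symbol length is 32 - 7 = 25, since up to 7 bits can be left in the buffer from earlier symbols. That is usually longer than other bounds already placed on symbol lengths (typically 15 or 16). If you need more, using a long gives a maximum symbol length of 57. Very long lengths are inconvenient when decoding (because tables are used; nobody really decodes by walking the tree bit by bit), so they are usually not used.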

score 0 · Accepted Answer

public static void main(String[] args) throws IOException
{
    FileOutputStream output = new FileOutputStream("C:\\temp\\t.dat");
    String inp = "1100110000110011";
    byte[] ar = new byte[1];
    int b = 0;
    int j = 0;
    int i = 0;
    while(i < inp.length())
    {
        if(inp.charAt(i) == '1')
            b |= 1 << (7-j);

        j++;
        i++;
        if(i % 8 == 0)
        {
            //StringBuilder sb = new StringBuilder();
            //sb.append(String.format("%02X ", b));
            //System.out.print(sb.toString());
            ar[0] = (byte)b;
            output.write(ar);
            j = 0;
            b = 0;
        }
    }
    output.close();
}

If you write longer sequences, you might consider using a List<Byte> and appending each byte to it, rather than writing each byte individually.
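
A sketch of that idea follows, with one swap stated plainly: it collects the bytes in a ByteArrayOutputStream rather than a List, since that yields a byte[] directly. The class and method names (BitStringPacker, pack) are invented for this example. It also pads a trailing partial byte with zero bits, which the loop above silently drops.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper: turns a string of '0'/'1' characters into packed bytes.
class BitStringPacker {
    static byte[] pack(String bits) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int current = 0; // bits collected for the byte in progress
        int filled = 0;  // how many bits of that byte are set so far
        for (int i = 0; i < bits.length(); i++) {
            current <<= 1;
            if (bits.charAt(i) == '1') {
                current |= 1;
            }
            if (++filled == 8) {
                bytes.write(current);
                current = 0;
                filled = 0;
            }
        }
        // Left-align any leftover bits and pad the rest of the byte with zeros.
        if (filled > 0) {
            bytes.write(current << (8 - filled));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        for (byte b : pack("1100110000110011")) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

The resulting array can then be written in a single call, for example with output.write(BitStringPacker.pack(inp)).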

score 0 · Accepted Answer

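The problem is that you are using the writeObject method instead of the write method.

The writeObject method writes information about the object as well as the object itself, whereas the write method is meant to simply write a single byte.

You should also use FileOutputStream instead of ObjectOutputStream.

See: ObjectStream.write(byte)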

score 0 · Accepted Answer

You need to change the position of the if:

public static void main(String[] args) {
    String binaryString = "1111111100000010";
    byte currentByte = 0;
    for (int i = 0, j = 0; i < binaryString.length(); i++, j++) {
        if (i != 0 && i % 8 == 0 || i == binaryString.length() - 1) {
            System.out.println(currentByte); // for debug
            currentByte = 0;
            j = 0;
        }
        if (binaryString.charAt(i) == '1') {
            currentByte |= 1 << 7 - j;
        }
    }
}

Output for the binary string:

-1
2

Note that if you have 11111111, this is -1 in the byte type.
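
A small illustration of that point, with made-up names (SignedByteDemo, toUnsigned): Java's byte is signed, so the bit pattern 11111111 prints as -1, and masking with 0xFF recovers the unsigned value.

```java
// Hypothetical demo of how Java interprets the bit pattern 11111111 in a byte.
class SignedByteDemo {
    // Reinterprets the 8 bits of b as an unsigned value in the range 0..255.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte allOnes = (byte) 0b11111111;
        System.out.println(allOnes);             // prints -1
        System.out.println(toUnsigned(allOnes)); // prints 255
    }
}
```

The bits written to the file are the same either way; only the printed interpretation differs. Byte.toUnsignedInt(b) from the standard library does the same conversion.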
