java - How to read a specific number of bytes from a FileInputStream object using a buffer

Question

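I have a series of objects stored in one file, concatenated as follows:

sizeOfFile1 || file1 || sizeOfFile2 || file2 ...

Each size is a serialized Long object, and each file is just the raw bytes of that file.

I am trying to extract the files from the input file. Here is my code: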

FileInputStream fileInputStream = new FileInputStream("C:\\Test.tst");
ObjectInputStream objectInputStream = new ObjectInputStream(fileInputStream);
while (fileInputStream.available() > 0)
{
  long size = (long) objectInputStream.readObject();
  FileOutputStream fileOutputStream = new FileOutputStream("C:\\" + size + ".tst");
  BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(fileOutputStream);
  int chunkSize = 256;
  final byte[] temp = new byte[chunkSize];
  int finalChunkSize = (int) (size % chunkSize);
  final byte[] finalTemp = new byte[finalChunkSize];
  while(fileInputStream.available() > 0 && size > 0)
  {
    if (fileInputStream.available() > finalChunkSize)
    {
      int i = fileInputStream.read(temp);
      bufferedOutputStream.write(temp, 0, i);
      size = size - i;
    }
    else
    {
      int i = fileInputStream.read(finalTemp);
      bufferedOutputStream.write(finalTemp, 0, i);
      size = 0;
    }
  }
  bufferedOutputStream.close();
}
objectInputStream.close();

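My code fails after reading the first sizeOfFile; when several files are stored, it simply reads the rest of the input file into a single output file.

Can anyone see the problem here?

Regards.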

score 1 · Accepted Answer

Wrap it in a DataInputStream and use readFully(byte[]).
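A minimal sketch of that suggestion follows. It assumes each length was written as a raw 8-byte value with DataOutputStream.writeLong rather than as a serialized Long object, and that each record fits in a byte array; the class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ReadFullyExample {

    // Reads records laid out as: 8-byte big-endian length, then that many raw bytes.
    // Assumption: the lengths were written with writeLong, not writeObject.
    static List<byte[]> readRecords(InputStream source) throws IOException {
        List<byte[]> records = new ArrayList<>();
        DataInputStream in = new DataInputStream(source);
        while (true) {
            long size;
            try {
                size = in.readLong();
            } catch (EOFException e) {
                break; // no further length prefix: treat as end of input
            }
            // Throws ArithmeticException if the record does not fit in an int-sized array.
            byte[] payload = new byte[Math.toIntExact(size)];
            // Blocks until the array is full, or throws EOFException if the stream ends early.
            in.readFully(payload);
            records.add(payload);
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Build a small in-memory container so the example runs without any files.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (String s : new String[] {"first", "second file"}) {
                byte[] data = s.getBytes(StandardCharsets.UTF_8);
                out.writeLong(data.length);
                out.write(data);
            }
        }
        for (byte[] record : readRecords(new ByteArrayInputStream(bytes.toByteArray()))) {
            System.out.println(record.length + " bytes: "
                    + new String(record, StandardCharsets.UTF_8));
        }
    }
}
```

Unlike a bare read(byte[]), readFully never returns a partially filled array, so no byte counting against the remaining size is needed.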

But I question the design. Serialization and random access do not mix. It sounds like you should be using a database.

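Note that you are misusing available(). See the Javadoc for that method: using it as a count of the total number of bytes in the stream is incorrect. There are very few correct uses of available(), and this is not one of them.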
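To illustrate the point about available(), here is a small sketch. The opaqueStream helper is a made-up stream that does not override available(), so it inherits the InputStream default, which returns 0 even though data is still waiting. The loop relies only on the return value of read(), where -1 signals the end of the stream.

```java
import java.io.IOException;
import java.io.InputStream;

class AvailableDemo {

    // Drains a stream by trusting only the return value of read(): -1 means end of stream.
    static long countBytes(InputStream in) throws IOException {
        byte[] buf = new byte[256];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    // Hypothetical stream holding `length` zero bytes. It only implements read(),
    // so available() falls back to the InputStream default and returns 0.
    static InputStream opaqueStream(final int length) {
        return new InputStream() {
            private int remaining = length;

            @Override
            public int read() {
                if (remaining == 0) {
                    return -1;
                }
                remaining--;
                return 0;
            }
        };
    }

    public static void main(String[] args) throws IOException {
        InputStream in = opaqueStream(1000);
        System.out.println("available() reports " + in.available());
        System.out.println("read() delivered " + countBytes(in) + " bytes");
    }
}
```

A loop guarded by available() > 0 would exit immediately on such a stream, while the read() loop consumes all 1000 bytes.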

score 0 · Accepted Answer

You could try NIO...

FileChannel roChannel = new RandomAccessFile(file, "r").getChannel();
ByteBuffer roBuf = roChannel.map(FileChannel.MapMode.READ_ONLY, 0, SIZE);

This will only read SIZE bytes from the file.
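A slightly fuller sketch of the mapping approach, under the assumption that the requested region lies entirely inside the file (a read-only mapping cannot extend past the end of the file). The readSlice helper and the temporary file are illustrative only.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.util.Arrays;

class MapSliceExample {

    // Maps `size` bytes starting at `offset` read-only and copies them into an array.
    static byte[] readSlice(File file, long offset, int size) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r");
             FileChannel channel = raf.getChannel()) {
            MappedByteBuffer region = channel.map(FileChannel.MapMode.READ_ONLY, offset, size);
            byte[] out = new byte[size];
            region.get(out); // copies exactly `size` bytes out of the mapped region
            return out;
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("map-slice", ".bin");
        file.deleteOnExit();
        Files.write(file.toPath(), new byte[] {10, 20, 30, 40, 50, 60});
        // Read three bytes starting at offset 2.
        System.out.println(Arrays.toString(readSlice(file, 2, 3)));
    }
}
```

The offset parameter is what makes this useful for a concatenated container: once a length has been read, the following segment can be mapped directly without copying through an intermediate stream.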


score 0 · Accepted Answer

This uses DataInput to read the longs. In this particular case I am not using readFully(), because a segment might be too large to hold in memory:

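// Needs: java.io.DataInputStream, java.io.EOFException, java.io.FileInputStream, java.io.OutputStream
// Assumes each size was written as a raw 8-byte long (DataOutputStream.writeLong).
DataInputStream in = new DataInputStream(new FileInputStream(...));
byte[] buf = new byte[64*1024];
while (true) {
  long size;
  // readLong() throws EOFException when no further size prefix is left
  try { size = in.readLong(); } catch (EOFException e) { break; }
  OutputStream out = ...;
  while (size > 0) {
    // never ask for more than the segment still has left
    int len = (int) Math.min(buf.length, size);
    len = in.read(buf, 0, len);
    if (len < 0) throw new EOFException("segment is truncated");
    out.write(buf, 0, len);
    size -= len;
  }
  out.close();
}
in.close();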
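For completeness, here is a self-contained round trip under the same assumption as above: the writer emits each length with writeLong, followed by the raw bytes. The class and helper names are invented for this sketch, and in-memory streams stand in for the real files so that it runs as is.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class SegmentRoundTrip {

    // Writes one segment: an 8-byte length, then the raw bytes.
    static void writeSegment(DataOutputStream out, byte[] payload) throws IOException {
        out.writeLong(payload.length);
        out.write(payload);
    }

    // Copies exactly `size` bytes from in to out through a reusable, non-empty buffer.
    static void copySegment(DataInputStream in, OutputStream out, long size, byte[] buf)
            throws IOException {
        while (size > 0) {
            int len = in.read(buf, 0, (int) Math.min(buf.length, size));
            if (len < 0) {
                throw new EOFException("input ended with " + size + " bytes still expected");
            }
            out.write(buf, 0, len);
            size -= len;
        }
    }

    // Splits the container back into segments (collected in memory only for the demo).
    static List<byte[]> readSegments(byte[] container, int bufferSize) throws IOException {
        List<byte[]> segments = new ArrayList<>();
        byte[] buf = new byte[bufferSize];
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(container))) {
            while (true) {
                long size;
                try {
                    size = in.readLong();
                } catch (EOFException e) {
                    break; // no more segments
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                copySegment(in, out, size, buf);
                segments.add(out.toByteArray());
            }
        }
        return segments;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream container = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(container)) {
            writeSegment(out, "hello".getBytes(StandardCharsets.UTF_8));
            writeSegment(out, "a longer segment".getBytes(StandardCharsets.UTF_8));
        }
        // A deliberately tiny buffer forces several read() calls per segment.
        for (byte[] segment : readSegments(container.toByteArray(), 4)) {
            System.out.println(new String(segment, StandardCharsets.UTF_8));
        }
    }
}
```

Because the copy loop caps each read at the bytes remaining in the current segment, it never consumes the next segment's length prefix, which is the failure described in the question.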

score -1 · Accepted Answer

Save yourself a lot of trouble by doing one of the following:

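Switch to Avro; trust me, you would be crazy not to. It is easy to learn and it copes with schema changes. Using ObjectXXXStream is one of the worst ideas ever: as soon as you change the schema, your old files are garbage.
Or use Thrift.
Or use Hibernate (though this may not be a great choice: Hibernate takes a long time to learn and needs a lot of configuration).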

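If you really refuse to switch to Avro, I suggest reading up on Apache's IOUtils class. It has a method that copies from an input stream to an output stream, which saves you a lot of trouble. Unfortunately, what you want is a bit more complicated, because you want each file prefixed with its size. You may be able to do this with a combination of SequenceInputStream objects.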
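One way the SequenceInputStream idea might look, as a rough sketch rather than a tested design: each payload is chained behind an 8-byte big-endian length header, and the prefixed streams are then concatenated. The helper names are hypothetical, and byte arrays stand in for the real files.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class PrefixedStreams {

    // Returns a stream that yields an 8-byte big-endian length followed by the payload.
    static InputStream withLengthPrefix(byte[] payload) {
        byte[] header = ByteBuffer.allocate(Long.BYTES).putLong(payload.length).array();
        return new SequenceInputStream(
                new ByteArrayInputStream(header), new ByteArrayInputStream(payload));
    }

    // Chains all prefixed payloads into one stream: size1 || data1 || size2 || data2 ...
    static InputStream concatenate(List<byte[]> payloads) {
        List<InputStream> parts = new ArrayList<>();
        for (byte[] payload : payloads) {
            parts.add(withLengthPrefix(payload));
        }
        return new SequenceInputStream(Collections.enumeration(parts));
    }

    public static void main(String[] args) throws IOException {
        List<byte[]> payloads = new ArrayList<>();
        payloads.add("one".getBytes(StandardCharsets.UTF_8));
        payloads.add("three".getBytes(StandardCharsets.UTF_8));

        // Read the combined stream back to check the layout.
        try (DataInputStream in = new DataInputStream(concatenate(payloads))) {
            for (int i = 0; i < payloads.size(); i++) {
                byte[] data = new byte[(int) in.readLong()];
                in.readFully(data);
                System.out.println(new String(data, StandardCharsets.UTF_8));
            }
        }
    }
}
```

The combined stream can then be handed to any stream-copy routine to produce the container file in a single pass. The header uses ByteBuffer's default big-endian order, which matches what DataInputStream.readLong expects.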

There are also GZIPOutputStream and ZipOutputStream; both live in java.util.zip in the JDK, so unlike IOUtils they do not need any extra jars on your classpath.

I am not going to write an example, because I really think you should learn Avro or Thrift and use that instead.
