1

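The problem: I have a large text file in which every line is 13 bytes long. I don't want to read the file line by line in the usual way with an InputStream. I'm trying to use NIO channels and MappedByteBuffers to get better performance with limited resources.

Here is what I have so far: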

RandomAccessFile data = new RandomAccessFile("the_file.txt", "rw");
FileChannel channel = data.getChannel();
MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, capacity);

Here capacity is n*13, so that only whole lines fit into the buffer. But this does not work! I fill the buffer like this:

int bytesRead = channel.read(buffer);

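But this does not fill the whole buffer! bytesRead is not equal to capacity and, even worse, in my case bytesRead%13 is not zero, which means the buffer does not contain only whole lines: something is cut off at the end. How can I read a fixed number of bytes into the buffer? In my case I need exactly n*13 bytes, so that the original lines are not split...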

4

3 Answers

2

Taking a quick look at the documentation reveals the truth about the read method.

A read operation might not fill the buffer, and in fact it might not read any bytes at all.

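From this it should be clear that you cannot assume a single read call will fill the buffer. To fill it, you need a loop that keeps reading while there is room left, and that also stops when read returns -1 at the end of the file (otherwise the loop never terminates on a short file), like this:

while (buffer.hasRemaining() && channel.read(buffer) != -1) { /* keep reading until the buffer is full or EOF */ }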
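
The loop above can be wrapped in a small helper that also reports how many bytes were read. This is only a sketch: the class name `ChannelReads`, the method `readFully`, and the constant `LINE_LENGTH` are made up for illustration, and it assumes a blocking channel such as a FileChannel and a plain heap buffer from `ByteBuffer.allocate` rather than the mapped buffer from the question.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

final class ChannelReads {
    // From the question: every line is exactly 13 bytes.
    static final int LINE_LENGTH = 13;

    /**
     * Reads until the buffer is full or the channel reaches end-of-file.
     * Returns the number of bytes actually read by this call.
     * Assumes a blocking channel; a non-blocking one could return 0 repeatedly.
     */
    static int readFully(ReadableByteChannel channel, ByteBuffer buffer) throws IOException {
        int total = 0;
        while (buffer.hasRemaining()) {
            int n = channel.read(buffer);
            if (n < 0) {
                break; // end of file: nothing more to read
            }
            total += n;
        }
        return total;
    }
}
```

With a buffer of `n * LINE_LENGTH` bytes, every call then returns a multiple of 13 unless the file itself ends in a partial line.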

The stream-based java.io classes handle this for you: BufferedReader.readLine(), for example, always returns a complete line.

I suggest starting with a simple BufferedReader and measuring its performance. You can then make a more informed decision about whether to try again with the NIO classes. You may be surprised by how well the stream-based classes perform, and the resulting code will also be easier to read and maintain.
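
As a baseline to measure against, a BufferedReader version could look roughly like the following. The class and method names are hypothetical, and the sketch assumes the file is plain ASCII; a different charset would need a different argument to `Files.newBufferedReader`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class LineReaderBaseline {
    /** Reads the file line by line and returns how many lines were seen. */
    static long countLines(Path file) throws IOException {
        long count = 0;
        // Assumption: the file is ASCII text; readLine() strips the line terminator.
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.US_ASCII)) {
            String line;
            while ((line = reader.readLine()) != null) {
                count++; // process `line` here
            }
        }
        return count;
    }
}
```

Timing this on the real file gives a number to compare any NIO variant against.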

answered 2011-11-19T14:04:28.423
1

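If you are using a MappedByteBuffer, you might as well map the entire file in one go. Java and the operating system's virtual memory system will read data from disk into memory as needed. The whole file is not read into memory at once, unless it is really small. You can then concentrate on your own code and simply access the byte range you are interested in on each loop iteration or read.

The more detailed and more complicated approach you have taken so far (and the corresponding answers here) is better suited to a traditional ByteBuffer, where you explicitly control what is read from disk into memory.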
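
A minimal sketch of this idea, assuming 13-byte ASCII records and a file no larger than Integer.MAX_VALUE bytes (the limit for a single `FileChannel.map` call). `MappedRecords`, `mapAll`, and `record` are illustrative names, not part of any library.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedRecords {
    // From the question: every line is exactly 13 bytes.
    static final int LINE_LENGTH = 13;

    /** Maps the whole file read-only; the mapping stays valid after the channel is closed. */
    static MappedByteBuffer mapAll(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    /** Returns the record at the given zero-based index, including its line terminator. */
    static String record(ByteBuffer map, int index) {
        byte[] bytes = new byte[LINE_LENGTH];
        ByteBuffer view = map.duplicate(); // own position, shared content
        view.position(index * LINE_LENGTH);
        view.get(bytes);
        return new String(bytes, StandardCharsets.US_ASCII);
    }
}
```

Because the record length is fixed, the offset of any line is just `index * LINE_LENGTH`, so no line can be split and no explicit read call is needed.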

answered 2012-02-12T09:21:09.020
1

If you end up with bytesRead%13 != 0, map the next buffer with channel.map(FileChannel.MapMode.READ_WRITE, (bytesRead/13)*13, capacity); and do not process the last bytesRead%13 bytes of each buffer.
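
One way to apply the same idea of starting each buffer on a line boundary is to map the file in successive windows whose offsets and lengths are all multiples of 13. The following is a variation on the suggestion rather than a literal transcription of it; `WindowedMapping`, `countRecords`, and `linesPerWindow` are hypothetical names, and the mapping is read-only for simplicity.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class WindowedMapping {
    // From the question: every line is exactly 13 bytes.
    static final int LINE_LENGTH = 13;

    /** Counts complete records by mapping the file in windows of linesPerWindow lines. */
    static long countRecords(Path file, int linesPerWindow) throws IOException {
        if (linesPerWindow <= 0) {
            throw new IllegalArgumentException("linesPerWindow must be positive");
        }
        // Assumption: one window fits in a single mapping (at most Integer.MAX_VALUE bytes).
        long windowBytes = (long) linesPerWindow * LINE_LENGTH;
        long records = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            long usable = size - (size % LINE_LENGTH); // skip a trailing partial line
            byte[] record = new byte[LINE_LENGTH];
            for (long pos = 0; pos < usable; pos += windowBytes) {
                long length = Math.min(windowBytes, usable - pos);
                MappedByteBuffer window = channel.map(FileChannel.MapMode.READ_ONLY, pos, length);
                while (window.remaining() >= LINE_LENGTH) {
                    window.get(record); // process `record` here
                    records++;
                }
            }
        }
        return records;
    }
}
```

Since every window starts at a multiple of 13 and has a length that is a multiple of 13, no record straddles two windows.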

answered 2011-11-19T12:53:55.853