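RandomAccessFile is quite slow for random access to files. You often read about implementing a buffered layer over it, but code that does this is hard to find online.
So my question is: if you know of an open-source implementation of such a class, could you share a pointer, or share your own implementation?
It would be nice if this question turned into a collection of useful links and code for this problem, which I'm sure many people share and which Sun never properly addressed.
Please don't refer me to memory mapping, as the files can be larger than Integer.MAX_VALUE.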
You can make a BufferedInputStream from a RandomAccessFile with code like this:
RandomAccessFile raf = ...
FileInputStream fis = new FileInputStream(raf.getFD());
BufferedInputStream bis = new BufferedInputStream(fis);
A note on usage: you would probably want to use it something like this:
RandomAccessFile raf = ...
FileInputStream fis = new FileInputStream(raf.getFD());
BufferedInputStream bis = new BufferedInputStream(fis);
//do some reads with buffer
bis.read(...);
bis.read(...);
//seek to a different section of the file, so discard the previous buffer
raf.seek(...);
bis = new BufferedInputStream(fis);
bis.read(...);
bis.read(...);
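The pattern above can be wrapped in a small helper. What follows is a minimal sketch, not a tested library: the class name SeekableBufferedReader and its methods are made up for illustration. It relies on the FileInputStream sharing its file position with the RandomAccessFile through the common file descriptor, which is why the old buffer has to be thrown away after every seek.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;

/** Hypothetical helper: buffered sequential reads with explicit seeks. */
final class SeekableBufferedReader implements AutoCloseable {
    private final RandomAccessFile raf;
    private final FileInputStream fis;
    private BufferedInputStream bis;

    SeekableBufferedReader(File file) throws IOException {
        raf = new RandomAccessFile(file, "r");
        fis = new FileInputStream(raf.getFD()); // shares the file position with raf
        bis = new BufferedInputStream(fis);
    }

    /** Moves to an absolute offset and discards whatever was buffered. */
    void seek(long pos) throws IOException {
        raf.seek(pos);
        bis = new BufferedInputStream(fis);
    }

    /** Returns the next byte as 0-255, or -1 at end of file. */
    int read() throws IOException {
        return bis.read();
    }

    @Override
    public void close() throws IOException {
        raf.close(); // closes the shared file descriptor
    }

    /** Demo: writes bytes 0..99 to a temp file, then reads around two seeks. */
    static int[] demo() {
        try {
            File tmp = File.createTempFile("seekable", ".bin");
            tmp.deleteOnExit();
            byte[] data = new byte[100];
            for (int i = 0; i < data.length; i++) data[i] = (byte) i;
            Files.write(tmp.toPath(), data);
            try (SeekableBufferedReader in = new SeekableBufferedReader(tmp)) {
                int a = in.read();   // offset 0
                int b = in.read();   // offset 1, served from the buffer
                in.seek(50);
                int c = in.read();   // offset 50
                in.seek(10);
                int d = in.read();   // offset 10
                return new int[] { a, b, c, d };
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Every seek costs a fresh buffer fill, so this only pays off when several reads follow each seek.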
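Well, I don't see a reason not to use java.nio.MappedByteBuffer, even if the files are bigger than Integer.MAX_VALUE.
Evidently you will not be able to define a single MappedByteBuffer for the whole file. But you can have several MappedByteBuffers accessing different regions of the file.
The position and size parameters of FileChannel.map are of type long, which means the position can be beyond Integer.MAX_VALUE; the only restriction is that the size of a single buffer cannot be larger than Integer.MAX_VALUE.
Therefore, you could define several mappings like this: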
buffer[0] = fileChannel.map(FileChannel.MapMode.READ_WRITE, 0L, Integer.MAX_VALUE);
buffer[1] = fileChannel.map(FileChannel.MapMode.READ_WRITE, 2147483647L, Integer.MAX_VALUE);
buffer[2] = fileChannel.map(FileChannel.MapMode.READ_WRITE, 4294967294L, Integer.MAX_VALUE);
...
In summary, the size cannot be bigger than Integer.MAX_VALUE, but the start position can be anywhere in the file.
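To make the idea concrete, here is a minimal sketch that hides the array of mappings behind a single long offset. The class ChunkedMappedFile and its get method are hypothetical names used only for illustration, and the demo uses a tiny chunk size of 4 bytes so the arithmetic is easy to follow; in practice the chunk size would be close to Integer.MAX_VALUE.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: addresses a file by long offset through several mappings. */
final class ChunkedMappedFile {
    private final MappedByteBuffer[] chunks;
    private final long chunkSize;

    ChunkedMappedFile(FileChannel channel, long chunkSize) throws IOException {
        if (chunkSize <= 0 || chunkSize > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("chunkSize must be in 1..Integer.MAX_VALUE");
        }
        this.chunkSize = chunkSize;
        long size = channel.size();
        int count = (int) ((size + chunkSize - 1) / chunkSize); // round up
        chunks = new MappedByteBuffer[count];
        for (int i = 0; i < count; i++) {
            long start = i * chunkSize;                      // may exceed Integer.MAX_VALUE
            long length = Math.min(chunkSize, size - start); // last chunk may be shorter
            chunks[i] = channel.map(FileChannel.MapMode.READ_ONLY, start, length);
        }
    }

    /** Reads the byte at an absolute file offset. */
    byte get(long pos) {
        return chunks[(int) (pos / chunkSize)].get((int) (pos % chunkSize));
    }

    /** Demo: a 10-byte file mapped as chunks [0,4), [4,8), [8,10). */
    static byte[] demo() {
        try {
            File tmp = File.createTempFile("chunked", ".bin");
            tmp.deleteOnExit();
            byte[] data = new byte[10];
            for (int i = 0; i < data.length; i++) data[i] = (byte) i;
            Files.write(tmp.toPath(), data);
            try (FileChannel ch = FileChannel.open(tmp.toPath(), StandardOpenOption.READ)) {
                ChunkedMappedFile f = new ChunkedMappedFile(ch, 4);
                return new byte[] { f.get(0), f.get(3), f.get(4), f.get(9) };
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A multi-byte value that straddles a chunk boundary cannot be read with a single call in this layout and would have to be assembled byte by byte.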
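In the book Java NIO, the author Ron Hitchens states:
"Accessing a file through the memory-mapping mechanism can be far more efficient than reading or writing data by conventional means, even when using channels. No explicit system calls need to be made, which can be time-consuming. More importantly, the virtual memory system of the operating system automatically caches memory pages. These pages will be cached using system memory and will not consume space from the JVM's memory heap.
Once a memory page has been made valid (brought in from disk), it can be accessed again at full hardware speed without the need to make another system call to get the data. Large, structured files that contain indexes or other sections that are referenced or updated frequently can benefit tremendously from memory mapping. When combined with file locking to protect critical sections and control transactional atomicity, you begin to see how memory mapped buffers can be put to good use."
I really doubt that you will find a third-party API doing something better than that. Perhaps you may find an API written on top of this architecture that simplifies the work.
Don't you think that this approach ought to work for you?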
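"RandomAccessFile is quite slow for random access to files. You often read about implementing a buffered layer over it, but code that does this is hard to find online."
Well, it can be found online.
For one, the JAI source code in jpeg2000 has an implementation, and there is an even less encumbered implementation at:
http://www.unidata.ucar.edu/software/netcdf-java/
Documentation: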
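If you're running on a 64-bit machine, then memory-mapped files are your best approach. Simply map the entire file into an array of equal-sized buffers, then pick a buffer for each record as needed (that is, edalorzo's answer, except that you want overlapping buffers so that no record spans a boundary).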
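One way to read "overlapping buffers" is to start a new mapping every stride bytes but let each mapping extend past its stride by the largest record size, so a record that begins anywhere inside a stride is fully contained in that stride's mapping. The sketch below makes that assumption; OverlappingMappedFile, stride and maxRecordSize are hypothetical names, and the demo uses toy sizes.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: mappings overlap by maxRecordSize so records never straddle two buffers. */
final class OverlappingMappedFile {
    private final MappedByteBuffer[] chunks;
    private final long stride;

    OverlappingMappedFile(FileChannel channel, long stride, int maxRecordSize) throws IOException {
        if (stride <= 0 || maxRecordSize < 0 || stride + maxRecordSize > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("stride + maxRecordSize must fit in an int");
        }
        this.stride = stride;
        long size = channel.size();
        int count = (int) ((size + stride - 1) / stride); // round up
        chunks = new MappedByteBuffer[count];
        for (int i = 0; i < count; i++) {
            long start = i * stride;
            // Each mapping runs maxRecordSize bytes into the next stride (clamped at end of file).
            long length = Math.min(stride + maxRecordSize, size - start);
            chunks[i] = channel.map(FileChannel.MapMode.READ_ONLY, start, length);
        }
    }

    /** Returns a view of one record; length must not exceed maxRecordSize. */
    ByteBuffer record(long pos, int length) {
        ByteBuffer view = chunks[(int) (pos / stride)].duplicate();
        int offset = (int) (pos % stride);
        view.position(offset);
        view.limit(offset + length);
        return view.slice();
    }

    /** Demo: 10-byte file, stride 4, records up to 3 bytes; reads a record crossing offset 4. */
    static byte[] demo() {
        try {
            File tmp = File.createTempFile("overlap", ".bin");
            tmp.deleteOnExit();
            byte[] data = new byte[10];
            for (int i = 0; i < data.length; i++) data[i] = (byte) i;
            Files.write(tmp.toPath(), data);
            try (FileChannel ch = FileChannel.open(tmp.toPath(), StandardOpenOption.READ)) {
                OverlappingMappedFile f = new OverlappingMappedFile(ch, 4, 3);
                ByteBuffer rec = f.record(3, 3); // bytes 3, 4, 5 span the stride boundary
                return new byte[] { rec.get(0), rec.get(1), rec.get(2) };
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The price of the overlap is a little extra address space per mapping, which is usually negligible on a 64-bit JVM.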
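If you're running on a 32-bit JVM, then you're stuck with RandomAccessFile. However, you can use it to read a byte[] that contains your entire record, then use a ByteBuffer to retrieve individual values from that array. At worst you should need to make two file accesses: one to retrieve the position/size of the record, and one to retrieve the record itself.
However, be aware that you can start stressing the garbage collector if you create lots of byte[]s, and you'll remain IO-bound if you bounce all over the file.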
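The byte[] plus ByteBuffer approach might look roughly like the sketch below. RecordReader and readRecord are hypothetical names, and the record layout in the demo (an int id followed by a long timestamp, 12 bytes in total) is an assumption chosen only to have something to decode.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

/** Hypothetical helper: one file access per record, decoding done in memory. */
final class RecordReader {
    /** Reads length bytes starting at pos in a single call and wraps them for decoding. */
    static ByteBuffer readRecord(RandomAccessFile raf, long pos, int length) throws IOException {
        byte[] record = new byte[length];
        raf.seek(pos);
        raf.readFully(record); // throws EOFException if the file ends early
        return ByteBuffer.wrap(record);
    }

    /** Demo: writes two 12-byte records, then decodes the second one. */
    static long[] demo() {
        try {
            File tmp = File.createTempFile("records", ".bin");
            tmp.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
                // Assumed layout: int id, long timestamp (both big-endian, as RandomAccessFile writes them)
                raf.writeInt(7);
                raf.writeLong(1000L);
                raf.writeInt(8);
                raf.writeLong(2000L);
                ByteBuffer rec = readRecord(raf, 12, 12); // second record starts at offset 12
                return new long[] { rec.getInt(), rec.getLong() };
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

ByteBuffer decodes big-endian by default, which matches what RandomAccessFile.writeInt and writeLong produce; reusing one byte[] across records of the same size would ease the garbage collector pressure mentioned above.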
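The Apache PDFBox project has a nice and well-tested BufferedRandomAccessFile class.
It is licensed under the Apache License, Version 2.0.
It is an optimized version of the java.io.RandomAccessFile class as described by Nick Zhang on JavaWorld.com. It is based on the jmzreader implementation and augmented to handle unsigned bytes.
See the source code here: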
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;
/**
* Adds caching to a random access file.
*
* Rather than writing directly to disk or to the system, which seems to be
* what RandomAccessFile and FileChannel do, this adds a small buffer and reads
* from or writes to it when possible. A single buffer is created, which means
* reads or writes near each other will be sped up. Reads and writes that fall
* outside the cached block will not be sped up.
*
*/
public class BufferedRandomAccessFile implements AutoCloseable {
private static final int DEFAULT_BUFSIZE = 4096;
/**
* The wrapped random access file, we will hold a cache around it.
*/
private final RandomAccessFile raf;
/**
* The size of the buffer
*/
private final int bufsize;
/**
* The buffer.
*/
private final byte[] buf;
/**
* Current position in the file.
*/
private long pos = 0;
/**
* When the buffer has been read, this tells us where in the file the buffer
* starts at.
*/
private long bufBlockStart = Long.MAX_VALUE;
// Must be updated on write to the file
private long actualFileLength = -1;
// True when the buffer holds writes that have not reached the file yet.
private boolean changeMadeToBuffer = false;
// Must be updated as we write to the buffer.
private long virtualFileLength = -1;
public BufferedRandomAccessFile(File name, String mode) throws FileNotFoundException {
this(name, mode, DEFAULT_BUFSIZE);
}
/**
* @param file the file to open
* @param mode how to open the random access file.
* @param b size of the buffer
* @throws FileNotFoundException
*/
public BufferedRandomAccessFile(File file, String mode, int b) throws FileNotFoundException {
this(new RandomAccessFile(file, mode), b);
}
public BufferedRandomAccessFile(RandomAccessFile raf) throws FileNotFoundException {
this(raf, DEFAULT_BUFSIZE);
}
public BufferedRandomAccessFile(RandomAccessFile raf, int b) {
this.raf = raf;
try {
this.actualFileLength = raf.length();
} catch (IOException e) {
throw new RuntimeException(e);
}
this.virtualFileLength = actualFileLength;
this.bufsize = b;
this.buf = new byte[bufsize];
}
/**
* Sets the position of the byte at which the next read/write should occur.
*
* @param pos
* @throws IOException
*/
public void seek(long pos) throws IOException{
this.pos = pos;
}
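/**
* Sets the length of the file, flushing any buffered writes first.
*/
public void setLength(long fileLength) throws IOException {
writeBufferToDisk();
this.raf.setLength(fileLength);
// Keep both lengths in step with the file, whether it grew or shrank.
actualFileLength = fileLength;
virtualFileLength = fileLength;
// Invalidate the buffer so truncated bytes are not served from the cache.
bufBlockStart = Long.MAX_VALUE;
}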
/**
* Writes the entire buffer to disk, if needed.
*/
private void writeBufferToDisk() throws IOException {
if(!changeMadeToBuffer) return;
int amountOfBufferToWrite = (int) Math.min((long) bufsize, virtualFileLength - bufBlockStart);
if(amountOfBufferToWrite > 0) {
raf.seek(bufBlockStart);
raf.write(buf, 0, amountOfBufferToWrite);
this.actualFileLength = virtualFileLength;
}
changeMadeToBuffer = false;
}
/**
* Flush the buffer to disk and force a sync.
*/
public void flush() throws IOException {
writeBufferToDisk();
this.raf.getChannel().force(false);
}
/**
* Based on pos, ensures that the buffer is one that contains pos
*
* After this call it will be safe to write to the buffer to update the byte at pos,
* if this returns true reading of the byte at pos will be valid as a previous write
* or set length has caused the file to be large enough to have a byte at pos.
*
* @return true if the buffer contains any data that may be read. Data may be read so long as
* a write or setLength has made the file longer than the current position.
*/
private boolean readyBuffer() throws IOException {
boolean isPosOutSideOfBuffer = pos < bufBlockStart || bufBlockStart + bufsize <= pos;
if (isPosOutSideOfBuffer) {
writeBufferToDisk();
// The buffer is always positioned to start at a multiple of a bufsize offset.
// e.g. for a buf size of 4 the starting positions of buffers can be at 0, 4, 8, 12..
// Work out where the buffer block should start for the given position.
long bufferBlockStart = (pos / bufsize) * bufsize;
assert bufferBlockStart >= 0;
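// Read the block from disk if the file reaches it; read() may return fewer bytes
// than requested, so loop until the buffer is full or the end of file is hit.
int filled = 0;
if(bufferBlockStart < actualFileLength) {
raf.seek(bufferBlockStart);
while (filled < bufsize) {
int n = raf.read(buf, filled, bufsize - filled);
if (n < 0) break;
filled += n;
}
}
// Zero the rest so stale bytes from the previous block are never written back.
// Either way the buffer is now ready to have writes made to it.
java.util.Arrays.fill(buf, filled, bufsize, (byte) 0);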
bufBlockStart = bufferBlockStart;
}
return pos < virtualFileLength;
}
/**
* Reads a byte from the file, returning an integer of 0-255, or -1 if it has reached the end of the file.
*
* @return the byte read as an int in the range 0-255, or -1 at end of file
* @throws IOException
*/
public int read() throws IOException {
if(!readyBuffer()) {
return -1;
}
int value = buf[(int)(pos - bufBlockStart)] & 0xff;
pos++;
return value;
}
/**
* Write a single byte to the file.
*
* @param b
* @throws IOException
*/
public void write(byte b) throws IOException {
readyBuffer(); // ignore result we don't care.
buf[(int)(pos - bufBlockStart)] = b;
changeMadeToBuffer = true;
pos++;
if(pos > virtualFileLength) {
virtualFileLength = pos;
}
}
/**
* Writes all given bytes to the random access file at the current position.
*/
public void write(byte[] bytes) throws IOException {
int written = 0;
int bytesToWrite = bytes.length;
// First fill the current buffer block as far as it goes.
{
readyBuffer();
int startPositionInBuffer = (int)(pos - bufBlockStart);
int lengthToWriteToBuffer = Math.min(bytesToWrite - written, bufsize - startPositionInBuffer);
assert startPositionInBuffer + lengthToWriteToBuffer <= bufsize;
System.arraycopy(bytes, written,
buf, startPositionInBuffer,
lengthToWriteToBuffer);
pos += lengthToWriteToBuffer;
if(pos > virtualFileLength) {
virtualFileLength = pos;
}
written += lengthToWriteToBuffer;
this.changeMadeToBuffer = true;
}
// Just write the rest to the random access file
if(written < bytesToWrite) {
writeBufferToDisk();
int toWrite = bytesToWrite - written;
// Make sure the underlying file pointer matches our logical position.
raf.seek(pos);
raf.write(bytes, written, toWrite);
pos += toWrite;
if(pos > virtualFileLength) {
virtualFileLength = pos;
actualFileLength = virtualFileLength;
}
}
}
/**
* Reads up to bytes.length bytes into the given array.
*
* @return the number of bytes read, which is less than bytes.length if the end of the file was reached.
*/
public int read(byte[] bytes) throws IOException {
int read = 0;
int bytesToRead = bytes.length;
while(read < bytesToRead) {
//First see if we need to fill the cache
if(!readyBuffer()) {
//No more to read;
return read;
}
//Now read as much as we can (or need) from the cache and place it
//in the given byte[], never going past the end of the file.
int startPositionInBuffer = (int)(pos - bufBlockStart);
long remainingInFile = virtualFileLength - pos;
int lengthToReadFromBuffer = (int) Math.min(remainingInFile,
Math.min(bytesToRead - read, bufsize - startPositionInBuffer));
System.arraycopy(buf, startPositionInBuffer, bytes, read, lengthToReadFromBuffer);
pos += lengthToReadFromBuffer;
read += lengthToReadFromBuffer;
}
return read;
}
public void close() throws IOException {
try {
this.writeBufferToDisk();
} finally {
raf.close();
}
}
/**
* Gets the length of the file, including buffered writes that have not been flushed yet.
*
* @return the length in bytes
* @throws IOException
*/
public long length() throws IOException{
return virtualFileLength;
}
}