I was trying to read a file into an array by using FileInputStream, and an ~800KB file took about 3 seconds to read into memory. I then tried the same code except with the FileInputStream wrapped into a BufferedInputStream and it took about 76 milliseconds. Why is reading a file byte by byte done so much faster with a BufferedInputStream even though I'm still reading it byte by byte? Here's the code (the rest of the code is entirely irrelevant). Note that this is the "fast" code. You can just remove the BufferedInputStream if you want the "slow" code:

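    // try-with-resources closes the stream even if read() throws
    try (InputStream is = new BufferedInputStream(new FileInputStream(file))) {

        int[] fileArr = new int[(int) file.length()];

        // read() returns one byte (0-255) per call, or -1 at end of stream
        for (int i = 0, temp; (temp = is.read()) != -1; i++) {
            fileArr[i] = temp;
        }
    }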

BufferedInputStream is over 30 times faster. Far more than that. So, why is this, and is it possible to make this code more efficient (without using any external libraries)?

3 Answers

In FileInputStream, the read() method reads a single byte. From the source code:

/**
 * Reads a byte of data from this input stream. This method blocks
 * if no input is yet available.
 *
 * @return     the next byte of data, or <code>-1</code> if the end of the
 *             file is reached.
 * @exception  IOException  if an I/O error occurs.
 */
public native int read() throws IOException;

This is a native call to the OS, which uses the disk to read the single byte. It is a heavy operation.

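With a BufferedInputStream, the method delegates to an overloaded read() method that reads 8192 bytes (the default buffer size) and buffers them until they are needed. It still returns only a single byte, but keeps the others in reserve. This way the BufferedInputStream makes far fewer native calls to the OS to read from the file.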

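For example, say your file is 32768 bytes long. To get all the bytes into memory with a FileInputStream, you need 32768 native calls to the OS. With a BufferedInputStream, you need only 4, regardless of how many read() calls you make (still 32768).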
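
The arithmetic above can be checked with a small experiment. This is only a sketch: CountingInputStream and countUnderlyingCalls are hypothetical helpers written for this illustration, and a ByteArrayInputStream stands in for the file so the example runs anywhere. It counts how many read requests actually reach the wrapped stream when the caller reads byte by byte.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ReadCallCounter {

    /** Hypothetical helper: counts every read request that reaches the wrapped stream. */
    static final class CountingInputStream extends FilterInputStream {
        long calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Reads the whole source byte by byte; returns how many calls hit the underlying stream. */
    static long countUnderlyingCalls(int size, boolean buffered) {
        // The in-memory source is an assumption standing in for a FileInputStream.
        CountingInputStream counter =
                new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        try (InputStream in = buffered ? new BufferedInputStream(counter) : counter) {
            while (in.read() != -1) {
                // discard the byte; only the call count matters here
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counter.calls;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered: " + countUnderlyingCalls(32768, false));
        System.out.println("buffered:   " + countUnderlyingCalls(32768, true));
    }
}
```

Both counts include one extra call at the end, the one that returns -1 to signal end of stream. With a real FileInputStream each counted call would be a native call, which is where the time goes.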

As for how to make it faster, you might want to consider Java 7's NIO FileChannel class, but I have no evidence to support this.
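
For reference, a minimal sketch of what reading a whole file through a FileChannel could look like. It makes no claim about being faster than the buffered stream; the class and method names (ChannelRead, readWithChannel, roundTrip) are hypothetical, and it assumes the file fits in a single array.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChannelRead {

    /** Reads the whole file into a byte array through a FileChannel. */
    static byte[] readWithChannel(Path path) {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            // Assumption: the file is smaller than 2 GB, so its size fits in an int.
            ByteBuffer buffer = ByteBuffer.allocate((int) channel.size());
            // A single read may return fewer bytes than requested, so loop until full or EOF.
            while (buffer.hasRemaining() && channel.read(buffer) != -1) {
                // keep filling the buffer
            }
            return buffer.array();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper: writes content to a temporary file and reads it back. */
    static byte[] roundTrip(byte[] content) {
        try {
            Path tmp = Files.createTempFile("channel-read", ".bin");
            try {
                Files.write(tmp, content);
                return readWithChannel(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = roundTrip(new byte[] {1, 2, 3, 4});
        System.out.println("bytes read: " + data.length);
    }
}
```

Whether this beats a BufferedInputStream for a given file would have to be measured.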


Note: if you use FileInputStream's read(byte[], int, int) method directly, with a byte[] larger than 8192 bytes, you don't need a BufferedInputStream wrapping it.
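
A minimal sketch of that approach, using only the standard library. The names BulkRead, readFully and roundTrip are hypothetical, and the code assumes the file fits in one array. It reads into a byte[] rather than the int[] from the question, since each call can then deliver many bytes at once.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;

class BulkRead {

    /** Reads the whole file with bulk read(byte[], int, int) calls on a plain FileInputStream. */
    static byte[] readFully(File file) {
        // Assumption: the file is smaller than 2 GB, so its length fits in an int.
        byte[] data = new byte[(int) file.length()];
        try (InputStream in = new FileInputStream(file)) {
            int offset = 0;
            while (offset < data.length) {
                // Each call may deliver fewer bytes than requested, so track the offset.
                int n = in.read(data, offset, data.length - offset);
                if (n == -1) {
                    break; // file ended early; the remaining bytes stay zero
                }
                offset += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return data;
    }

    /** Demo helper: writes content to a temporary file and reads it back. */
    static byte[] roundTrip(byte[] content) {
        try {
            File tmp = File.createTempFile("bulk-read", ".bin");
            try {
                Files.write(tmp.toPath(), content);
                return readFully(tmp);
            } finally {
                tmp.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = roundTrip("hello".getBytes());
        System.out.println("bytes read: " + data.length);
    }
}
```

Because every call asks for all remaining bytes, the number of native calls depends on how much each call returns, not on the number of bytes in the file.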

answered 2013-09-03T19:51:21.187

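A BufferedInputStream wrapped around a FileInputStream requests data from the FileInputStream in big chunks (8192 bytes by default). So if you read 1000 characters one at a time, the FileInputStream only has to go to the disk once. This is much faster!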

answered 2013-09-03T19:51:27.460

This is because of the cost of disk access. Suppose you have a file of size 8 KB. Reading this file byte by byte without a BufferedInputStream takes 8*1024 trips to the disk.

This is where BufferedInputStream comes in: it acts as a middleman between your code and the FileInputStream that reads the file.

In one shot it pulls a chunk of bytes (8 KB by default) into memory, and your code then reads its bytes from this middleman. This reduces the time of the operation.

    // Needs java.io.BufferedInputStream, java.io.FileInputStream and java.io.IOException.

    // Reads the file byte by byte through a BufferedInputStream and prints the elapsed time.
    private void exercise1WithBufferedStream() {
        long start = System.currentTimeMillis();
        try (FileInputStream myFile = new FileInputStream("anyFile.txt");
             BufferedInputStream bufferedInputStream = new BufferedInputStream(myFile)) {
            // Most read() calls are served from the in-memory buffer.
            while (bufferedInputStream.read() != -1) {
                // discard the byte; only the timing matters
            }
        } catch (IOException e) {
            System.out.println("Could not read the stream...");
            e.printStackTrace();
        }
        System.out.println("time passed with buffered:" + (System.currentTimeMillis() - start));
    }


    // Reads the same file byte by byte straight from the FileInputStream for comparison.
    private void exercise1() {
        long start = System.currentTimeMillis();
        try (FileInputStream myFile = new FileInputStream("anyFile.txt")) {
            // Every read() call here goes to the operating system.
            while (myFile.read() != -1) {
                // discard the byte; only the timing matters
            }
        } catch (IOException e) {
            System.out.println("Could not read the stream...");
            e.printStackTrace();
        }
        System.out.println("time passed without buffered:" + (System.currentTimeMillis() - start));
    }

answered 2017-07-12T19:28:23.780