I'm trying to process a large file in 10MB chunks, reading one 10MB byte array at a time (not reading the entire file into one huge byte array and then splitting it — the whole problem is memory, after all).

This is what I have so far:

private byte[] readFile(File file, int offset) throws IOException
{
    BufferedInputStream inStream = null;
    ByteArrayOutputStream outStream = null;
    byte[] buf = new byte[1048576];
    int read = 0;

    try
    {
        inStream = new BufferedInputStream(new FileInputStream(file));
        outStream = new ByteArrayOutputStream();
        long skipped = inStream.skip(offset);
        read = inStream.read(buf);
        if (read != -1)
        {
            outStream.write(buf, 0, read);
            return outStream.toByteArray();
        }
    }
    finally
    {
        if (inStream != null) {try {inStream.close();} catch (IOException e) {}}
        if (outStream != null) {try {outStream.close();} catch (IOException e) {}}
    }

    return null;
}

the parameter offset will be in 10MB increments as well.

So the problem I'm having is this: even though the `skipped` long variable reports 1048576 bytes skipped, the second chunk I'm supposed to receive from calling readFile(file, 1048576) is identical to the first chunk returned for the first call. So it didn't really skip anything at all.

What's the problem here? Is there another way of implementing this idea?

2 Answers

Redesign the method. At present you are copying byte arrays like it's going out of style: once from the buffer into the ByteArrayOutputStream, and again from there into the return value, so you need three of them in memory at once. Change the method's signature so that the caller provides the byte array as well as the offset and the stream, and have it return the count. In other words, get rid of the method altogether and just call FileInputStream.read(buffer, offset, length) from wherever you are calling this.
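A minimal sketch of that redesign (class and method names here are illustrative, not from the answer): the caller owns a single reusable buffer, and the stream stays open across chunks, so neither skip() nor any array copies are needed.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ChunkReader {
    /**
     * Reads the whole file through one reusable caller-supplied buffer.
     * Each read() returns how many bytes of the buffer are valid for
     * that chunk; only buffer[0..count) may be consumed. Returns the
     * total number of bytes read.
     */
    public static long readAllChunks(File file, byte[] buffer) throws IOException {
        long total = 0;
        try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
            int count;
            while ((count = in.read(buffer, 0, buffer.length)) != -1) {
                // hand (buffer, count) to whatever consumes the chunk here
                total += count;
            }
        }
        return total;
    }
}
```

Because the stream is never reopened, the file position advances naturally from one read to the next, which is exactly what the original skip-per-call approach was fighting against.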

answered 2013-09-16T19:13:24.873
So, as per user @EJP, I revised the code to work efficiently. I am no longer copying to a ByteArrayOutputStream, as I realized that .toByteArray() actually returns a copy of the bytes read and is very memory-inefficient. I also open the stream only once, so the skipping is no longer needed.

    int fileLength = (int) file.length();
    byte[] buffer = new byte[fileLength < FILE_UPLOAD_CHUNK_SIZE ?
            fileLength : FILE_UPLOAD_CHUNK_SIZE];
    int bytesRead;
    int readTotal = 0;
    BufferedInputStream inStream = null;
    try
    {
        inStream = new BufferedInputStream(new FileInputStream(file));
        do
        {
            bytesRead = inStream.read(buffer, 0, buffer.length);
            if (bytesRead == -1)
            {
                break; // end of file reached
            }
            byte[] finalBuffer;
            if (buffer.length > bytesRead)
            {
                finalBuffer = Arrays.copyOf(buffer, bytesRead);
            }
            else
            {
                finalBuffer = buffer;
            }
            uploadChunk(
                    finalBuffer,
                    mimeType,
                    uploadPath,
                    fileLength,
                    readTotal,
                    readTotal + bytesRead - 1);
            readTotal += bytesRead;
        } while (bytesRead != -1);
    }
    finally
    {
        if (inStream != null)
        {
            inStream.close();
        }
    }

The only blemish in this code is that I have to make a new copy of the byte array when the last chunk is smaller than 10MB. There should be a more efficient way of doing that, but this works fine for me for now.
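One way to avoid that final Arrays.copyOf would be to pass a view of the valid prefix instead of a trimmed array — for example, a ByteBuffer wrapping the buffer. This only helps if the consumer (a hypothetical uploadChunk overload, in this case) can accept a ByteBuffer rather than a byte[]; a sketch:

```java
import java.nio.ByteBuffer;

public class ChunkView {
    /**
     * Zero-copy alternative to Arrays.copyOf(buffer, bytesRead):
     * wrap only the valid prefix of the buffer in a ByteBuffer view.
     * The returned view has position 0 and limit bytesRead, so
     * consumers see exactly the bytes that were read, with no copying.
     */
    public static ByteBuffer validPrefix(byte[] buffer, int bytesRead) {
        return ByteBuffer.wrap(buffer, 0, bytesRead);
    }
}
```

The view shares storage with the original buffer, so it must be consumed before the next read() overwrites the buffer.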

answered 2013-09-24T14:15:39.803