Short answer: do this:
public static String readFile( String filePath ) throws IOException
{
    // try-with-resources closes the reader even if read() throws
    try( Reader reader = new FileReader( filePath ) ){ // decodes with the platform default charset
        StringBuilder sb = new StringBuilder();
        char buffer[] = new char[16384]; // read 16k blocks
        int len; // how much content was read?
        while( ( len = reader.read( buffer ) ) > 0 ){
            sb.append( buffer, 0, len );
        }
        return sb.toString();
    }
}
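It's very simple, very fast, and works well even for unreasonably large text files (100+ MB).
Long answer:
(Code at the end)
Most of the time it won't matter, but this method is fast and quite readable. In fact, it has a better time complexity than @Raceimation's answer: O(n) instead of O(n^2).
I tested six methods (from slow to fast):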
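- concat: reading line by line, concatenating with str += ... *this is alarmingly slow even for smaller files (takes ~70 seconds for a 3MB file)*
- strbuilder guessing length: StringBuilder, initialized with the file size. I guess it's slow because it really tries to find such a huge chunk of contiguous memory.
- strbuilder with line buffer: StringBuilder, file is read line by line
- strbuffer with char[] buffer: concat with StringBuffer, reading the file in 16k blocks
- strbuilder with char[] buffer: concat with StringBuilder, reading the file in 16k blocks
- preallocate byte[filesize] buffer: allocate a byte[] buffer the size of the file and let the Java API decide how to buffer the individual blocks.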
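To make the O(n) versus O(n^2) difference between the += variant and the builder variants concrete, here is a rough sketch that only counts how many characters get copied. It is a simplified model, not a measurement: the class name `CopyCost`, the initial capacity of 16 and the plain doubling rule are assumptions for illustration, and the real StringBuilder growth policy differs in its details.

```java
// Simplified copy-count model (an illustration, not the JDK's actual implementation).
class CopyCost {

    // `result += chunk` builds a brand new String each time, so every step
    // copies everything accumulated so far plus the new chunk.
    static long concatCopies(int chunks, int chunkLen) {
        long copied = 0, length = 0;
        for (int i = 0; i < chunks; i++) {
            length += chunkLen;
            copied += length;
        }
        return copied;
    }

    // A growable buffer copies each chunk once, and copies the existing
    // content only when it has to move to a larger backing array.
    static long builderCopies(int chunks, int chunkLen) {
        long copied = 0, length = 0, capacity = 16; // assumed initial capacity
        for (int i = 0; i < chunks; i++) {
            long needed = length + chunkLen;
            if (needed > capacity) {
                copied += length;                        // move old content
                capacity = Math.max(capacity * 2, needed); // assumed doubling rule
            }
            length = needed;
            copied += chunkLen;                          // append the new chunk
        }
        return copied;
    }

    public static void main(String[] args) {
        int chunkLen = 80; // pretend every line is 80 chars long
        for (int chunks : new int[] {1_000, 10_000, 100_000}) {
            System.out.printf("%,d lines: += copies %,d chars, doubling buffer copies %,d chars%n",
                    chunks, concatCopies(chunks, chunkLen), builderCopies(chunks, chunkLen));
        }
    }
}
```

In this model the += count grows with the square of the number of lines, while the doubling buffer stays below three times the final length, which matches the order of magnitude gap seen in the timings.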
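Conclusion:
Preallocating the whole buffer is the fastest for very large files, but the method isn't very versatile because the total file size must be known ahead of time. That's why I suggest using StringBuilder with a char[] buffer: it's still simple and, if needed, easily changed to accept any input stream instead of just files. Either way, it's certainly fast enough for all reasonable cases.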
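As a sketch of what "accept any input stream" could look like, the same char[] loop can be wrapped around an InputStreamReader. The class name `StreamToString`, the method name `readAll` and the explicit Charset parameter are my own choices for illustration, not part of the benchmark code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: the StringBuilder + char[] approach for arbitrary streams.
class StreamToString {

    static String readAll(InputStream in, Charset charset) throws IOException {
        // Closing the reader also closes the underlying stream.
        try (Reader reader = new InputStreamReader(in, charset)) {
            StringBuilder sb = new StringBuilder();
            char[] buffer = new char[16384]; // read 16k blocks
            int len;                         // how much content was read?
            while ((len = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, len);
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for a socket, a zip entry, etc.
        byte[] bytes = "line one\nline two\n".getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
        System.out.print(text);
    }
}
```

Passing the charset explicitly is a design choice here: FileReader in the original code relies on the platform default, which may differ between machines.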
Test results + code
import java.io.*;

public class Test
{
    static final int N = 5; // timed runs per method

    public final static void main( String args[] ) throws IOException{
        test( "1k.txt", true );
        test( "10k.txt", true );
        // concat with += would take ages here, so we skip it
        test( "100k.txt", false );
        test( "2142k.txt", false );
        test( "pruned-names.csv", false );
        // ah, what the heck, why not try a binary file
        test( "/Users/hansi/Downloads/xcode46graphicstools6938140a.dmg", false );
    }

    public static void test( String file, boolean includeConcat ) throws IOException{
        System.out.println( "Reading " + file + " (~" + (new File(file).length()/1024) + "Kbytes)" );

        strbuilderwithchars( file );
        strbuilderwithchars( file );
        strbuilderwithchars( file );
        tick( "Warm up... " );

        if( includeConcat ){
            for( int i = 0; i < N; i++ )
                concat( file );
            tick( "> Concat with += " );
        }
        else{
            tick( "> Concat with += **skipped** " );
        }

        for( int i = 0; i < N; i++ )
            strbuilderguess( file );
        tick( "> StringBuilder init with length " );

        for( int i = 0; i < N; i++ )
            strbuilder( file );
        tick( "> StringBuilder with line buffer " );

        for( int i = 0; i < N; i++ )
            strbuilderwithchars( file );
        tick( "> StringBuilder with char[] buffer" );

        for( int i = 0; i < N; i++ )
            strbufferwithchars( file );
        tick( "> StringBuffer with char[] buffer " );

        for( int i = 0; i < N; i++ )
            singleBuffer( file );
        tick( "> Allocate byte[filesize] " );

        System.out.println();
    }

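    // timestamp of the previous tick() call
    public static long now = System.currentTimeMillis();

    // prints the time since the last tick, averaged over N runs
    // (the warm-up only does 3 runs but is still divided by N)
    public static void tick( String message ){
        long t = System.currentTimeMillis();
        System.out.println( message + ": " + ( t - now )/N + " ms" );
        now = t;
    }
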
    // StringBuilder with char[] buffer
    // + works if filesize is unknown
    // + pretty fast
    public static String strbuilderwithchars( String filePath ) throws IOException
    {
        Reader reader = new FileReader( filePath );
        StringBuilder sb = new StringBuilder();
        char buffer[] = new char[16384]; // read 16k blocks
        int len; // how much content was read?
        while( ( len = reader.read( buffer ) ) > 0 ){
            sb.append( buffer, 0, len );
        }
        reader.close();
        return sb.toString();
    }

    // StringBuffer with char[] buffer
    // + works if filesize is unknown
    // + faster than stringbuilder on my computer
    // - should be slower than stringbuilder, which confuses me
    public static String strbufferwithchars( String filePath ) throws IOException
    {
        Reader reader = new FileReader( filePath );
        StringBuffer sb = new StringBuffer();
        char buffer[] = new char[16384]; // read 16k blocks
        int len; // how much content was read?
        while( ( len = reader.read( buffer ) ) > 0 ){
            sb.append( buffer, 0, len );
        }
        reader.close();
        return sb.toString();
    }

    // StringBuilder init with length
    // - needs the file size up front to presize the builder (and it must fit in an int)
    // - not faster than any of the other methods, but more complicated
    // - readLine() strips line terminators, so the result has no newlines
    public static String strbuilderguess(String filePath) throws IOException
    {
        File file = new File( filePath );
        BufferedReader reader = new BufferedReader(new FileReader(file));
        String line;
        StringBuilder sb = new StringBuilder( (int)file.length() );
        while( ( line = reader.readLine() ) != null)
        {
            sb.append( line );
        }
        reader.close();
        return sb.toString();
    }

    // StringBuilder with line buffer
    // + works if filesize is unknown
    // + pretty fast
    // - speed may (!) vary with line length
    // - readLine() strips line terminators, so the result has no newlines
    public static String strbuilder(String filePath) throws IOException
    {
        BufferedReader reader = new BufferedReader(new FileReader(filePath));
        String line;
        StringBuilder sb = new StringBuilder();
        while( ( line = reader.readLine() ) != null)
        {
            sb.append( line );
        }
        reader.close();
        return sb.toString();
    }

    // Concat with +=
    // - slow
    // - slow
    // - really slow
    public static String concat(String filePath) throws IOException
    {
        BufferedReader reader = new BufferedReader(new FileReader(filePath));
        String line, results = "";
        while( ( line = reader.readLine() ) != null)
        {
            results += line; // copies the whole result so far on every line
        }
        reader.close();
        return results;
    }

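    // Allocate byte[filesize]
    // + seems to be the fastest for large files
    // - only works if the file size is known in advance (and fits in an int),
    //   so it is less versatile for an insignificant performance gain
    // + short code
    public static String singleBuffer( String filePath ) throws IOException{
        byte buffer[] = new byte[(int) new File( filePath ).length()]; // buffer for the entire file
        int len = 0, n;
        try( FileInputStream in = new FileInputStream( filePath ) ){
            // a single read() is not guaranteed to fill the buffer, so loop until full or EOF
            while( len < buffer.length && ( n = in.read( buffer, len, buffer.length - len ) ) > 0 ){
                len += n;
            }
        }
        return new String( buffer, 0, len ); // decoded with the platform default charset
    }
}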
/**
*** RESULTS ***
Reading 1k.txt (~31Kbytes)
Warm up... : 0 ms
> Concat with += : 37 ms
> StringBuilder init with length : 0 ms
> StringBuilder with line buffer : 0 ms
> StringBuilder with char[] buffer: 0 ms
> StringBuffer with char[] buffer : 0 ms
> Allocate byte[filesize] : 1 ms
Reading 10k.txt (~313Kbytes)
Warm up... : 0 ms
> Concat with += : 708 ms
> StringBuilder init with length : 2 ms
> StringBuilder with line buffer : 2 ms
> StringBuilder with char[] buffer: 1 ms
> StringBuffer with char[] buffer : 1 ms
> Allocate byte[filesize] : 1 ms
Reading 100k.txt (~3136Kbytes)
Warm up... : 7 ms
> Concat with += **skipped** : 0 ms
> StringBuilder init with length : 19 ms
> StringBuilder with line buffer : 21 ms
> StringBuilder with char[] buffer: 9 ms
> StringBuffer with char[] buffer : 9 ms
> Allocate byte[filesize] : 8 ms
Reading 2142k.txt (~67204Kbytes)
Warm up... : 181 ms
> Concat with += **skipped** : 0 ms
> StringBuilder init with length : 367 ms
> StringBuilder with line buffer : 372 ms
> StringBuilder with char[] buffer: 208 ms
> StringBuffer with char[] buffer : 202 ms
> Allocate byte[filesize] : 199 ms
Reading pruned-names.csv (~11200Kbytes)
Warm up... : 23 ms
> Concat with += **skipped** : 0 ms
> StringBuilder init with length : 54 ms
> StringBuilder with line buffer : 57 ms
> StringBuilder with char[] buffer: 32 ms
> StringBuffer with char[] buffer : 31 ms
> Allocate byte[filesize] : 32 ms
Reading /Users/hansi/Downloads/xcode46graphicstools6938140a.dmg (~123429Kbytes)
Warm up... : 1665 ms
> Concat with += **skipped** : 0 ms
> StringBuilder init with length : 2899 ms
> StringBuilder with line buffer : 2978 ms
> StringBuilder with char[] buffer: 2702 ms
> StringBuffer with char[] buffer : 2684 ms
> Allocate byte[filesize] : 1567 ms
**/
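P.S. You may have noticed that StringBuffer is slightly faster than StringBuilder. That makes little sense, because the classes are identical except that StringBuilder isn't synchronized. If anyone can (or can't) reproduce this... I'm curious :)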
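For anyone who wants to try reproducing the StringBuffer versus StringBuilder oddity without disk I/O in the way, here is a minimal in-memory sketch. The class name `BuilderVsBuffer`, the chunk count and the number of rounds are arbitrary assumptions, and hand-rolled timings like this are noisy, so treat the output as a hint at best; a proper harness such as JMH would be more trustworthy.

```java
import java.util.Arrays;

// Hypothetical micro-comparison: append the same 16k chunk to both classes.
class BuilderVsBuffer {
    static final char[] CHUNK = new char[16384];
    static { Arrays.fill(CHUNK, 'x'); }

    static int withBuilder(int chunks) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < chunks; i++) sb.append(CHUNK, 0, CHUNK.length);
        return sb.toString().length(); // use the result so the work is not skipped
    }

    static int withBuffer(int chunks) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < chunks; i++) sb.append(CHUNK, 0, CHUNK.length);
        return sb.toString().length();
    }

    public static void main(String[] args) {
        int chunks = 1000; // about 16M chars per run
        int rounds = 5;

        // warm up both code paths before timing
        for (int i = 0; i < rounds; i++) { withBuilder(chunks); withBuffer(chunks); }

        long builderNanos = 0, bufferNanos = 0;
        for (int i = 0; i < rounds; i++) {
            // alternate the two so neither always runs first
            long t0 = System.nanoTime();
            withBuilder(chunks);
            long t1 = System.nanoTime();
            withBuffer(chunks);
            long t2 = System.nanoTime();
            builderNanos += t1 - t0;
            bufferNanos += t2 - t1;
        }
        System.out.println("StringBuilder: " + builderNanos / rounds / 1_000_000 + " ms");
        System.out.println("StringBuffer:  " + bufferNanos / rounds / 1_000_000 + " ms");
    }
}
```

System.nanoTime() is used instead of System.currentTimeMillis() because it is meant for measuring elapsed time and has finer resolution.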