Why does the following code alter "öäüß"? (I am using it to split a large file into several smaller files...)
InputStream is = new BufferedInputStream(new FileInputStream(file));
File newFile;
BufferedWriter bw;
newFile = new File(filePathBase + "." + String.valueOf(files.size() + 1) + fileExtension);
files.add(newFile);
bw = new BufferedWriter(new FileWriter(newFile));
try {
    byte[] c = new byte[1024];
    int lineCount = 0;
    int readChars = 0;
    while ( ( readChars = is.read(c) ) != -1 )
        for ( int i=0; i<readChars; i++ ) {
            bw.write(c[i]);
            if ( c[i] == '\n' )
                if ( ++lineCount % linesPerFile == 0 ) {
                    bw.close();
                    newFile = new File(filePathBase + "." + String.valueOf(files.size() + 1) + fileExtension);
                    files.add(newFile);
                    bw = new BufferedWriter(new FileWriter(newFile));
                }
        }
} finally {
    bw.close();
    is.close();
}
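My understanding of character encoding is that as long as I keep every byte the same, everything should stay unchanged. Why does this code change the bytes?
Thanks a lot in advance~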
==================== SOLUTION =====================
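The mistake was using FileWriter: it is a character stream that interprets each value passed to write() as a character and encodes it with the platform default charset, so it should not be used to output raw bytes. Thanks to @meriton and @Jonathan Rosen. Simply changing everything to BufferedOutputStream would not do either, because BufferedOutputStream was too slow! I ended up improving my file splitting and copying code to use a larger read array and to call write() only when necessary...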
File newFile = new File(filePathBase + "." + String.valueOf(files.size() + 1) + fileExtension);
files.add(newFile);
InputStream iS = new BufferedInputStream(new FileInputStream(file));
OutputStream oS = new FileOutputStream(newFile); // BufferedOutputStream wrapper toooo slow!
try {
    // Use a larger read buffer when the output files will be large
    byte[] c;
    if ( linesPerFile > 65536 )
        c = new byte[65536];
    else
        c = new byte[1024];
    int lineCount = 0;
    int readChars = 0;
    while ( ( readChars = iS.read(c) ) != -1 ) {
        int from = 0; // start of the bytes in c that have not been written yet
        for ( int idx=0; idx<readChars; idx++ )
            if ( c[idx] == '\n' && ++lineCount % linesPerFile == 0 ) {
                // Line limit reached: write up to and including the newline, then start the next file
                oS.write(c, from, idx+1 - from);
                oS.close();
                from = idx+1;
                newFile = new File(filePathBase + "." + String.valueOf(files.size() + 1) + fileExtension);
                files.add(newFile);
                oS = new FileOutputStream(newFile);
            }
        // Write the rest of the buffer to the current file in one call
        oS.write(c, from, readChars - from);
    }
} finally {
    iS.close();
    oS.close();
}
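
For anyone who wants to see the FileWriter effect in isolation, here is a minimal sketch. It is not the code from my project: the class name ByteCopyDemo and both helper methods are made up for illustration, and it uses an OutputStreamWriter with an explicit UTF-8 charset over an in-memory ByteArrayOutputStream instead of a FileWriter, so the result does not depend on the platform default charset. The loop mirrors the original `bw.write(c[i])`: the byte is widened to an int and Writer.write(int) treats the low 16 bits of that int as a character, so a byte such as 0xC3 turns into the character 0xFFC3 before it is encoded.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original splitting code.
class ByteCopyDemo {

    // Mimics the original loop: every byte goes through Writer.write(int),
    // which interprets its argument as a character and then encodes it.
    static byte[] copyThroughWriter(byte[] input) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            for (byte b : input) {
                // b is sign-extended to int, e.g. (byte) 0xC3 becomes char 0xFFC3
                writer.write(b);
            }
        }
        return sink.toByteArray();
    }

    // Byte-oriented copy: OutputStream.write(byte[], int, int) passes bytes through untouched.
    static byte[] copyThroughStream(byte[] input) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = sink) {
            out.write(input, 0, input.length);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // "öäüß" written with Unicode escapes so the source file encoding does not matter
        byte[] original = "\u00f6\u00e4\u00fc\u00df".getBytes(StandardCharsets.UTF_8);
        byte[] viaWriter = copyThroughWriter(original);
        byte[] viaStream = copyThroughStream(original);

        System.out.println("original:   " + original.length + " bytes");
        System.out.println("via Writer: " + viaWriter.length + " bytes, identical: "
                + Arrays.equals(original, viaWriter));
        System.out.println("via Stream: " + viaStream.length + " bytes, identical: "
                + Arrays.equals(original, viaStream));
    }
}
```

Plain ASCII bytes are below 0x80, stay positive when widened, and survive the Writer unchanged, which is why only characters like "öäüß" were affected. With a single-byte default charset such as ISO-8859-1, the bogus characters cannot be encoded at all and are typically replaced with '?'.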