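I need to process a large text file (about 600 MB) to fix its formatting and write the formatted output to a new text file. The problem is that writing to the new file stops at around 6.2 MB. Here is the code: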
    /* Analysis of the text in fileName to see if the lines are in the correct format
     * (Theme\tDate\tTitle\tDescription). If there are lines that are in the incorrect format,
     * the method corrects them.
     */
    public static void cleanTextFile(String fileName, String destFile) throws IOException {
        OutputStreamWriter writer = null;
        BufferedReader reader = null;
        try {
            writer = new OutputStreamWriter(new FileOutputStream(destFile), "UTF8");
        } catch (IOException e) {
            System.out.println("Could not open or create the file " + destFile);
        }
        try {
            reader = new BufferedReader(new FileReader(fileName));
        } catch (FileNotFoundException e) {
            System.out.println("The file " + fileName + " doesn't exist in the folder.");
        }
        String line;
        String[] splitLine;
        StringBuilder stringBuilder = new StringBuilder("");
        while ((line = reader.readLine()) != null) {
            splitLine = line.split("\t");
            stringBuilder.append(line);
            /* If the String array resulting of the split operation doesn't have size 4,
             * then it means that there are elements of the news item missing in the line
             */
            while (splitLine.length != 4) {
                line = reader.readLine();
                stringBuilder.append(line);
                splitLine = stringBuilder.toString().split("\t");
            }
            stringBuilder.append("\n");
            writer.write(stringBuilder.toString());
            stringBuilder = new StringBuilder("");
            writer.flush();
        }
        writer.close();
        reader.close();
    }
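I have already searched for answers, but the problem usually turns out to be that the writer was never closed or that flush() was never called. So I think the problem is in the BufferedReader. What am I missing?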
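
Two details of the inner loop may be relevant, though I have not confirmed that either is the cause. First, `String.split("\t")` discards trailing empty strings, so a line whose last field is empty is counted as having three fields and gets merged with the next line. Second, once the merged text has more than four fields, the condition `splitLine.length != 4` can never become false, and after the end of the file `readLine()` returns null, which `append` adds as the text "null". Below is a minimal sketch of a loop that guards against both. The class name `RecordJoiner`, the method `joinRecords`, and the choice to write records with more than four fields unchanged are my own assumptions for illustration, not a confirmed fix.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical helper for illustration; not part of the original code.
final class RecordJoiner {
    static final int FIELDS = 4; // Theme, Date, Title, Description

    /**
     * Joins consecutive lines until a record has at least FIELDS
     * tab-separated fields, then writes it followed by a newline.
     * Returns the number of records written.
     */
    static int joinRecords(BufferedReader reader, Writer writer) throws IOException {
        int records = 0;
        StringBuilder record = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            record.append(line);
            // split(..., -1) keeps trailing empty fields, so "a\tb\tc\t" counts as 4.
            // Using "<" instead of "!=" (an assumption) lets over-long records pass
            // through, and the null check stops the loop at the end of the input.
            while (record.toString().split("\t", -1).length < FIELDS
                    && (line = reader.readLine()) != null) {
                record.append(line);
            }
            writer.write(record.toString());
            writer.write('\n');
            record.setLength(0); // reuse the builder for the next record
            records++;
        }
        writer.flush();
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Small in-memory sample: a record split over two lines, a record with
        // an empty last field, and a truncated record at the end of the input.
        String sample = "A\tB\tC\n\tD\nE\tF\tG\t\nH\tI\n";
        StringWriter out = new StringWriter();
        int n = joinRecords(new BufferedReader(new StringReader(sample)), out);
        System.out.println(n + " records");
        System.out.print(out);
    }
}
```

For the real files, the same method could be given a reader and a writer opened with an explicit charset, for example via `Files.newBufferedReader(path, StandardCharsets.UTF_8)` and `Files.newBufferedWriter(path, StandardCharsets.UTF_8)`, since `FileReader` in my code uses the platform default encoding while the writer uses UTF-8.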