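I have a bunch of ids stored in 4 files, a, b, c and d, in my workspace. I want to merge all of these ids, in sorted order, into a single file, merged.txt. They are stored as strings, one per line. I can sort each file individually by loading it into memory. But how do I merge them, given that there may be duplicate entries? I can't work out how to compare the current entry of each of the four files (the number of files can grow to 8, so I can't hardcode it). In particular, how do I compare the entries, and how do I advance only the file pointers that are at the smallest entry?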
public void sortFile() throws IOException
{
    File a = new File("/Users/phoenix/workspace/data/a.txt");
    File b = new File("/Users/phoenix/workspace/data/b.txt");
    File c = new File("/Users/phoenix/workspace/data/c.txt");
    File d = new File("/Users/phoenix/workspace/data/d.txt");
    doSort(a);
    doSort(b);
    doSort(c);
    doSort(d);
    merge();
}
How do I modify the merge method, starting from the pseudocode below?
public void merge()
{
    File dir = new File("/Users/phoenix/workspace/data");
    for(File f: dir.listFiles())
    {
        // TODO: merge into a single file merged.txt
    }
}
public void doSort(File f) throws IOException
{
    ArrayList<String> list = new ArrayList<String>();
    // Read the whole file into memory; the reader is closed before the file is reopened for writing.
    try (BufferedReader reader = new BufferedReader(new FileReader(f)))
    {
        String line;
        while((line = reader.readLine())!=null)
        {
            list.add(line);
        }
    }
    Collections.sort(list);
    // Overwrite the same file with the sorted lines.
    try (PrintWriter out = new PrintWriter(f))
    {
        for(String s:list)
            out.println(s);
    }
}
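Since the number of files can grow and merged.txt lives in the same directory, a plain dir.listFiles() loop would pick up the output file as an input on a second run. A minimal sketch of one way to collect the inputs without hardcoding them, assuming every input ends in .txt (the class name InputFiles and the method listInputs are hypothetical, not part of the code above):

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class InputFiles {
    // Hypothetical helper: returns the regular .txt files in dir, skipping the
    // output file, sorted by pathname so the order is deterministic.
    static List<File> listInputs(File dir, String outputName) {
        File[] found = dir.listFiles((d, name) ->
                name.endsWith(".txt") && !name.equals(outputName));
        List<File> inputs = new ArrayList<>();
        if (found == null) {
            // listFiles() returns null when dir is not a readable directory
            return inputs;
        }
        Arrays.sort(found); // File implements Comparable<File>
        for (File f : found) {
            if (f.isFile()) { // ignore subdirectories that happen to end in .txt
                inputs.add(f);
            }
        }
        return inputs;
    }
}
```

With this, sortFile() could call doSort on each element of listInputs(dir, "merged.txt") instead of naming a, b, c and d one by one.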
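public void merge() throws IOException
{
    File dir = new File("/Users/phoenix/workspace/data");
    File merged = new File("/Users/phoenix/workspace/data/merged.txt");
    ArrayList<BufferedReader> readers = new ArrayList<BufferedReader>();
    // list.get(i) always holds the current (not yet written) line of readers.get(i)
    ArrayList<String> list = new ArrayList<String>();
    for(File f: dir.listFiles())
    {
        if(f.equals(merged) || !f.isFile())
            continue; // never read the output file (or a subdirectory) as an input
        BufferedReader reader = new BufferedReader(new FileReader(f));
        String first = reader.readLine(); // call readLine() once and keep the result
        if(first!=null)
        {
            readers.add(reader);
            list.add(first);
        }
        else
            reader.close(); // empty file
    }
    PrintWriter out = new PrintWriter(merged);
    while(!list.isEmpty())
    {
        String min = Collections.min(list);
        out.println(min);
        // Advance every reader whose current line equals min, skipping repeats of min,
        // so each id is written once. Iterate backwards so removals do not shift indexes.
        for(int i = list.size()-1; i >= 0; i--)
        {
            if(!list.get(i).equals(min))
                continue;
            String next = readers.get(i).readLine();
            while(next!=null && next.equals(min))
                next = readers.get(i).readLine();
            if(next!=null)
                list.set(i, next);
            else
            {
                readers.get(i).close(); // this file is exhausted
                readers.remove(i);
                list.remove(i);
            }
        }
    }
    out.close();
}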
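A common way to advance only the file that currently holds the smallest entry is to keep one pending line per file in a priority queue. The following is a minimal sketch of that idea, not a definitive implementation. It assumes that every input is already sorted with the same String ordering that Collections.sort uses, that the files are UTF-8, and that the output path is not one of the inputs. The names KWayMerge, Head and merge(List<Path>, Path) are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

final class KWayMerge {
    // One queue entry: the current line of a file plus the reader it came from.
    private static final class Head implements Comparable<Head> {
        final String line;
        final BufferedReader reader;
        Head(String line, BufferedReader reader) { this.line = line; this.reader = reader; }
        @Override public int compareTo(Head other) { return line.compareTo(other.line); }
    }

    // Merges already-sorted inputs into output, writing each distinct line once.
    // Assumption: output is not one of sortedInputs.
    static void merge(List<Path> sortedInputs, Path output) throws IOException {
        List<BufferedReader> readers = new ArrayList<>();
        try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            PriorityQueue<Head> heads = new PriorityQueue<>();
            for (Path p : sortedInputs) {
                BufferedReader r = Files.newBufferedReader(p, StandardCharsets.UTF_8);
                readers.add(r);
                String first = r.readLine();
                if (first != null) heads.add(new Head(first, r)); // skip empty files
            }
            String last = null; // last line written, used to drop duplicates
            while (!heads.isEmpty()) {
                Head smallest = heads.poll();
                if (!smallest.line.equals(last)) {
                    out.write(smallest.line);
                    out.newLine();
                    last = smallest.line;
                }
                // Advance only the file that supplied the smallest line.
                String next = smallest.reader.readLine();
                if (next != null) heads.add(new Head(next, smallest.reader));
            }
        } finally {
            for (BufferedReader r : readers) r.close();
        }
    }
}
```

Only one line per file is held in memory at a time, so the number of files can grow from 4 to 8 or more without code changes. Because equal lines leave the queue back to back, comparing each polled line with the last one written is enough to remove duplicates, both across files and within a single file.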