
At the moment I'm having a problem writing a tool for my company. I have 384 XML files that I have to read and parse with a SAX parser into txt files. So far I can parse all of the XML files into a single txt file of 43 MB. With a BufferedReader and line.startsWith I want to extract all the relevant information from that text file.

Edit: Done (So my problem is how to solve this more efficiently. I have an idea (unfortunately not in code yet) but I don't know whether it is possible: I want to iterate through a directory, find the XML file I want, parse it, and create a new txt file with the parsed content. Once that is done for all 384 XML files, I want to do the same for the 384 txt files and read them with a BufferedReader to get my relevant information. It is important to read them one at a time. Another problem is the directory path, which is a bit complex: "C:\Users\xxx\Documents\Data\ProjectName\A1\1\1SLin\wanted.xml". Each file has its own directory. The variable part is A1, which ranges over A-P and 1-24. Alternatively, I have all the relevant files with their absolute paths in an ArrayList, so iterating over that list is also fine if it is easier.)
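To illustrate the directory layout described above, here is a minimal sketch that builds one candidate path per well ID from A1 to P24. It assumes the sub-path "1\1SLin\wanted.xml" is identical below every well directory, which the post does not confirm; the class name WellPaths and the method candidateFiles are made up for this example.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class WellPaths {

    /** Builds one candidate path per well directory (A1 ... P24) below root. */
    public static List<File> candidateFiles(File root) {
        List<File> files = new ArrayList<>();
        for (char row = 'A'; row <= 'P'; row++) {
            for (int col = 1; col <= 24; col++) {
                // e.g. "A1", "A2", ..., "P24"
                File wellDir = new File(root, row + String.valueOf(col));
                // ASSUMPTION: every well uses the same "1/1SLin/wanted.xml" sub-path
                String subPath = "1" + File.separator + "1SLin" + File.separator + "wanted.xml";
                files.add(new File(wellDir, subPath));
            }
        }
        return files;
    }

    public static void main(String[] args) {
        File root = new File("C:\\Users\\xxx\\Documents\\Data\\ProjectName");
        for (File f : candidateFiles(root)) {
            // Skip wells that have no file on disk
            if (f.isFile()) {
                System.out.println(f.getAbsolutePath());
            }
        }
    }
}
```

16 rows times 24 columns gives 384 candidates, which matches the number of files mentioned. If the sub-path differs between wells, a recursive search by file name is the safer route.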

Edit: I came to a solution. Below are a method that searches the directories and a main method that parses each XML file in the resulting list into a txt file in the same directory, with the same filename but a different file extension.

public List<File> searchFile(File dir, String find) {

    File[] files = dir.listFiles();
    List<File> matches = new ArrayList<File>();
    if (files != null) {
        for (int i = 0; i < files.length; i++) {
            if (files[i].isDirectory()) {
                matches.addAll(searchFile(files[i], find));
            } else if (files[i].getName().equalsIgnoreCase(find)) {
                matches.add(files[i]);
            }
        }
    }
    Collections.sort(matches);
    return matches;

}

public static void main(String[] args) throws IOException {

    Import_Files im = new Import_Files();
    File dir = new File("C:\\Users\\xxx\\Desktop\\MS-Daten\\");
    String name = "snp_result_5815.xml";
    List<File> matches = im.searchFile(dir, name);
    System.out.println(matches);

    // remember the console stream so it can be restored after each file
    PrintStream console = System.out;
    for (int i = 0; i < matches.size(); i++) {
        String xml_name = matches.get(i).getAbsolutePath();
        // replaceFirst takes a regex: escape the dot, anchor at the end, ignore case
        File f = new File(xml_name.replaceFirst("(?i)\\.xml$", ".txt"));
        // redirect System.out to the txt file (assuming the handlers print to it);
        // try-with-resources closes each file before the next one is opened
        try (PrintStream out = new PrintStream(new FileOutputStream(f))) {
            System.setOut(out);
            System.out.println("\nstarting File: " + i + "\n");
            xml_parse myReader = new xml_parse(xml_name);
            myReader.setContentHandler(new MyContentHandler());
            myReader.setErrorHandler(new MyErrorHandler());
            myReader.run();
        } finally {
            System.setOut(console);
        }
    }

}
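The remaining step from the question, reading each generated txt file one at a time and keeping only the lines that start with a given prefix, might look roughly like the following. This is a sketch under assumptions: the class name PrefixFilter, the method linesStartingWith, the file name in main, and the prefix "snp:" are hypothetical, and the files are assumed to be UTF-8.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

public class PrefixFilter {

    /** Returns every line of txtFile that starts with prefix, in file order. */
    public static List<String> linesStartingWith(File txtFile, String prefix) throws IOException {
        List<String> hits = new ArrayList<>();
        // try-with-resources closes the reader before the next file is opened
        try (BufferedReader reader =
                     Files.newBufferedReader(txtFile.toPath(), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith(prefix)) {
                    hits.add(line);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical file name and prefix, for illustration only
        File txt = new File("snp_result_5815.txt");
        for (String line : linesStartingWith(txt, "snp:")) {
            System.out.println(line);
        }
    }
}
```

Calling this once per txt file keeps only one file open at a time and avoids building a single 43 MB intermediate file.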

1 Answer


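The searchFolder method below takes a path and a file-name pattern (here one that matches the .xml extension), searches that path and all of its subdirectories, and passes every matching file to the processFile method.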

public static void main(String[] args) {
    String path = "c:\\temp";
    Pattern filePattern = Pattern.compile("(?i).*\\.xml$");
    searchFolder(path, filePattern);
}

public static void searchFolder(String searchPath, Pattern filePattern){
    File dir = new File(searchPath);
    File[] items = dir.listFiles();
    if(items == null){
        //listFiles() returns null if the path is not a directory or cannot be read
        return;
    }
    for(File item : items){
        if(item.isDirectory()){
            //recursively search subdirectories
            searchFolder(item.getAbsolutePath(), filePattern);
        } else if(item.isFile() && filePattern.matcher(item.getName()).matches()){
            processFile(item);
        }
    }
}

public static void processFile(File aFile){
    String filename = aFile.getAbsolutePath();
    String txtFilename = filename.substring(0, filename.lastIndexOf(".")) + ".txt";
    //Do your xml file parsing and write to txtFilename
}
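The parsing step that processFile leaves as a comment could be filled in roughly as follows. This is only a sketch: the question's own MyContentHandler is not shown, so an anonymous DefaultHandler stands in for it, and the class name XmlToTxt and the "element: text" output format are assumptions made for illustration.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class XmlToTxt {

    /** Parses one XML file with SAX and writes "element: text" lines to a sibling .txt file. */
    public static File processFile(File xmlFile)
            throws IOException, SAXException, ParserConfigurationException {
        String path = xmlFile.getAbsolutePath();
        File txtFile = new File(path.substring(0, path.lastIndexOf('.')) + ".txt");

        try (BufferedWriter out =
                     Files.newBufferedWriter(txtFile.toPath(), StandardCharsets.UTF_8)) {
            // Stand-in handler (assumption): collects the text of each element
            DefaultHandler handler = new DefaultHandler() {
                private final StringBuilder text = new StringBuilder();

                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    text.setLength(0); // discard text that belongs to the parent
                }

                @Override
                public void characters(char[] ch, int start, int length) {
                    text.append(ch, start, length); // may be called several times per element
                }

                @Override
                public void endElement(String uri, String localName, String qName)
                        throws SAXException {
                    String value = text.toString().trim();
                    text.setLength(0);
                    if (value.isEmpty()) {
                        return; // element had no text of its own
                    }
                    try {
                        out.write(qName + ": " + value);
                        out.newLine();
                    } catch (IOException e) {
                        throw new SAXException(e); // surface write errors to the parser
                    }
                }
            };
            SAXParserFactory.newInstance().newSAXParser().parse(xmlFile, handler);
        }
        return txtFile;
    }
}
```

Writing through a BufferedWriter per file, instead of redirecting System.out, keeps console output usable and closes each txt file as soon as its XML file has been parsed.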

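The complexity of the path makes no difference: just specify the root path to search (it looks like C:\Users\xxx\Documents\Data\ProjectName in your case) and it will find all the files.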
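For comparison, the same root-based search can be written with java.nio.file.Files.walk instead of hand-written recursion. This is a different technique from the File.listFiles approach used in this thread, shown only as a sketch; it assumes Java 8 or later, and the class name WalkSearch and method findByName are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class WalkSearch {

    /** Finds all regular files below root whose name equals fileName, ignoring case. */
    public static List<Path> findByName(Path root, String fileName) throws IOException {
        // Files.walk holds open directory handles, so close the stream when done
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().equalsIgnoreCase(fileName))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get("C:\\Users\\xxx\\Documents\\Data\\ProjectName");
        for (Path p : findByName(root, "wanted.xml")) {
            System.out.println(p);
        }
    }
}
```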

answered 2013-08-13T12:44:07.137