java - XPath: writing to a file

Question

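I am writing Java code that fetches data from a website and stores it in a file. I want to store the result of an XPath query in that file. Is there a way to save the XPath output? Apologies for any mistakes; this is my first question.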

import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;

import org.htmlcleaner.CleanerProperties;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;
import org.htmlcleaner.XPatherException;

public class TestScrapping {

    public static void main(String[] args) throws MalformedURLException, IOException, XPatherException {

        // URL to fetch; point this at the page you want to query
        String url_fetch = "http://www.yahoo.com";

        // XPath of the data to fetch; Firefox's FirePath add-on or Firebug can help find it.
        // This expression is meant to select the text of one table cell.
        String name_xpath = "//div[1]/div[2]/div[2]/div[1]/div/div/div/div/table/tbody/tr[1]/td[2]/text()";

        // Configure the cleaner through its own properties object;
        // a separately created CleanerProperties would never be applied
        HtmlCleaner cleaner = new HtmlCleaner();
        CleanerProperties props = cleaner.getProperties();
        props.setAllowHtmlInsideAttributes(true);
        props.setAllowMultiWordAttributes(true);
        props.setRecognizeUnicodeChars(true);
        props.setOmitComments(true);

        // Open the connection and parse the response into a TagNode tree
        URL url = new URL(url_fetch);
        URLConnection conn = url.openConnection();
        TagNode node = cleaner.clean(new InputStreamReader(conn.getInputStream()));

        // Collect the nodes that match the XPath
        Object[] info_nodes = node.evaluateXPath(name_xpath);

        // If the XPath matches nothing, info_nodes.length is 0
        if (info_nodes.length > 0) {
            // Each match is printed to the console; this is the output to be saved to a file
            for (int i = 0; i < info_nodes.length; i++) {
                System.out.println(info_nodes[i]);
            }
        }
    }
}

score 1 · Accepted Answer

You can "simply" print the nodes as strings to the console or to a file. An example in Perl:

my $all = $XML_OBJ->find('/');    # selecting all nodes from root
foreach my $node ($all->get_nodelist()) {
    print XML::XPath::XMLParser::as_string($node);
}

Note: this output may not be nicely formatted or indented XML.

score 0 · Accepted Answer

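The output of an XPath query in Java is a node set, so yes: once you have a node set you can do whatever you want with it, whether that is saving it to a file or processing it further.

Saving it to a file involves the same steps in Java as saving anything else to a file; it is no different from any other data. Select the node set, iterate over it, extract the parts you need, and write them to some kind of file stream.

However, if you are asking whether there is a Nodeset.SaveToFile(), then no.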
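
The steps described above (select, iterate, extract, write) can be sketched as follows. This is only an illustration under some assumptions: it uses the JDK's built-in javax.xml.xpath API rather than the HtmlCleaner library from the question, it parses a small inline XML string instead of a live page, and the class name XPathResultWriter and method writeMatches are hypothetical names chosen for the example.

```java
import java.io.BufferedWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathResultWriter {

    // Hypothetical helper: evaluates the expression against the XML text and
    // writes the text content of each matching node to the target file,
    // one match per line. Returns the number of nodes written.
    static int writeMatches(String xml, String expression, Path target) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

        // try-with-resources closes (and flushes) the writer even on failure
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (int i = 0; i < nodes.getLength(); i++) {
                writer.write(nodes.item(i).getTextContent());
                writer.newLine();
            }
        }
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Sample data made up for the example
        String xml = "<table><tr><td>Name</td><td>Acme</td></tr>"
                   + "<tr><td>Name</td><td>Globex</td></tr></table>";
        Path out = Files.createTempFile("xpath-result", ".txt");
        int count = writeMatches(xml, "//tr/td[2]", out);
        System.out.println(count + " node(s) written to " + out);
    }
}
```

With HtmlCleaner, the same loop would presumably run over the Object[] returned by evaluateXPath and write each element's string form instead of calling getTextContent().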

score 0 · Accepted Answer

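I suggest you take the NodeSet, which is a collection of nodes, iterate over it, and add each node to a newly created DOM Document object.
After that, you can obtain a Transformer object from TransformerFactory and use its transform method. You should transform from a DOMSource to a StreamResult object, which can be created from a FileOutputStream.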
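
A minimal sketch of this approach, using only JDK classes, might look like the following. The class name XPathResultToXml, the method saveMatches, and the wrapper element name "results" are assumptions made for the example, and the expression is assumed to select element or text nodes (attribute nodes cannot be appended as children).

```java
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathResultToXml {

    // Hypothetical helper: copies every node matched by the expression into a
    // new DOM document and serializes that document to the target file.
    static int saveMatches(String xml, String expression, Path target) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document source = builder.parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, source, XPathConstants.NODESET);

        // New document with a single wrapper element to hold the matches
        Document result = builder.newDocument();
        Element root = result.createElement("results");
        result.appendChild(root);
        for (int i = 0; i < nodes.getLength(); i++) {
            // importNode makes a deep copy owned by the new document
            root.appendChild(result.importNode(nodes.item(i), true));
        }

        // DOMSource -> StreamResult backed by a FileOutputStream
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        try (OutputStream out = new FileOutputStream(target.toFile())) {
            transformer.transform(new DOMSource(result), new StreamResult(out));
        }
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Sample data made up for the example
        String xml = "<table><tr><td>Name</td><td>Acme</td></tr></table>";
        Path out = Files.createTempFile("xpath-result", ".xml");
        int count = saveMatches(xml, "//td", out);
        System.out.println(count + " node(s) saved to " + out);
    }
}
```

The wrapper element is there because a DOM document can have only one root element, while an XPath query may match many nodes.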
