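Usually you have an HTML document that you want to extract data from, and you know roughly how it is structured.
There are several parser libraries, but the one I would recommend is Jsoup, which lets you navigate the document and update values through DOM methods. In your case, you need to read the file and call the setter methods on the element you want to change.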
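As a rough sketch of what those setter calls look like, the snippet below parses an in-memory string rather than a file and changes both an attribute and the text of one element. The class name `SetterSketch`, the method `update`, and the `greeting` class value are made up for illustration; only `Jsoup.parse`, `getElementById`, `attr`, `text` and `outerHtml` come from Jsoup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SetterSketch {
    // Hypothetical helper: parses the markup, updates the element with
    // id "content", and returns that element's outer HTML.
    static String update(String html) {
        Document doc = Jsoup.parse(html);
        Element p = doc.getElementById("content"); // null if the id is absent
        if (p == null) {
            return "";
        }
        p.attr("class", "greeting"); // attribute setter: adds or overwrites the attribute
        p.text("Hi");                // text setter: replaces the element's children with this text
        return p.outerHtml();
    }

    public static void main(String[] args) {
        System.out.println(update("<p id=\"content\">Hello World</p>"));
    }
}
```

Both setters return the element itself, so the calls can also be chained as `p.attr("class", "greeting").text("Hi")`.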
Sample XHTML file:
<?xml version="1.0" encoding="UTF-8"?>
<!--
To change this license header, choose License Headers in Project Properties.
To change this template file, choose Tools | Templates
and open the template in the editor.
-->
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Example</title>
</head>
<body>
<p id="content">Hello World</p>
</body>
</html>
Java code:
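import java.io.File;
import org.jsoup.Jsoup;
// Jsoup.parse(File, String) throws IOException; a null charset lets Jsoup detect the encoding (falling back to UTF-8)
File input = new File("D:\\Projects\\Odata Project\\Odata\\src\\web\\html\\inscription_template.xhtml");
org.jsoup.nodes.Document doc = Jsoup.parse(input, null);
org.jsoup.nodes.Element content = doc.getElementById("content");
// text(String) replaces the element's text and returns the element itself, so this prints the whole <p> tag
System.out.println(content.text("Hi How are you ?"));
// text() without arguments returns the element's current text
System.out.println(content.text());
System.out.println(doc);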
Output after execution:
<p id="content">Hi How are you ?</p>
Hi How are you ?
<!--?xml version="1.0" encoding="UTF-8"?-->
<!--
To change this license header, choose License Headers in Project Properties.
To change this template file, choose Tools | Templates
and open the template in the editor.
--><!doctype html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Example</title>
</head>
<body>
<p id="content">Hi How are you ?</p>
</body>
</html>
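Note that in the output above the XML declaration comes back as a comment (`<!--?xml ... ?-->`), because the file was parsed with Jsoup's default HTML parser. If the declaration needs to survive, one option is to parse with Jsoup's XML parser instead. The following is a minimal sketch under that assumption; the class name `XmlParserSketch` and the method `setContent` are hypothetical, and the input is an inline string rather than the file from the example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

public class XmlParserSketch {
    // Hypothetical helper: parses the markup with the XML parser, replaces the
    // text of the element with id "content", and returns the serialized document.
    static String setContent(String xhtml, String newText) {
        // The empty string is the base URI; Parser.xmlParser() swaps out the HTML parser.
        Document doc = Jsoup.parse(xhtml, "", Parser.xmlParser());
        Element content = doc.getElementById("content");
        if (content != null) {
            content.text(newText);
        }
        return doc.outerHtml();
    }

    public static void main(String[] args) {
        String xhtml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<body><p id=\"content\">Hello World</p></body></html>";
        System.out.println(setContent(xhtml, "Hi How are you ?"));
    }
}
```

Neither this sketch nor the code above writes anything back to disk; to persist the change, the returned string still has to be written to the file yourself.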