This is the HTML file I want to extract some words from (Pending, Next Listing Date (Likely):, 10/01/2014). I am using Jaunt and JSoup.
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Language" content="en-us"/>
<meta http-equiv="Content-Type" content="text/html;url=http://allahabadhighcourt.in/casestatus/utf-8"/>
<title>Case Status Result</title>
<link REL="StyleSheet" href="http://allahabadhighcourt.in/alldhc.css" TYPE="text/css"/>
<script src="http://allahabadhighcourt.in/alldhc.js" LANGUAGE="JavaScript" TYPE="text/javascript">
<!--
-->
</script>
</head>
<body onLoad="bodyOnLoad()">
<div CLASS="heading">
<img BORDER="0" src="http://allahabadhighcourt.in/image/titleEN.gif" WIDTH="532" HEIGHT="30" ALT="HIGH COURT OF JUDICATURE AT ALLAHABAD"/>
</div>
<h4 CLASS="subheading" ALIGN="center" STYLE="margin-top: 6pt; margin-bottom: 0pt">Case Status - Allahabad</h4>
<p ALIGN="center" STYLE="margin-top: 0; margin-bottom: 6pt">
<img BORDER="0" src="http://allahabadhighcourt.in/image/blueline.gif" WIDTH="210" HEIGHT="1"/></p>
<table ALIGN="center" CLASS="withb" WIDTH="60%" COLS="2">
<tr><td VALIGN='top' COLSPAN='2' ALIGN='right' STYLE='font-size: 18pt'>Pending</td></tr><tr><td VALIGN='top' ALIGN='center' COLSPAN='2' STYLE='font-size: 16pt'>Criminal Misc. Bail Application : 12898 of 2013 [Etah]</td></tr><tr><td VALIGN='top' WIDTH='35%' STYLE='font-size: 14pt'>Petitioner:</td><td STYLE='font-size: 14pt'>AVANISH</td></tr><tr><td VALIGN='top' WIDTH='35%' STYLE='font-size: 14pt'>Respondent:</td><td STYLE='font-size: 14pt'>STATE OF U.P.</td></tr><tr><td VALIGN='top' WIDTH='35%' STYLE='font-size: 14pt'>Counsel (Pet.):</td><td STYLE='font-size: 14pt'>SANJEEV MISHRA</td></tr><tr><td VALIGN='top' WIDTH='35%' STYLE='font-size: 14pt'>Counsel (Res.):</td><td STYLE='font-size: 14pt'>GOVT. ADVOCATE</td></tr><tr><td VALIGN='top' WIDTH='35%' STYLE='font-size: 14pt'>Category:</td><td VALIGN='top'>Criminal Jurisdiction Application-U/s 439, Cr.p.c., For Bail (major)</td></tr><tr><td VALIGN='top' WIDTH='35%' STYLE='font-size: 14pt'>Date of Filing:</td><td VALIGN='top' STYLE='font-size: 14pt'>08/05/2013</td></tr><tr><td WIDTH='35%' STYLE='font-size: 14pt'>Last Listed on:</td><td STYLE='font-size: 14pt'>03/01/2014 in Court No. 48</td></tr><tr><td WIDTH='35%' STYLE='font-size: 14pt'>Next Listing Date (Likely):</td><td STYLE='font-size: 14pt'>10/01/2014</td></tr><tr><td COLSPAN='2'></td></tr></table><p STYLE="text-align: justify; margin-top: 16pt; margin-left: 90pt; margin-right: 90pt; font-size: 10pt">This is not an authentic/certified copy of the information regarding status of a case. Authentic/certified information may be obtained under Chapter VIII Rule 30 of Allahabad High Court Rules. Mistake, if any, may be brought to the notice of OSD (Computer).</p>
<table ALIGN="center" WIDTH="80%" COLS="1" RULES="NONE" BORDER="0" STYLE="margin-top: 16pt">
<tbody>
<tr ALIGN="center" VALIGN="TOP">
<td VALIGN="TOP" ALIGN="center">
<img ALT="Back" src="http://allahabadhighcourt.in/image/back.gif" WIDTH="30" HEIGHT="25" BORDER="0" onClick="location.href='indexA.html'" STYLE="cursor:pointer"/>
</td>
</tr>
</tbody>
</table>
</body>
</html>
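One possible way to pull those three values out of the page above is sketched here. It is only an illustration, not Jaunt's or JSoup's own method: to stay dependency-free it swaps in a plain `java.util.regex` scan over the `<td>` cells, which is reasonable only because this table is small and regular. The class name `CaseStatusExtractor` and the helpers `cellTexts` and `valueAfter` are hypothetical names made up for this sketch, and the `html` string is a shortened copy of two rows from the page. It assumes that "Pending" is the first cell on the page and that each label cell is immediately followed by its value cell.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CaseStatusExtractor {
    // Matches one <td ...>...</td> cell. Case-insensitive because the page
    // mixes upper- and lower-case markup; DOTALL so a cell may span lines.
    private static final Pattern CELL =
        Pattern.compile("<td\\b[^>]*>(.*?)</td>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the trimmed text of every table cell, in document order. */
    public static List<String> cellTexts(String html) {
        List<String> cells = new ArrayList<>();
        Matcher m = CELL.matcher(html);
        while (m.find()) {
            // Drop any tags nested inside the cell, then trim whitespace.
            cells.add(m.group(1).replaceAll("<[^>]+>", "").trim());
        }
        return cells;
    }

    /** Returns the text of the cell right after the cell equal to label, or null if absent. */
    public static String valueAfter(List<String> cells, String label) {
        int i = cells.indexOf(label);
        return (i >= 0 && i + 1 < cells.size()) ? cells.get(i + 1) : null;
    }

    public static void main(String[] args) {
        // Shortened sample of the table rows from the page (assumption: same layout).
        String html = "<table>"
            + "<tr><td COLSPAN='2' STYLE='font-size: 18pt'>Pending</td></tr>"
            + "<tr><td WIDTH='35%'>Next Listing Date (Likely):</td><td>10/01/2014</td></tr>"
            + "</table>";

        List<String> cells = cellTexts(html);
        String status = cells.isEmpty() ? null : cells.get(0); // first cell holds the status
        String label = "Next Listing Date (Likely):";
        System.out.println(status + ", " + label + ", " + valueAfter(cells, label));
    }
}
```

With JSoup on the classpath, the same idea would probably be written by parsing with `Jsoup.parse(html)`, selecting the cells with `select("td")`, and reading each one with `text()`. A real HTML parser is the safer choice if the markup is less regular than this table.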