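Using the JSoup framework, I am trying to iterate through the divs below and extract the text of each <p> tag into an array. Because the list of <div>s and <p>s can be arbitrarily long, a do/while loop or a for loop seems like the right way to collect the text in the <p> tags.
I don't know how to iterate through the <div> tags below, because I am not sure how to keep track of which <p> tags from which <div> I have already stored in the array. I apologize if the answer is obvious; I am fairly new to Java and to programming in general.
Thank you very much for your help. Let me know if there is anything I can add that would be useful.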
Example HTML (assume it repeats hundreds of times):
<div class="happy-div"> <!-- want everything within this div to be in one array element -->
    <p>good text here.</p>
    <p>More good Text here.</p>
    <p>Some good stuff here.</p>
</div>
<div class="sad-div"> <!-- want everything within this div to be in a separate array element -->
    <p>Some unhappy text here.</p>
    <p>More unhappy Text here.</p>
    <p>Some unhappy stuff here.</p>
</div>
<div class="depressed-div"> <!-- want everything within this div to be in a separate array element -->
    <p>Some melancholy text here.</p>
    <p>More melancholy Text here.</p>
    <p>Some melancholy stuff here.</p>
</div>
.... repeats hundreds of times
Pseudocode:
String[] arrayOfP;
for (int i = 0; i < numberOfDivs; i++)
{
    // The selector string below is a placeholder, not real selector syntax.
    arrayOfP[i] = doc.select("All of the text in the <p> tags within the div we've incremented to");
    System.out.println(arrayOfP[i]);
}
Expected result:
When printing the contents of the string array elements, I would like to see:
arrayOfP[0] good text here. More good Text here. Some good stuff here.
arrayOfP[1] Some unhappy text here. More unhappy Text here. Some unhappy stuff here.
arrayOfP[2] Some melancholy text here. More melancholy Text here. Some melancholy stuff here.
....
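For reference, here is a minimal sketch of the kind of loop I have in mind, not a confirmed solution. It assumes jsoup is on the classpath, that the divs are not nested inside each other, and that a List<String> is an acceptable stand-in for the array since the number of divs is not known up front. The class name DivTextExtractor and the method extractDivTexts are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class DivTextExtractor {

    // Hypothetical helper: returns one string per <div>, holding the
    // space-joined text of the <p> tags found inside that div.
    static List<String> extractDivTexts(String html) {
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        // Assumption: divs are not nested. A nested div would be visited
        // again on its own, so its <p> text would show up twice.
        for (Element div : doc.select("div")) {
            // Elements.text() joins the text of every matched <p> with a space.
            texts.add(div.select("p").text());
        }
        return texts;
    }

    public static void main(String[] args) {
        // Shortened copy of the sample HTML from the question.
        String html =
              "<div class=\"happy-div\">"
            + "<p>good text here.</p><p>More good Text here.</p>"
            + "</div>"
            + "<div class=\"sad-div\">"
            + "<p>Some unhappy text here.</p><p>More unhappy Text here.</p>"
            + "</div>";

        List<String> texts = extractDivTexts(html);
        for (int i = 0; i < texts.size(); i++) {
            System.out.println("arrayOfP[" + i + "] " + texts.get(i));
        }
    }
}
```

If a real String[] is needed afterwards, texts.toArray(new String[0]) converts the list. Selecting only the <p> tags, rather than calling text() on the div itself, keeps any stray text that sits directly inside the div out of the result.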