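Scenario: a web page contains a few tables. I want to read one of them in full, but only 10 rows are shown and I have to scroll down for the rest. In fact only 10 rows exist in the DOM initially; the remaining rows are rendered as we scroll. To work around this I thought I would simulate key presses and keep reading, but the XPaths are not consistent either, so I cannot loop over them. The XPaths of a few different cells are: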
html/body/div[2]/form/div[1]/div[2]/div/div[2]/div/div[2]/div/div[2]/div/div[2]/div/span/div[1]/div/div/div/div[2]/div/div[2]/table/tbody/tr[1]/td[1]
html/body/div[2]/form/div[1]/div[2]/div/div[2]/div/div[2]/div/div[2]/div/div[2]/div/span/div[1]/div/div/div/div[2]/div/div[2]/table[2]/tbody/tr[3]/td[1]
html/body/div[1]/form/div[1]/div[2]/div/div[2]/div/div[2]/div/div[2]/div/div[2]/div/span/div[1]/div/div/div/div[2]/div/div[2]/table[1]/tbody/tr[10]/td[1]
How can I get the data from all of the cells?
HTML source:
<div id="pt1:ph1" class="x6v">
<table class="xtb" width="0" cellspacing="0" cellpadding="0" border="0" summary="">
<div class="xvp">
<div>
<div id="pt1:pc1" class="xpb xph" _afrclmwmn="['c1','c2','c3','c4','c5','c6','c7','c8','c9','c10']" _afrac="pt1:pc1:md1" style="height: 282px;">
<div id="pt1:pc1::_ahTp" style="height:auto">
<div id="pt1:pc1::_ahCt">
<div id="pt1:pc1:md1" class="xpj xpb" _leafcolclientids="['pt1:pc1:md1:c1','pt1:pc1:md1:c2','pt1:pc1:md1:c3','pt1:pc1:md1:c4','pt1:pc1:md1:c5','pt1:pc1:md1:c6','pt1:pc1:md1:c7','pt1:pc1:md1:c8','pt1:pc1:md1:c9','pt1:pc1:md1:c10','pt1:pc1:md1:c11','pt1:pc1:md1:c12']" _afrfilterable="true" _afrautohr="10" _afrhcc="0" _afrpcid="pt1:pc1" tabindex="0" style="height: 234px;">
<div id="pt1:pc1:md1::ch" class="xz5" _afrcolcount="12" style="overflow: hidden; position: relative; width: 753px;">
<div id="pt1:pc1:md1::db" class="xyy" _afrcolcount="12" style="position: relative; width: 753px; overflow: hidden; height: 170px; z-index: 1;">
<table class="xyz xzr" cellspacing="0" _startrow="0" _rowcount="44" _selstate="{'0':true}" _totalwidth="1260" style="table-layout:fixed;position:relative;width:1260px;">
<tbody>
<tr class="xzn p_AFSelected" _afrrk="0">
<td class="xzk" nowrap="" style="width:100px;">12</td>
<td class="xzk" nowrap="" style="width:100px;">A12</td>
<td class="xzk" nowrap="" style="width:100px;">B12</td>
<td class="xzk" nowrap="" style="width:100px;">B12</td>
<td class="xzk" nowrap="" style="width:100px;">C12</td>
<td class="xzk" nowrap="" style="width:100px;">D12</td>
<td class="xzk" nowrap="" style="width:100px;"> </td>
<td class="xzk" nowrap="" style="width:100px;">K12</td>
<td class="xzk" nowrap="" style="width:100px;"> </td>
<td class="xzk" nowrap="" style="width:100px;"> </td>
<td class="xzk" nowrap="" style="width:100px;"> </td>
<td class="xzk" nowrap="" style="width:100px;">G12</td>
</tr>
<tr class="xzn" _afrrk="1">
<tr class="xzn" _afrrk="2">
<tr class="xzn" _afrrk="3">
<tr class="xzn" _afrrk="4">
<tr class="xzn" _afrrk="5">
<tr class="xzn" _afrrk="6">
<tr class="xzn" _afrrk="7">
<tr class="xzn" _afrrk="8">
<tr class="xzn" _afrrk="9">
</tbody>
</table>
</div>
<div id="pt1:pc1:md1::sm" class="xzu" style="position:absolute;display:none"></div>
<div id="pt1:pc1:md1::ri" class="xz0" style="position:absolute;display:none;overflow:hidden"></div>
<div id="pt1:pc1:md1::dataW" style="display:none"></div>
<div id="pt1:pc1:md1::scroller" tabindex="-1" style="position: absolute; overflow: auto; z-index: 0; width: 770px; top: 46px; height: 187px; right: 0px;">
</div>
</div>
<div id="pt1:pc1::_ahBt" style="height:auto">
<div id="pt1:pc1:_clmCxt" style="display:none">
<div id="pt1:pc1:_PCPop" style="display:none">
<div id="pt1:pc1::_dchDlgC" style="display:none">
</div>
</div>
</div>
</div>
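One observation from the markup above: the data body has an `id` (`pt1:pc1:md1::db`) and every row carries an `_afrrk` key, so a cell can probably be addressed relative to those instead of by an absolute path from `html/body`. The following is only a minimal sketch of that idea. `SAMPLE` is a hypothetical, trimmed-down and well-formed stand-in for the page (the second row's values are made up), and `cell_xpath` is an illustrative helper name, not part of any library.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed miniature of the table markup shown above.
# Only the id and _afrrk attributes matter for the locator.
SAMPLE = """<div id="pt1:pc1:md1">
  <div id="pt1:pc1:md1::db">
    <table _startrow="0" _rowcount="44">
      <tbody>
        <tr _afrrk="0"><td>12</td><td>A12</td></tr>
        <tr _afrrk="1"><td>13</td><td>A13</td></tr>
      </tbody>
    </table>
  </div>
</div>"""


def cell_xpath(row_key, col):
    """Build an XPath anchored on the data-body id and the row's _afrrk key.

    col is 1-based, as in XPath. The expression does not depend on how many
    wrapper divs or sibling tables exist at the moment it is evaluated.
    """
    return f".//div[@id='pt1:pc1:md1::db']//tr[@_afrrk='{row_key}']/td[{col}]"


root = ET.fromstring(SAMPLE)
cell = root.find(cell_xpath(1, 2))
print(cell.text)
```

With Selenium the same expression could presumably be passed to `find_element(By.XPATH, ...)`, assuming the `id` stays the same between page loads, which the two dumps suggest but do not prove.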
HTML after scrolling:
<div id="pt1:ph1" class="x6v">
<table class="xtb" width="0" cellspacing="0" cellpadding="0" border="0" summary="">
<div class="xvp">
<div>
<div id="pt1:pc1" class="xpb xph" _afrclmwmn="['c1','c2','c3','c4','c5','c6','c7','c8','c9','c10']" _afrac="pt1:pc1:md1" style="height: 282px;">
<div id="pt1:pc1::_ahTp" style="height:auto">
<div id="pt1:pc1::_ahCt">
<div id="pt1:pc1:md1" class="xpj xpb" _leafcolclientids="['pt1:pc1:md1:c1','pt1:pc1:md1:c2','pt1:pc1:md1:c3','pt1:pc1:md1:c4','pt1:pc1:md1:c5','pt1:pc1:md1:c6','pt1:pc1:md1:c7','pt1:pc1:md1:c8','pt1:pc1:md1:c9','pt1:pc1:md1:c10','pt1:pc1:md1:c11','pt1:pc1:md1:c12']" _afrfilterable="true" _afrautohr="10" _afrhcc="0" _afrpcid="pt1:pc1" tabindex="0" style="height: 234px;">
<div id="pt1:pc1:md1::ch" class="xz5" _afrcolcount="12" style="overflow: hidden; position: relative; width: 753px;">
<div id="pt1:pc1:md1::db" class="xyy" _afrcolcount="12" style="position: relative; width: 753px; overflow: hidden; height: 170px; z-index: 1;">
<table class="xyz xzr" cellspacing="0" _startrow="10" _rowcount="44" style="table-layout:fixed;position:relative;width:1260px;">
<tbody>
<tr class="p_AFFocused p_AFSelected xzn" _afrrk="10">
<tr class="xzn" _afrrk="11">
<tr class="xzn" _afrrk="12">
<tr class="xzn" _afrrk="13">
<tr class="xzn" _afrrk="14">
<tr class="xzn" _afrrk="15">
<tr class="xzn" _afrrk="16">
<tr class="xzn" _afrrk="17">
<tr class="xzn" _afrrk="18">
<tr class="xzn" _afrrk="19">
</tbody>
</table>
<table class="xyz xzr" cellspacing="0" _startrow="20" _rowcount="44" style="table-layout:fixed;position:relative;width:1260px;">
</div>
<div id="pt1:pc1:md1::sm" class="xzu" style="position: absolute; display: none; z-index: 5000; visibility: visible; top: 120px; right: 25px;">Fetching Data...</div>
<div id="pt1:pc1:md1::ri" class="xz0" style="position:absolute;display:none;overflow:hidden"></div>
<div id="pt1:pc1:md1::dataW" style="display:none"></div>
<div id="pt1:pc1:md1::scroller" tabindex="-1" style="position: absolute; overflow: auto; z-index: 0; width: 770px; top: 46px; height: 187px; right: 0px;">
</div>
</div>
<div id="pt1:pc1::_ahBt" style="height:auto">
<div id="pt1:pc1:_clmCxt" style="display:none">
<div id="pt1:pc1:_PCPop" style="display:none">
<div id="pt1:pc1::_dchDlgC" style="display:none">
</div>
</div>
</div>
</div>
<div id="pt1:ph2" class="x6v">
</span>
</div>
</div>
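Since the two dumps show the table window moving (`_startrow="0"` becomes `_startrow="10"`, with `_rowcount="44"` unchanged), one possible way to read everything is to capture the HTML after each scroll and merge the rows by their `_afrrk` key. Below is a minimal sketch of the merging step only, using the Python standard library. It assumes `_afrrk` is always an integer, as in the dumps; `RowCollector`, `merge_snapshots`, and the sample strings (including the values in rows 1 and 2) are hypothetical and for illustration.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect <td> text for every <tr> that carries an _afrrk row key."""

    def __init__(self):
        super().__init__()
        self.rows = {}          # row key (int) -> list of cell strings
        self._key = None        # key of the row currently being parsed
        self._in_cell = False   # True while inside a <td> of a keyed row

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "tr" and "_afrrk" in attrs:
            self._key = int(attrs["_afrrk"])
            self.rows[self._key] = []
        elif tag == "td" and self._key is not None:
            self._in_cell = True
            self.rows[self._key].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            self._key = None

    def handle_data(self, data):
        if self._in_cell:
            self.rows[self._key][-1] += data


def merge_snapshots(snapshots):
    """Merge rows from several HTML snapshots, de-duplicating by row key."""
    merged = {}
    for html in snapshots:
        parser = RowCollector()
        parser.feed(html)
        parser.close()
        merged.update(parser.rows)
    # Return rows in key order with surrounding whitespace stripped.
    return [[c.strip() for c in merged[k]] for k in sorted(merged)]


# Two overlapping, made-up snapshots standing in for the page before
# and after a scroll.
before = ('<table _startrow="0"><tbody>'
          '<tr _afrrk="0"><td>12</td><td>A12</td></tr>'
          '<tr _afrrk="1"><td>13</td><td>A13</td></tr>'
          '</tbody></table>')
after = ('<table _startrow="1"><tbody>'
         '<tr _afrrk="1"><td>13</td><td>A13</td></tr>'
         '<tr _afrrk="2"><td>14</td><td>A14</td></tr>'
         '</tbody></table>')

table = merge_snapshots([before, after])
print(table)
```

In a Selenium session the snapshots might come from `driver.page_source`, taken after each scroll of the `pt1:pc1:md1::scroller` element, stopping once the number of collected rows reaches the table's `_rowcount`. That part is untested here and depends on how the page fetches data.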