
I'm new to using Selenium for web automation, and I'm having trouble extracting the text between two div tags.

Here is a snippet of the HTML I'm trying to extract the text from.

 ...
<tr>
    <td width="150">
    <a href="https://rads.stackoverflow.com/amzn/click/com/B0099RGRT8" rel="nofollow noreferrer">
    <img height="90" border="0" width="90" alt="iOttie Easy Flex2 Windshield Dashboard Car Mount H&hellip by iOttie" src="http://ecx.images-amazon.com/images/I/51mf6Ry9J2L._SL500_SS90_.jpg">
    </a>
    <div class="xxsmall" style="margin-top: 5px">
        <a href="https://rads.stackoverflow.com/amzn/click/com/B0099RGRT8" rel="nofollow noreferrer">iOttie Easy Flex2 Windshield Dashboard Car Mount Holder Desk Stand for iPhone 5 4S 4 3GS Samsung Gal&amp;hellip</a>
        by iOttie
    </div>
    </td>
    <td style="padding-left: 10px;">
        <div>
            <div>
                <span style="margin-left:-5px; vertical-align: -1">

                </span>
                <b>
                <a href="http://www.amazon.com/gp/cdp/member-reviews/A2UQ07EFPSX78X/ref=cm_pdp_rev_title_1?ie=UTF8&sort_by=MostRecentReview#R12ATB4KTIWFV8">Bought for my wife, now I want one. Excellent Product.</a>
                </b>
                ,
                <span class="nowrap">November 30, 2012</span>
            </div>
            <div style="margin-top: 5px;">
                I bought this mount for my wife, the feedback from her was is that it was really nice and easy to use even while driving.
                <br>
                <br>
                So I "borrowed" it for a couple days, and now I am going to get one for myself. I am using it with an iPhone, but it would work fine with phones of all sizes, which is nice. If my phone size ever changes the mount will accommodate different sizes phones.
                <br>
                <br>
                The phone is very easy to insert and remove , even while driving.
                <br>
                The mount is easy to position but not loose enough that it doesn't hold the position you want.
                <br>
                <br>
                I was very impressed with the windshield mount, it is not just a typical suction cup mount. (Which always at some point…
                <a href="http://www.amazon.com/gp/cdp/member-reviews/A2UQ07EFPSX78X/ref=cm_pdp_rev_more?ie=UTF8&sort_by=MostRecentReview#R12ATB4KTIWFV8">Read more</a>
            </div>
        </div>
    </td>
</tr>
...

The other div tags actually contain other text as well.

The text I want to extract is:

            I bought this mount for my wife, the feedback from her was is that it was really nice and easy to use even while driving.

            So I "borrowed" it for a couple days, and now I am going to get one for myself. I am using it with an iPhone, but it would work fine with phones of all sizes, which is nice. If my phone size ever changes the mount will accommodate different sizes phones.

            The phone is very easy to insert and remove , even while driving.

            The mount is easy to position but not loose enough that it doesn't hold the position you want.

            I was very impressed with the windshield mount, it is not just a typical suction cup mount. (Which always at some point…

Here is my code:

String review;
try {
    review = WebElement.bucketElement.findElement(By.xpath("./td/div")).getText();
} catch (NoSuchElementException nsee) {
    review = "NA";
}

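This actually extracts all the text from all of the innermost div tags, which is not what I want. I can target a specific div tag with ./td/div/div[3], but I can't get just the text between the div tags.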

Any ideas?

Thanks


2 Answers


You can use a regular expression as a workaround:

String review;
try {
    review = WebElement.bucketElement.findElement(By.xpath("./td/div")).getText();
    // Strings are immutable: replaceAll returns a new String, so assign the result
    review = review.replaceAll("(<.+>)", "");
} catch (NoSuchElementException nsee) {
    review = "NA";
}

The regular expression removes all tags and the text of inner elements. Only the first-level text remains. This means that if you have:

some strange<div>other text</div> text

the resulting string will be:

some strange text
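To make the behaviour of that pattern concrete, here is a minimal standalone sketch in plain Java, with no Selenium involved. The class and method names (`StripTagsDemo`, `stripGreedy`) are made up for illustration. It assumes the input string actually contains markup; `getText()` normally returns rendered text without tags, so the pattern only has something to remove when it is applied to raw HTML.

```java
class StripTagsDemo {
    // Hypothetical helper: applies the pattern from the answer.
    // ".+" is greedy, so the match runs from the first '<' to the
    // last '>' and swallows the inner element's text as well.
    static String stripGreedy(String html) {
        // replaceAll returns a new String; the argument is not modified.
        return html.replaceAll("(<.+>)", "");
    }

    public static void main(String[] args) {
        String html = "some strange<div>other text</div> text";
        System.out.println(stripGreedy(html)); // prints "some strange text"
    }
}
```

Note that `.` does not match line terminators by default, so on markup spread over several lines the pattern only strips within each line and may leave inner text behind.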

If you need a more complex regular expression, here is a useful link for testing it.

answered 2013-03-28T07:19:16.163

Once you have located the element with /td/div/div[3], calling getText() on that web element returns the text inside that div/element.
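Since getText() on a div also includes the text of its child elements (for example the "Read more" link in the question), one possible follow-up is to subtract the children's text from the parent's text. The sketch below shows only that string step in plain Java. `OwnTextDemo` and `ownText` are hypothetical names, and the Selenium calls mentioned in the comments are an assumption about how the inputs would be gathered, not part of the answer above.

```java
import java.util.Arrays;
import java.util.List;

class OwnTextDemo {
    // Hypothetical helper: removes the first occurrence of each child's
    // text from the parent's text and trims what is left.
    // Assumed Selenium usage (reviewDiv is a hypothetical WebElement):
    //   parentText = reviewDiv.getText()
    //   childTexts = getText() of each reviewDiv.findElements(By.xpath("./*"))
    static String ownText(String parentText, List<String> childTexts) {
        String result = parentText;
        for (String child : childTexts) {
            if (child.isEmpty()) {
                continue; // e.g. <br> elements contribute no text
            }
            int idx = result.indexOf(child);
            if (idx >= 0) {
                result = result.substring(0, idx)
                        + result.substring(idx + child.length());
            }
        }
        return result.trim();
    }

    public static void main(String[] args) {
        String parentText = "The mount is easy to position.\nRead more";
        List<String> childTexts = Arrays.asList("", "Read more");
        System.out.println(ownText(parentText, childTexts));
        // prints "The mount is easy to position."
    }
}
```

This is a rough approach: if a child's text also appears earlier in the parent's own text, the wrong occurrence is removed, so it suits cases like the trailing link here better than arbitrary markup.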

answered 2013-04-01T07:11:24.887