I'm trying to get the price of an item from the following HTML.

Here is the source:

 <span class="crwActualPrice">
        <span style="text-decoration: inherit; white-space: nowrap;">
            <span class="currencyINR">
                &nbsp;&nbsp;
            </span>
            <span class="currencyINRFallback" style="display:none">
                Rs. 
            </span>
            13,990.00
        </span>
    </span>

Here is the code I tried:

    dprice = each_result.find_all("span", class_="crwActualPrice")
    for each_price in dprice:
        money_str = each_price.string
        print(money_str)

I want to end up with the value 13990 in money_str, using Python and BeautifulSoup.

3 Answers

This should work, although with such a limited data set I'm not 100% sure about edge cases.

In [1]: from bs4 import BeautifulSoup
In [2]: s = BeautifulSoup(''' <span class="crwActualPrice">
    ...:         <span style="text-decoration: inherit; white-space: nowrap;">
    ...:             <span class="currencyINR">
    ...:                 &nbsp;&nbsp;
    ...:             </span>
    ...:             <span class="currencyINRFallback" style="display:none">
    ...:                 Rs.
    ...:             </span>
    ...:             13,990.00
    ...:         </span>
    ...:     </span>''', 'html.parser')

In [3]: for each in s.select('span.crwActualPrice'):
   ...:     print(each.get_text().strip().replace(' ','').replace('\n', ''))
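
For the sample markup, the flattened text still carries the hidden currency label (it should look like `Rs.13,990.00`). A minimal sketch of pulling out just the numeric part with a regular expression follows; `extract_amount` and `PRICE_RE` are hypothetical names, not part of BeautifulSoup, and the input string is an assumption about what the loop above prints.

```python
import re

# Hypothetical helper (not part of BeautifulSoup): finds the first
# number-like token (digits, optional commas, optional decimals) in text
# that may also contain a currency label.
PRICE_RE = re.compile(r"\d[\d,]*(?:\.\d+)?")


def extract_amount(text):
    """Return the first numeric token in text, or None if there is none."""
    match = PRICE_RE.search(text)
    return match.group(0) if match else None


flattened = "Rs.13,990.00"  # assumed shape of the flattened text
amount = extract_amount(flattened)
print(amount)
```

The result is still a string with a thousands separator, so it needs one more conversion step before it can be used as a number.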
answered 2019-07-19T11:25:34.610

Using the soup.select function:

from bs4 import BeautifulSoup

html_data = '''<span class="crwActualPrice">
        <span style="text-decoration: inherit; white-space: nowrap;">
            <span class="currencyINR">
                &nbsp;&nbsp;
            </span>
            <span class="currencyINRFallback" style="display:none">
                Rs. 
            </span>
            13,990.00
        </span>
    </span>'''

soup = BeautifulSoup(html_data, 'html.parser')
for curr in soup.select("span.crwActualPrice span.currencyINRFallback"):
    price = curr.next_sibling.strip()
    print(price)

Prints:

13,990.00
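
The question asks for 13990 rather than the string `13,990.00`. One possible way to convert it, sketched with the standard library only, is shown here; `price_to_int` is a hypothetical helper name, and the sketch assumes a comma is always a thousands separator and a dot is always the decimal point.

```python
from decimal import Decimal


def price_to_int(price_text):
    """Convert a string such as '13,990.00' to a whole number.

    Assumes ',' is a thousands separator and '.' is the decimal point.
    Any fractional part is truncated, not rounded.
    """
    return int(Decimal(price_text.replace(",", "")))


print(price_to_int("13,990.00"))  # prints 13990
```

Parsing through `Decimal` rather than `float` keeps the value exact if the fractional part is needed later.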
answered 2019-07-19T11:19:50.880

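Use .text to get the text outside the nested tags:

...
dprice = each_result.find_all("span", class_="crwActualPrice")
money_str = ""
for each_price in dprice:
    # .text is a property, not a method, so it takes no parentheses
    money_str += each_price.text
# strip() with no arguments also removes the non-breaking spaces (&nbsp;)
print(money_str.strip())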
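
As a side note on why the code in the question prints None: `.string` only returns text when a tag has exactly one child, and the outer span here has several. A minimal sketch using `stripped_strings` follows; the markup is a condensed copy of the sample, and taking the last text piece as the price is an assumption that holds only for this layout.

```python
from bs4 import BeautifulSoup

# Condensed copy of the sample markup from the question
html_data = '''<span class="crwActualPrice">
    <span style="white-space: nowrap;">
        <span class="currencyINR">&nbsp;&nbsp;</span>
        <span class="currencyINRFallback" style="display:none">Rs.</span>
        13,990.00
    </span>
</span>'''

soup = BeautifulSoup(html_data, "html.parser")
outer = soup.find("span", class_="crwActualPrice")

# The outer span has more than one child, so .string is None
print(outer.string)

# stripped_strings yields each non-empty text piece with whitespace removed
pieces = list(outer.stripped_strings)

# Assumption: the price is the last text piece inside the span
price_text = pieces[-1]
print(price_text)
```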
answered 2019-07-19T11:23:54.433