xpath - Parse through a response created by using XPath

Question

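Using Scrapy, I want to extract some data from a website with well-formed HTML. With XPath I can extract the list of items, but I cannot use XPath to extract additional data from the elements in that list.

All XPaths have been tested with XPather. I also tested with a local file containing the web page and got the same problem.

To start: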

# Get the webpage
fetch("https://www.someurl.com")

# The following gives me the expected items from the HTML
products = response.xpath("//*[@id='product-list-146620']/div/div")

An item looks like this:

<div data-pageindex="1" data-guid="13157582" class="col ">
  <div class="item item-card item-card--static">
    <div class="item-card__inner">
      <div class="item__image item__image--overlay">
        <a href="/www.something.anywhere?ref_gr=9801" class="ratio_custom" style="padding-bottom:100%">
        </a>
      </div>
      <div class="item__text-container">
        <div class="item__name">
          <a class="item__name-link" href="/c.aspx?ref_gr=9801">The text I want</a>
        </div>
      </div>
    </div>
  </div>
</div>

When I use the following XPath to extract "The text I want", I get nothing:

XPATH_PRODUCT_NAME = "/div/div/div/div/div[contains(@class,'item__name')]/a/text()"
products[0].xpath(XPATH_PRODUCT_NAME).extract()

The output is empty. Why?

score 0 · Accepted Answer

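Try the following code. The leading "." makes the XPath relative to products[0]. A path that starts with "/" is evaluated from the document root, even when it is called on a nested selector, so it matches nothing here.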

XPATH_PRODUCT_NAME = ".//div[@class='item__name']/a[@class='item__name-link']/text()"
products[0].xpath(XPATH_PRODUCT_NAME).extract()
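
To check the relative expression against the item markup from the question without a running spider, here is a minimal sketch using only the Python standard library. It assumes that the snippet shown in the question is exactly one element of products, and it uses xml.etree.ElementTree as a stand-in for Scrapy's selector. ElementTree supports only a subset of XPath, so text() is replaced by the .text attribute and contains() by an exact class match. The names ITEM_HTML, product, names, step_names and too_deep are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# Item markup copied from the question; assumed to correspond to products[0].
ITEM_HTML = """
<div data-pageindex="1" data-guid="13157582" class="col ">
  <div class="item item-card item-card--static">
    <div class="item-card__inner">
      <div class="item__image item__image--overlay">
        <a href="/www.something.anywhere?ref_gr=9801" class="ratio_custom" style="padding-bottom:100%">
        </a>
      </div>
      <div class="item__text-container">
        <div class="item__name">
          <a class="item__name-link" href="/c.aspx?ref_gr=9801">The text I want</a>
        </div>
      </div>
    </div>
  </div>
</div>
"""

# Stand-in for the selected product element (not a Scrapy Selector).
product = ET.fromstring(ITEM_HTML.strip())

# Relative search anywhere below the product element, as in the answer.
links = product.findall(".//div[@class='item__name']/a[@class='item__name-link']")
names = [link.text for link in links]

# Explicit child steps: the item__name div is the fourth div level
# below the product element.
step_links = product.findall("./div/div/div/div[@class='item__name']/a")
step_names = [link.text for link in step_links]

# Five div steps, as in the question's path, go one level too deep
# and match nothing even when the path is made relative.
too_deep = product.findall("./div/div/div/div/div[@class='item__name']/a")

print(names)
print(step_names)
print(too_deep)
```

The last query suggests that the question's path would need one div step fewer in addition to the leading ".", since the product div itself is the starting point and should not be counted. In Scrapy, the same relative expressions can be passed to products[0].xpath(...) with /text() appended.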
