python - How do I delete/replace text with Selenium in Python?

Question

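I am learning web scraping and have managed to extract data from a web page into an Excel file. However, probably because the item names contain ",", each item name gets split across multiple columns in the Excel file.

I tried to use strip and replace on the elements in the list, but it returns the error message: AttributeError: 'WebElement' object has no attribute 'replace'.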

item = driver.find_elements_by_xpath('//h2[@class="list_title"]')
item = [i.replace(",","") for i in item]
price = driver.find_elements_by_xpath('//div[@class="ads_price"]')
price = [p.replace("rm","") for p in price]

Expected result in the Excel file: [image: expected]
Actual result: [image: actual]

score 0 · Accepted Answer

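The part of the code you included in the question is not the part related to the problem you are having.

As CMMCD mentioned, for simplicity I would also suggest skipping the binary Excel format and using the built-in csv library instead. It quotes any field that contains the delimiter, so a stray comma will not split your cells.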

from csv import writer

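# Your data should be a list of rows, each row a list of cell values
data = [['product1', 8.0], ['product2', 12.25]]  # example data

# newline='' lets the csv module control line endings (avoids blank rows on Windows)
with open('your_output_file.csv', 'w', newline='') as file: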
    mywriter = writer(file)
    for line in data:
        mywriter.writerow(line)

Documentation: https://docs.python.org/3/library/csv.html
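To illustrate why the csv module sidesteps the comma problem, here is a minimal sketch that writes a row whose name contains a comma and reads it back. The sample rows are made up, and an in-memory io.StringIO buffer stands in for a real file, so treat it as an illustration rather than the asker's actual pipeline.

```python
import csv
import io

# Hypothetical sample data: the first item name contains a comma.
rows = [["Table, wooden", "120"], ["Chair", "45"]]

# In-memory buffer used here in place of a real file on disk.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)  # the field containing a comma is wrapped in double quotes

# Reading the text back yields the original cells, with the comma intact.
read_back = list(csv.reader(io.StringIO(csv_text)))
print(read_back)
```

Because the writer quotes the field itself, there is no need to strip commas from the item names unless you want them gone for other reasons.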

score 0 · Accepted Answer

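find_elements_by_xpath returns a list of WebElement objects, not strings. You need to get each element's text as a string (for example through its .text attribute) before you can call replace on it.

Depending on your use case, you may want to reconsider using Excel as a storage medium, unless this is the last step of your process.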
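As a rough sketch of that idea: read the .text attribute of each element first, then call replace on the resulting string. FakeElement and clean_texts below are hypothetical names introduced only so the example runs without a browser; with a real driver you would pass in the list returned by find_elements_by_xpath instead.

```python
from dataclasses import dataclass


@dataclass
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement; only .text is modelled."""
    text: str


def clean_texts(elements, unwanted):
    """Return each element's text with `unwanted` removed and whitespace trimmed."""
    return [el.text.replace(unwanted, "").strip() for el in elements]


# With a real driver this would look like, for example:
#   items = clean_texts(driver.find_elements_by_xpath('//h2[@class="list_title"]'), ",")
items = clean_texts([FakeElement("Table, wooden"), FakeElement("Chair")], ",")
prices = clean_texts([FakeElement("rm 120"), FakeElement("rm 45")], "rm")
print(items)
print(prices)
```

The key point is that replace is a str method, so it has to be applied to el.text rather than to the element object itself.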
