I am building an Amazon scraper, but I have a problem with my loop. Here is my code:

import csv
import os

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains

# Read the product URLs from the 'IDNumber' column of the spreadsheet
os.chdir(r'C:\Users\ACER\Desktop\test sheet')
df = pd.read_excel('test.xlsx')

path = r'C:\Program Files (x86)\chromedriver.exe'
driver = webdriver.Chrome(path)


for i in df.index:
    sheet = df.loc[i]
    driver.get((sheet['IDNumber']))
    title = driver.find_element_by_xpath('//*[@id="productTitle"]').text
    print(title)
    try:
        price = driver.find_element_by_xpath('//*[@id="priceblock_ourprice"]').text
        print(price)
    except:
        price = ''
    try:
        category = driver.find_element_by_xpath('//*[@id="wayfinding-breadcrumbs_feature_div"]/ul/li[3]/span/a').text
        print(category)
    except:
        category = ''
    description = driver.find_element_by_xpath('//*[@id="productDescription"]').text
    print(description)
    for j in driver.find_elements_by_css_selector('#altImages .imageThumbnail'):
        hover = ActionChains(driver).move_to_element(j)
        hover.perform()
        i_link = driver.find_element_by_css_selector('.image.item.maintain-height.selected img').get_attribute('src')
        print(i_link)
    data = [[title, price, category, description, i_link]]
    a = pd.DataFrame(data)
    a.to_excel('a.xlsx')

Problems I am facing:

  1. I run the script for 2 ASINs, but the details of only 1 ASIN end up in the output file.

IDNumber column: https://www.amazon.ae/dp/B08J8181F9 https://www.amazon.ae/dp/B07MX9L4TR

  2. Title, price, category, description, and i_link (the main image) are written to Excel, but the secondary images are not.

Example: https://www.amazon.ae/dp/B07MX9L4TR

Details written to Excel: [screenshot]

**The details for https://www.amazon.ae/dp/B08J8181F9 are missing.** The second image for https://www.amazon.ae/dp/B07MX9L4TR is not written to the file.
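
Both symptoms are consistent with the loop overwriting its own output: `a.to_excel('a.xlsx')` runs on every iteration with a one-row DataFrame, and `i_link` is reassigned for every thumbnail, so only the last value survives. Below is a minimal sketch of the accumulate-then-write pattern that may help narrow this down. It is not the scraper itself: `scrape_page` is a hypothetical stand-in for the Selenium lookups and returns made-up data, and the output goes to an in-memory CSV rather than `a.xlsx`.

```python
import csv
import io


def scrape_page(url):
    """Hypothetical stand-in for the Selenium calls; returns made-up data."""
    return {
        "title": f"title for {url}",
        "images": [f"{url}/img1.jpg", f"{url}/img2.jpg"],
    }


urls = [
    "https://www.amazon.ae/dp/B08J8181F9",
    "https://www.amazon.ae/dp/B07MX9L4TR",
]

rows = []  # created once, before the loop, so every product is kept
for url in urls:
    page = scrape_page(url)
    image_links = []  # reset for each product
    for link in page["images"]:
        image_links.append(link)  # keep every image, not only the last one
    rows.append([url, page["title"], "; ".join(image_links)])

# Write once, after the loop, so earlier rows are not overwritten.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["url", "title", "images"])
writer.writerows(rows)
print(buffer.getvalue())
```

In the original code, the same idea would presumably mean appending `[title, price, category, description, image_links]` to a list inside the `for i in df.index:` loop and calling `pd.DataFrame(rows).to_excel('a.xlsx')` a single time after the loop ends.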
