I'm building an Amazon scraper, but I have a problem with a loop. Here is my code:

    import pandas as pd
    import os
    import csv
    from selenium.webdriver.common.action_chains import ActionChains
    os.chdir(r'C:\Users\ACER\Desktop\test sheet')
    df = pd.read_excel('test.xlsx')
    from selenium import webdriver
    path = r'C:\Program Files (x86)\chromedriver.exe'
    driver = webdriver.Chrome(path)
    for i in df.index:
        sheet = df.loc[i]
        driver.get((sheet['IDNumber']))
        title = driver.find_element_by_xpath('//*[@id="productTitle"]').text
        print(title)
        try:
            price = driver.find_element_by_xpath('//*[@id="priceblock_ourprice"]').text
            print(price)
        except:
            price = ''
        try:
            category = driver.find_element_by_xpath('//*[@id="wayfinding-breadcrumbs_feature_div"]/ul/li[3]/span/a').text
            print(category)
        except:
            category = ''
        description = driver.find_element_by_xpath('//*[@id="productDescription"]').text
        print(description)
        for j in driver.find_elements_by_css_selector('#altImages .imageThumbnail'):
            hover = ActionChains(driver).move_to_element(j)
            hover.perform()
            i_link = driver.find_element_by_css_selector('.image.item.maintain-height.selected img').get_attribute('src')
            print(i_link)
        data = [[title, price, category, description, i_link]]
        a = pd.DataFrame(data)
        a.to_excel('a.xlsx')

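Problems I'm facing:
- I run the script for 2 ASINs, but only 1 ASIN's details end up in the output file.
IDNumber: https://www.amazon.ae/dp/B08J8181F9 https://www.amazon.ae/dp/B07MX9L4TR
- The title, price, category, description, and i_link (the main image) are written to Excel, but the secondary images are not.
Example: https://www.amazon.ae/dp/B07MX9L4TR
**The details for https://www.amazon.ae/dp/B08J8181F9 are missing.** The second image of https://www.amazon.ae/dp/B07MX9L4TR is not saved.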
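
Both symptoms look consistent with values being overwritten on each pass: the output file is rewritten for every URL, and `i_link` holds only one image link at a time. A minimal sketch of the usual accumulate-then-write pattern follows. It is an illustration under assumptions, not a drop-in fix: `collect_rows`, `write_rows`, `fake_scrape`, and the `FIELDS` column names are hypothetical, the Selenium calls are replaced by a stand-in function so the sketch runs without a browser, and it writes CSV with the standard library rather than Excel.

```python
import csv

# Hypothetical column names, chosen for this sketch only.
FIELDS = ["url", "title", "price", "category", "description", "images"]


def collect_rows(urls, scrape_page):
    """Scrape every URL and return one row (dict) per page."""
    rows = []  # created once, before the loop, so earlier pages are kept
    for url in urls:
        page = scrape_page(url)  # hypothetical callable returning page fields
        image_links = []  # reset for each page, then filled per thumbnail
        for link in page["images"]:
            if link not in image_links:  # skip duplicates, keep order
                image_links.append(link)
        rows.append({
            "url": url,
            "title": page["title"],
            "price": page.get("price", ""),
            "category": page.get("category", ""),
            "description": page.get("description", ""),
            "images": " | ".join(image_links),
        })
    return rows


def write_rows(rows, path):
    """Write all rows in a single pass, after the loop has finished."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def fake_scrape(url):
    # Stand-in for the Selenium lookups; the values are made up.
    return {
        "title": "Item " + url[-10:],
        "price": "",
        "images": [url + "/1.jpg", url + "/2.jpg"],
    }
```

In the real script, `scrape_page` would wrap the `driver.get(...)` and element lookups, appending each `src` to the image list inside the thumbnail loop. Calling `write_rows(collect_rows(urls, fake_scrape), "a.csv")` once, after all URLs are processed, should then produce one row per ASIN with every collected image link.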
