Earlier I had some trouble writing the information I wanted from a list to a CSV file. I got some help here and managed to do what I wanted with the following code:
import csv
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
from webdriver_manager.chrome import ChromeDriverManager
# open target
driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get('https://www.exampleurl.com/facts')
# define xpath
def get_elements_by_xpath(driver, xpath):
    return [entry.text for entry in driver.find_elements_by_xpath(xpath)]
# Xpaths
text_entries = [
("Fact 1", "//div[@class='fact' and contains(span, '')][1]"),
("Fact 2", "//div[@class='fact' and contains(span, '')][2]"),
]
# Print facts to csv
with open('facts.csv', 'a') as f:
    csv_output = csv.writer(f)
    entries = []
    for name, xpath in text_entries:
        entries.append(get_elements_by_xpath(driver, xpath))
    csv_output.writerows(zip(*entries))
Now I want to use the same code but write the output to Google Sheets instead. I tried to solve it with pandas, using the code below:
import pygsheets
import pandas as pd
import csv
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
from webdriver_manager.chrome import ChromeDriverManager
# Open target
driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get('https://www.exampleurl.com/facts')
# Define xpath
def get_elements_by_xpath(driver, xpath):
    return [entry.text for entry in driver.find_elements_by_xpath(xpath)]
# Xpaths
text_entries = [
("Fact 1", "//div[@class='fact' and contains(span, '')][1]"),
("Fact 2", "//div[@class='fact' and contains(span, '')][2]"),
]
# Print facts to Gsheets
gc = pygsheets.authorize(service_file='/users/Username/desktop/map/filekey.json')
df = pd.DataFrame(entries)
sh = gc.open('apts')
wks = sh[0]
wks.set_dataframe(entries,(1,1))
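This code does write one fact for one div (even though there are two facts and hundreds of divs using the same class). Writing to CSV gives me all the facts, correctly matched up and written out. Also, every time I run the scrape it overwrites the sheet instead of adding new rows to the Gsheet...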
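For comparison, here is a minimal sketch of the behaviour I am after. It rebuilds `entries` the same way the CSV version does and appends the rows instead of writing from cell (1, 1). It assumes `wks` is a pygsheets `Worksheet` and that `append_table` is a suitable call for adding rows below the existing data. The helper names `columns_to_rows` and `append_facts` are made up for illustration, and I have not confirmed this is the recommended approach.

```python
def columns_to_rows(columns):
    """Transpose per-XPath result lists into one row per div.

    Mirrors zip(*entries) from the CSV version: the first item of every
    column becomes row 1, the second item becomes row 2, and so on.
    """
    return [list(row) for row in zip(*columns)]


def append_facts(wks, columns):
    """Append the scraped rows to the worksheet and return them.

    `wks` is assumed to be a pygsheets Worksheet (hypothetical usage);
    append_table is used so existing rows are kept rather than overwritten.
    """
    rows = columns_to_rows(columns)
    if rows:  # nothing scraped, nothing to append
        wks.append_table(values=rows, dimension="ROWS", overwrite=False)
    return rows


# Intended usage with the names from the script above (not run here):
# entries = [get_elements_by_xpath(driver, xpath) for _, xpath in text_entries]
# append_facts(sh[0], entries)
```

Note that in my pandas attempt `entries` is never filled by a loop, and `set_dataframe` is handed `entries` rather than `df`, which may be part of why only one fact shows up.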
Any help or ideas would be greatly appreciated!