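I previously had some trouble writing the information I wanted from a list to a CSV file. I got some help here and, with the following code, managed to do what I wanted: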

import csv
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
from webdriver_manager.chrome import ChromeDriverManager

# open target
driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get('https://www.exampleurl.com/facts')

# define xpath
def get_elements_by_xpath(driver, xpath):
    return [entry.text for entry in driver.find_elements_by_xpath(xpath)]

# Xpaths
text_entries = [
    ("Fact 1", "//div[@class='fact' and contains(span, '')][1]"),
    ("Fact 2", "//div[@class='fact' and contains(span, '')][2]"),
    ]

# Print facts to csv
with open('facts.csv', 'a', newline='') as f:
    csv_output = csv.writer(f)
    entries = []
    for name, xpath in text_entries:
        entries.append(get_elements_by_xpath(driver, xpath))
    csv_output.writerows(zip(*entries))

But now I want to use the same code and write the output to Google Sheets instead. I tried to solve it using pandas with the code below:

import pygsheets
import pandas as pd
import csv
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
from webdriver_manager.chrome import ChromeDriverManager

# Open target
driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get('https://www.exampleurl.com/facts')

# Define xpath
def get_elements_by_xpath(driver, xpath):
    return [entry.text for entry in driver.find_elements_by_xpath(xpath)]

# Xpaths
text_entries = [
    ("Fact 1", "//div[@class='fact' and contains(span, '')][1]"),
    ("Fact 2", "//div[@class='fact' and contains(span, '')][2]"),
    ]

# Print facts to Gsheets
gc = pygsheets.authorize(service_file='/users/Username/desktop/map/filekey.json')
df = pd.DataFrame(entries)
sh = gc.open('apts')
wks = sh[0]
wks.set_dataframe(entries,(1,1))

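This code only writes one fact for one div (even though there are two facts and hundreds of divs using the same class). Writing to CSV does give me all the facts, correctly matched up. It also overwrites the sheet every time I run the scrape instead of appending new rows to the Google Sheet...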

Any help or ideas would be greatly appreciated!
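One possible direction, offered as a rough sketch rather than a tested fix: the Google Sheets snippet as posted never runs the loop that fills `entries`, and it passes `entries` (not `df`) to `set_dataframe` at cell `(1,1)`, which would write at the top-left corner on every run. The sketch below transposes the per-XPath lists into one row per div, the way `zip(*entries)` does in the CSV version, and then appends those rows. The helper names `columns_to_rows` and `append_rows` are hypothetical, the sample data stands in for the scraped lists, and the code assumes `worksheet` is a pygsheets `Worksheet` whose `append_table` method adds rows below the existing data.

```python
def columns_to_rows(columns):
    """Turn one list per XPath into one row per div, like zip(*entries) in the CSV code."""
    return [list(row) for row in zip(*columns)]


def append_rows(worksheet, rows):
    """Append rows below whatever is already in the sheet.

    Assumption: `worksheet` behaves like a pygsheets Worksheet, i.e. it has
    append_table(values, start, end, dimension, overwrite).
    """
    if rows:  # nothing scraped, nothing to send
        worksheet.append_table(values=rows, start="A1", end=None,
                               dimension="ROWS", overwrite=False)


# Stand-in data: in the real script this would be built by the same loop as the
# CSV version, i.e. entries.append(get_elements_by_xpath(driver, xpath)).
entries = [
    ["fact 1 of div A", "fact 1 of div B", "fact 1 of div C"],
    ["fact 2 of div A", "fact 2 of div B", "fact 2 of div C"],
]
rows = columns_to_rows(entries)

# With a real sheet (not executed here), the last step might look like:
#   gc = pygsheets.authorize(service_file='/users/Username/desktop/map/filekey.json')
#   wks = gc.open('apts')[0]
#   append_rows(wks, rows)
```

Sending plain lists of rows avoids pandas entirely for this step. If a DataFrame is still wanted, it would probably need to be built from `rows` rather than from `entries`, so that each div ends up on its own row.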
