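I am trying to scrape company reviews from Seek.com. The problem is that when I try to extract the review titles, the script fails with an error after two or three titles, and the error message is different each time. The code is below: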

from bs4 import BeautifulSoup
import requests
from csv import writer

response = requests.get('https://www.seek.com.au/companies/telstra-432298/reviews')

soup = BeautifulSoup(response.text,'html.parser')

links = soup.find_all('a', attrs ={'data-automation' :'reviewCard'})
hrefs = [link['href'] for link in links]
# print(hrefs)

with open('title.csv', 'w') as csv_file:
    csv_writer = writer(csv_file)
    csv_writer.writerow("title")
    for href in hrefs:
        print("something")
        pages= requests.get('https://www.seek.com.au' + href)
        soup2= BeautifulSoup(pages.text, 'html.parser')
        title = soup2.find_all(class_='_3FrNV7v HfVIlOd E6m4BZb')
        csv_writer.writerow(title)

I can't figure out how to extract only that information from the page, or why the errors keep appearing.

1 Answer

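There is no need to use Selenium or to iterate through the URLs. All of this is available through the API. All you need to do is find out how many reviews there are, so that you know how many pages to iterate through (the JSON response returns at most 1000 reviews per page).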

Then use pandas to load the data into a dataframe and write it straight to disk.
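As a quick illustration of the page-count arithmetic described above, here is a minimal sketch that runs without any network access. The helper name `build_page_payloads` is hypothetical (it is not part of the Seek API or of the answer's code); only the `page` and `perPage` parameter names are taken from the request used in this answer.

```python
import math


def build_page_payloads(total_reviews, per_page=1000):
    """Return one query-parameter dict per page needed to cover total_reviews.

    Hypothetical helper for illustration; 'page' and 'perPage' are the
    parameter names used in the request in this answer.
    """
    # Round up so a partially filled last page is still requested
    total_pages = math.ceil(total_reviews / per_page)
    return [
        {'page': str(page), 'perPage': str(per_page)}
        for page in range(1, total_pages + 1)
    ]


# Example: 2350 reviews at 1000 per page need three requests
payloads = build_page_payloads(2350)
print(len(payloads))
```

Rounding up with `math.ceil` matters here: plain integer division would drop the last, partially filled page.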

Code:

import math

import pandas as pd
import requests

request_url = 'https://company-profiles-api.cloud.seek.com.au/v1/companies/432298/reviews'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36'}
per_page = 1000  # page size requested from the API


# Get the total number of reviews to calculate how many pages to iterate through
payload = {'page': '1', 'perPage': str(per_page)}

data = requests.get(request_url, headers=headers, params=payload).json()
total_reviews = data['paging']['total']
total_pages = math.ceil(total_reviews / per_page)

# Collect one dataframe per page
frames = []

# Iterate through pages and flatten each page's JSON into a dataframe
for page in range(1, total_pages + 1):
    payload = {'page': str(page), 'perPage': str(per_page)}

    data = requests.get(request_url, headers=headers, params=payload).json()
    # pd.json_normalize replaces the deprecated pandas.io.json.json_normalize
    frames.append(pd.json_normalize(data['data']))

# DataFrame.append was removed in pandas 2.0, so concatenate once instead;
# sort_index(axis=1) keeps the columns in alphabetical order
results = pd.concat(frames, ignore_index=True).sort_index(axis=1) if frames else pd.DataFrame()

results.to_csv('title.csv', index=False)

Result:

print (results.head(10).to_string())
  companyName  companyRecommended                                               cons  crowdflowerScore       id  isAnonymized                                jobTitle  normalizedCfScore                                               pros  ratingBenefitsAndPerks  ratingCareerOpportunity  ratingCompanyOverall  ratingDiversity  ratingExecutiveManagement ratingStressLevel  ratingWorkEnvironment  ratingWorkLifeBalance       reviewCreatedAt reviewCreatedTimeAgoText reviewResponse reviewResponseBy reviewResponseCreatedAt reviewResponseCreatedTimeAgoText reviewResponseForeignUserId  roleProximityScore salarySummary salarySummaryDisplayText     score timeAgoText                                              title              workLocation         yearLeft yearLeftEmploymentStatusText yearsWorkedWith yearsWorkedWithText
0     Telstra                True  Layers of inefficient  processes and business ...               3.0  5318069         False                          NBN Specialist             2000.0  Good work life balance and great office facili...                       5                        2                     3              4.0                          2              None                      4                      4  2019-07-12T03:27:46Z             2 months ago                                                    None                                                         None               0.002          fair                  Average  3.056594        None  Telstra is a company with great values and the...   Brisbane QLD, Australia        left_2019              former employee       7_8_years        7 to 8 years
1     Telstra                True  The company is currently in a re-structure mod...               3.0  5317669         False                        Business Analyst             2000.0  High daily rate, modern office, my management ...                       5                        5                     3              5.0                          2              None                      3                      5  2019-07-07T20:57:28Z             2 months ago                                                    None                                                         None               0.002      generous                     High  3.056594        None  It's still a great place is work, I've met a g...     Sydney NSW, Australia        left_2019              former employee        0_1_year    Less than 1 year
2     Telstra               False  So many broken process which consume colossal ...               3.0  5316942         False                        Business Analyst             2000.0  You are empowered to deliver and make choices....                       1                        1                     3              5.0                          2              None                      2                      5  2019-06-28T02:34:04Z             2 months ago                                                    None                                                         None               0.002         below                      Low  3.056594        None  The learning curve has been very steep for me ...  Melbourne VIC, Australia  still_work_here             current employee       3_4_years        3 to 4 years
3     Telstra               False  The company is driven by continuous short term...               3.0  5315055         False                 Senior Network Engineer             2000.0  Great talented people to work with and awesome...                       5                        2                     2              5.0                          1              None                      2                      1  2019-06-15T12:55:27Z             3 months ago                                                    None                                                         None               0.002          fair                  Average  3.056594        None  Great place if you are single and starting out...  Melbourne VIC, Australia        left_2019              former employee  12_years_above       Over 12 years
4     Telstra                True                 Very hierarchal, too much red tape               3.0  5304650         False                       Account Executive             2000.0  Great opportunities to work anywhere in Austra...                       5                        5                     4              5.0                          4              None                      4                      3  2019-05-07T08:02:40Z             3 months ago                                                    None                                                         None               0.002      generous                     High  3.056594        None                       Great flexible working tools     Cairns QLD, Australia        left_2017              former employee      9_10_years       9 to 10 years
5     Telstra                True  Highly regulated environment and working towar...               3.0  5303808         False                      Operations Support             2000.0  Great career development. You get rewarded for...                       5                        5                     4              5.0                          4              None                      4                      4  2019-05-07T02:21:06Z             3 months ago                                                    None                                                         None               0.002          fair                  Average  3.056594        None  Would definitely recommend Telstra to a friend...    Adelaide SA, Australia             None              former employee            None                None
6     Telstra                True  Personal development for non IT/Engineering/no...               3.0  5307953         False                      Program Management             2000.0  I have thoroughly enjoyed working with Telstra...                       3                        3                     4              4.0                          4              None                      5                      5  2019-05-07T02:04:42Z             3 months ago                                                    None                                                         None               0.002          fair                  Average  3.056594        None  Great people, great work and progressive think...                 Australia             None              former employee            None                None
7     Telstra               False  Executive would not make a decision on whether...               3.0  5307425         False  Senior Customer Service Representative             2000.0  Opportunities were available outside the work ...                       2                        4                     3              4.0                          3              None                      3                      3  2019-05-06T13:43:00Z             3 months ago                                                    None                                                         None               0.002         below                      Low  3.056594        None                              Can't trust corporate       Perth WA, Australia        left_2015              former employee       1_2_years        1 to 2 years
8     Telstra               False  Current restructuring is so demoralising and m...               3.0  5299777         False                   Customer Service Role             2000.0  Opportunity to work in any job regardless of w...                       2                        2                     2              4.0                          1              None                      2                      2  2019-05-06T10:26:29Z             3 months ago                                                    None                                                         None               0.002          fair                  Average  3.056594        None  Working through the restructure and trying to ...     Sydney NSW, Australia        left_2018              former employee       3_4_years        3 to 4 years
9     Telstra                True  Clients/customers. They need your attention wh...               3.0  5298991         False                        Sales Consultant             2000.0  Enjoyed working with a tight team. Programs be...                       4                        3                     4              5.0                          5              None                      5                      2  2019-05-06T09:31:18Z             3 months ago                                                    None                                                         None               0.002          fair                  Average  3.056594        None        Would do it all over again in a heartbeat.    Brisbane QLD, Australia        left_2018              former employee       1_2_years        1 to 2 years
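
Since the question only asks for the review titles, the dataframe can be narrowed to the `title` column before writing. The following is a minimal sketch, assuming the dataframe has a `title` column as in the output above. `sample_results` and `titles_only` are hypothetical names used for illustration, and the two rows are copied from rows 4 and 7 of that output so the snippet runs without calling the API.

```python
import pandas as pd

# Hypothetical stand-in for the `results` dataframe; values copied from the output above
sample_results = pd.DataFrame([
    {'id': 5304650, 'jobTitle': 'Account Executive', 'title': 'Great flexible working tools'},
    {'id': 5307425, 'jobTitle': 'Senior Customer Service Representative', 'title': "Can't trust corporate"},
])


def titles_only(df):
    """Return a one-column dataframe holding only the review titles."""
    return df[['title']]


titles = titles_only(sample_results)

# With no path argument, to_csv returns the CSV text instead of writing a file
csv_text = titles.to_csv(index=False)
print(csv_text)
```

To write a file instead, something like `titles_only(results).to_csv('title.csv', index=False)` should work on the real dataframe.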
Answered 2019-09-18T14:20:15.340