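There is no need to use Selenium or to iterate over URLs. All of this data is available through the API. All you need to do is find out how many reviews there are, so you know how many pages to iterate through (each page returned in the JSON holds at most 1000 reviews).
Then just use pandas to dump the data into a dataframe and write it straight to disk.
Code: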
import math

import pandas as pd
import requests

request_url = 'https://company-profiles-api.cloud.seek.com.au/v1/companies/432298/reviews'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36'}

# Get the total number of reviews to calculate how many pages to iterate through
payload = {
    'page': '1',
    'perPage': '1000'}
data = requests.get(request_url, headers=headers, params=payload).json()
total_reviews = data['paging']['total']
total_pages = math.ceil(total_reviews / 1000)

# Collect one dataframe per page
frames = []

# Iterate through pages and flatten each page's JSON records into a dataframe
for page in range(1, total_pages + 1):
    payload = {
        'page': str(page),
        'perPage': '1000'}
    data = requests.get(request_url, headers=headers, params=payload).json()
    # pd.json_normalize is the top-level function in pandas 1.0 and later
    frames.append(pd.json_normalize(data['data']))

# Combine all pages into one dataframe (an empty one if there were no reviews)
results = pd.concat(frames, sort=True).reset_index(drop=True) if frames else pd.DataFrame()
results.to_csv('title.csv', index=False)
Result:
print(results.head(10).to_string())
companyName companyRecommended cons crowdflowerScore id isAnonymized jobTitle normalizedCfScore pros ratingBenefitsAndPerks ratingCareerOpportunity ratingCompanyOverall ratingDiversity ratingExecutiveManagement ratingStressLevel ratingWorkEnvironment ratingWorkLifeBalance reviewCreatedAt reviewCreatedTimeAgoText reviewResponse reviewResponseBy reviewResponseCreatedAt reviewResponseCreatedTimeAgoText reviewResponseForeignUserId roleProximityScore salarySummary salarySummaryDisplayText score timeAgoText title workLocation yearLeft yearLeftEmploymentStatusText yearsWorkedWith yearsWorkedWithText
0 Telstra True Layers of inefficient processes and business ... 3.0 5318069 False NBN Specialist 2000.0 Good work life balance and great office facili... 5 2 3 4.0 2 None 4 4 2019-07-12T03:27:46Z 2 months ago None None 0.002 fair Average 3.056594 None Telstra is a company with great values and the... Brisbane QLD, Australia left_2019 former employee 7_8_years 7 to 8 years
1 Telstra True The company is currently in a re-structure mod... 3.0 5317669 False Business Analyst 2000.0 High daily rate, modern office, my management ... 5 5 3 5.0 2 None 3 5 2019-07-07T20:57:28Z 2 months ago None None 0.002 generous High 3.056594 None It's still a great place is work, I've met a g... Sydney NSW, Australia left_2019 former employee 0_1_year Less than 1 year
2 Telstra False So many broken process which consume colossal ... 3.0 5316942 False Business Analyst 2000.0 You are empowered to deliver and make choices.... 1 1 3 5.0 2 None 2 5 2019-06-28T02:34:04Z 2 months ago None None 0.002 below Low 3.056594 None The learning curve has been very steep for me ... Melbourne VIC, Australia still_work_here current employee 3_4_years 3 to 4 years
3 Telstra False The company is driven by continuous short term... 3.0 5315055 False Senior Network Engineer 2000.0 Great talented people to work with and awesome... 5 2 2 5.0 1 None 2 1 2019-06-15T12:55:27Z 3 months ago None None 0.002 fair Average 3.056594 None Great place if you are single and starting out... Melbourne VIC, Australia left_2019 former employee 12_years_above Over 12 years
4 Telstra True Very hierarchal, too much red tape 3.0 5304650 False Account Executive 2000.0 Great opportunities to work anywhere in Austra... 5 5 4 5.0 4 None 4 3 2019-05-07T08:02:40Z 3 months ago None None 0.002 generous High 3.056594 None Great flexible working tools Cairns QLD, Australia left_2017 former employee 9_10_years 9 to 10 years
5 Telstra True Highly regulated environment and working towar... 3.0 5303808 False Operations Support 2000.0 Great career development. You get rewarded for... 5 5 4 5.0 4 None 4 4 2019-05-07T02:21:06Z 3 months ago None None 0.002 fair Average 3.056594 None Would definitely recommend Telstra to a friend... Adelaide SA, Australia None former employee None None
6 Telstra True Personal development for non IT/Engineering/no... 3.0 5307953 False Program Management 2000.0 I have thoroughly enjoyed working with Telstra... 3 3 4 4.0 4 None 5 5 2019-05-07T02:04:42Z 3 months ago None None 0.002 fair Average 3.056594 None Great people, great work and progressive think... Australia None former employee None None
7 Telstra False Executive would not make a decision on whether... 3.0 5307425 False Senior Customer Service Representative 2000.0 Opportunities were available outside the work ... 2 4 3 4.0 3 None 3 3 2019-05-06T13:43:00Z 3 months ago None None 0.002 below Low 3.056594 None Can't trust corporate Perth WA, Australia left_2015 former employee 1_2_years 1 to 2 years
8 Telstra False Current restructuring is so demoralising and m... 3.0 5299777 False Customer Service Role 2000.0 Opportunity to work in any job regardless of w... 2 2 2 4.0 1 None 2 2 2019-05-06T10:26:29Z 3 months ago None None 0.002 fair Average 3.056594 None Working through the restructure and trying to ... Sydney NSW, Australia left_2018 former employee 3_4_years 3 to 4 years
9 Telstra True Clients/customers. They need your attention wh... 3.0 5298991 False Sales Consultant 2000.0 Enjoyed working with a tight team. Programs be... 4 3 4 5.0 5 None 5 2 2019-05-06T09:31:18Z 3 months ago None None 0.002 fair Average 3.056594 None Would do it all over again in a heartbeat. Brisbane QLD, Australia left_2018 former employee 1_2_years 1 to 2 years
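The paging arithmetic in the code above can also be looked at in isolation. The following is only a minimal sketch: the helper name `page_payloads` is hypothetical (it is not part of the API or of the code above), and the review count of 2350 is a made-up figure used purely for illustration. It assumes the same `page` and `perPage` query parameters used in the request above.

```python
import math


def page_payloads(total_reviews, per_page=1000):
    """Yield the query parameters for each page needed to cover total_reviews.

    Hypothetical helper: it only builds the 'page'/'perPage' dictionaries
    and makes no network requests.
    """
    total_pages = math.ceil(total_reviews / per_page)
    for page in range(1, total_pages + 1):
        yield {'page': str(page), 'perPage': str(per_page)}


# Illustrative figure only: 2350 reviews at 1000 per page need three requests.
payloads = list(page_payloads(2350))
print(len(payloads))   # 3
print(payloads[-1])    # {'page': '3', 'perPage': '1000'}
```

Each yielded dictionary could then be passed as `params` to `requests.get`, in the same way as `payload` in the loop above.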