
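I have a tricky problem to solve: I have a list of Instagram users, and I need to extract the "following" list of each account. To simplify the requests, I am using a Python module called "igramscraper". I structured the script like this: I wrote two functions (one extracts the followings, the other stores them in a database) and a for loop that iterates over the usernames and calls both functions for each user. I put a time.sleep inside the loop over users. In the extraction code, I first check whether the account still exists, then make a request to find out whether the account is private, and then, if account.is_private == False, I extract its followings.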

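As I said, I put a sleep of about 2 minutes in the loop over users, there is another time.sleep between the is-private request and the following request, and finally, if I get a 429 Too Many Requests error, a try/except sleeps for about 2 hours.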
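The fixed 2-hour sleep on a 429 can also be written as a small retry helper that waits longer after each consecutive failure. The following is only a minimal sketch under stated assumptions: `TooManyRequests` is a hypothetical stand-in for whatever exception the scraper raises on HTTP 429, and `call_with_backoff`, `base_delay`, and `max_delay` are illustrative names and values, not part of igramscraper.

```python
import random
import time


class TooManyRequests(Exception):
    """Hypothetical stand-in for the error raised on an HTTP 429 response."""


def call_with_backoff(func, *args, max_retries=5, base_delay=60.0,
                      max_delay=7200.0, sleep=time.sleep, **kwargs):
    """Call func; on TooManyRequests, wait and retry with a growing delay.

    The delay doubles on every failed attempt (capped at max_delay) and is
    randomized so that retries do not follow a fixed, predictable rhythm.
    """
    for attempt in range(max_retries + 1):
        try:
            return func(*args, **kwargs)
        except TooManyRequests:
            if attempt == max_retries:
                raise  # give up after the last allowed retry
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(delay / 2, delay))
```

With these assumed defaults the waits grow from roughly 1 minute to a cap of 2 hours. The `sleep` parameter exists only so the helper can be exercised without actually waiting.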

The problem is that even when I wait 2 hours, Instagram only lets the first 100 to 150 requests through, and after that it rejects my request every time I try.
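If the cutoff really sits somewhere around 100 requests, one option is to count requests and pause before reaching it, rather than reacting only after a 429. Below is a minimal sketch of such a budget. The limit of 100 and the one-hour window are assumptions made for illustration (Instagram's real limits are not known here), and `RequestBudget` is a hypothetical helper, not an igramscraper feature.

```python
import time


class RequestBudget:
    """Allow at most `limit` calls per `window` seconds, then wait out the window."""

    def __init__(self, limit=100, window=3600.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.limit = limit      # assumed request cap per window
        self.window = window    # assumed window length in seconds
        self.clock = clock
        self.sleep = sleep
        self.window_start = clock()
        self.used = 0

    def acquire(self):
        """Call once before every request; blocks when the budget is spent."""
        now = self.clock()
        if now - self.window_start >= self.window:
            # The window has elapsed on its own: start a fresh one.
            self.window_start = now
            self.used = 0
        if self.used >= self.limit:
            # Budget spent: sleep for the remainder of the window.
            self.sleep(self.window - (now - self.window_start))
            self.window_start = self.clock()
            self.used = 0
        self.used += 1
```

Usage would be a single `budget.acquire()` before each call that hits Instagram, for example before both the account lookup and the following request.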

Is there any method or advice to avoid this problem? The code is below:

import time

import pandas as pd
from igramscraper.instagram import Instagram
from pymongo import MongoClient


def getUserFollowing(username):
    """Return the verified accounts that `username` follows, or None if skipped."""
    # First request: check that the account still exists.
    try:
        account = instagram.get_account(username)
    except Exception as e:
        print("account doesn't exist or some error occurred.. " + str(e))
        return None

    if account.is_private:
        print("account is private. Skipping...")
        return None

    # Second request: fetch the following list, 30 accounts per page.
    try:
        following = instagram.get_following(
            account.identifier, account.follows_count, 30, delayed=True)
    except Exception as e:
        # Only one attempt: retrying immediately in a `while True` loop
        # would keep sending requests after an error such as a 429.
        print(e)
        return None

    # Keep only the verified accounts, as in the original filter.
    followings = [u.username for u in following['accounts'] if u.is_verified]
    print('following scraped successfully.')
    return followings


def queryUserFollowing(username, followings):
    """Insert one document per user; skip users with nothing to store."""
    if not followings:
        print('not inserted: account skipped or no verified followings')
        return
    try:
        user_data.insert_one({"username": username, "following": followings})
        print('data added.')
    except Exception as e:
        print('Something went wrong in querying: ' + str(e))


df = pd.read_csv('/home/rootanalytics/Scrivania/follower.csv')

instagram = Instagram()

client = MongoClient()
db = client['DataUsers']
user_data = db['user-following']


def loginOne():
    usm = 'username'
    password = 'pwd'
    instagram.with_credentials(usm, password)
    return instagram.login()


loginOne()

# The usernames are in the first column of the CSV.
for username in df.iloc[:, 0]:
    followings = getUserFollowing(username)
    queryUserFollowing(username, followings)
    time.sleep(120)  # the roughly 2-minute pause between users