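I wrote a script that reads URLs from a file and sends HTTP requests to all of them concurrently. I now want to limit the number of HTTP requests per second in a session, as well as the bandwidth per interface (eth0, eth1, etc.). Is there a way to achieve this in Python?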
Viewed 6708 times
3 Answers
3
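You can use a Semaphore object, which is part of the Python standard library (see the Python documentation for the threading module).
Or, if you want to work with threads directly, you can use wait([timeout]), as provided by synchronization objects such as threading.Event.
There is no library bundled with Python that works at the level of Ethernet or other network interfaces. The lowest level you can go to is socket.
Based on your reply, here is my suggestion. Note the call to active_count: use it only to test that your script really runs just two request threads. In this case it will report three, because the first thread is your script itself and the other two are the URL requests.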
import threading

import requests

# Limit the number of concurrent request threads.
pool = threading.BoundedSemaphore(2)


def worker(u):
    try:
        # Request the passed URL.
        r = requests.get(u)
        print(r.status_code)
    finally:
        # Release the semaphore slot for other threads, even if the request fails.
        pool.release()
    # Show the number of active threads.
    print(threading.active_count())


def req():
    # Get URLs from a text file, stripping whitespace and skipping empty lines.
    with open('urllist.txt') as f:
        urls = [url.strip() for url in f if url.strip()]
    for u in urls:
        # Blocks while the maximum number of request threads is already running.
        pool.acquire(blocking=True)
        # Create a new thread and pass the URL (the u parameter) to the worker function.
        t = threading.Thread(target=worker, args=(u,))
        # Start the newly created thread.
        t.start()


req()
answered 2014-09-29T11:41:29.473
0
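You can use the worker concept described in the documentation: https://docs.python.org/3.4/library/queue.html
Add a wait (for example time.sleep()) inside your workers so that they pause between requests (in the example from the documentation: inside the "while True" loop, after task_done).
Example: 5 worker threads with a waiting time of 1 second between requests will perform fewer than 5 fetches per second, because each worker needs the request time plus one second per fetch.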
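A minimal sketch of that idea, not taken from the linked documentation verbatim. The names run_workers, fetch, num_workers and delay are made up for illustration, and fetch stands in for whatever performs the actual HTTP request:

```python
import queue
import threading
import time


def run_workers(urls, fetch, num_workers=5, delay=1.0):
    """Process urls with num_workers threads; each worker sleeps
    `delay` seconds after every request (hypothetical helper)."""
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            url = tasks.get()
            if url is None:
                # Sentinel: no more work for this worker.
                tasks.task_done()
                break
            try:
                outcome = fetch(url)
            except Exception as exc:
                # Keep the worker alive and record the failure instead.
                outcome = exc
            with results_lock:
                results.append((url, outcome))
            tasks.task_done()
            # Pause between requests to cap this worker's request rate.
            time.sleep(delay)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for url in urls:
        tasks.put(url)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    tasks.join()
    for t in threads:
        t.join()
    return results
```

With the requests library this could be called as run_workers(urls, lambda u: requests.get(u, timeout=10).status_code). The upper bound is num_workers / delay requests per second; the real rate is lower because each request also takes time.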
answered 2014-09-29T12:35:22.387
0
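Note that the following solution still sends requests serially, but it limits the TPS (transactions per second).
TL;DR: there is a class that keeps count of the number of calls that can still be made in the current second. The count is decremented for every call that is made and refilled once per second.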
import time
from multiprocessing import Process, Value


# Naive TPS regulation.
# This class holds a bucket of tokens that is refilled every second based on the expected TPS.
class TPSBucket:

    def __init__(self, expected_tps):
        self.number_of_tokens = Value('i', 0)
        self.expected_tps = expected_tps
        # Process that constantly refills the TPS bucket.
        self.bucket_refresh_process = Process(target=self.refill_bucket_per_second)

    def refill_bucket_per_second(self):
        while True:
            print("refill")
            self.refill_bucket()
            time.sleep(1)

    def refill_bucket(self):
        with self.number_of_tokens.get_lock():
            self.number_of_tokens.value = self.expected_tps
        print('bucket count after refill', self.number_of_tokens.value)

    def start(self):
        self.bucket_refresh_process.start()

    def stop(self):
        # Process.kill() requires Python 3.7 or newer.
        self.bucket_refresh_process.kill()

    def get_token(self):
        response = False
        # Cheap unlocked check first, then re-check under the lock.
        if self.number_of_tokens.value > 0:
            with self.number_of_tokens.get_lock():
                if self.number_of_tokens.value > 0:
                    self.number_of_tokens.value -= 1
                    response = True
        return response


def test():
    tps_bucket = TPSBucket(expected_tps=1)  # Let's say I want to send 1 request per second
    tps_bucket.start()
    total_number_of_requests = 60  # Let's say I want to send 60 requests
    request_number = 0
    t0 = time.time()
    while True:
        if tps_bucket.get_token():
            request_number += 1
            print('Request', request_number)  # This is my request
            if request_number == total_number_of_requests:
                break
        else:
            # No token available: sleep briefly instead of spinning at full CPU.
            time.sleep(0.01)
    print(time.time() - t0, ' time elapsed')  # Some metrics to tell me how long everything took
    tps_bucket.stop()


if __name__ == "__main__":
    test()
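If the requests should stay concurrent rather than serial, the same per-second cap can be sketched with a lock shared between threads instead of a separate refill process. This is a different technique from the bucket above (evenly spaced time slots rather than a bucket refilled once per second), and the RateLimiter class and its wait method are hypothetical names chosen for illustration:

```python
import threading
import time


class RateLimiter:
    """Allow at most `rate` calls per second across all threads by
    handing out evenly spaced time slots (hypothetical helper)."""

    def __init__(self, rate):
        self.interval = 1.0 / rate
        self.lock = threading.Lock()
        self.next_slot = time.monotonic()

    def wait(self):
        # Reserve the next free slot under the lock, then sleep outside it
        # so other threads can reserve their own slots in the meantime.
        with self.lock:
            now = time.monotonic()
            slot = max(self.next_slot, now)
            self.next_slot = slot + self.interval
        delay = slot - now
        if delay > 0:
            time.sleep(delay)
```

Each worker thread would call limiter.wait() right before sending its request, so the total rate stays at or below the configured value no matter how many threads are running.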
answered 2020-08-28T14:31:24.103