I'd really like to know how much time each step in my script takes. The script below grabs stocks with earnings releases in the next 30 days, then grabs their current share prices, and finally pulls the other items I'm interested in from the yfinance API.
I run into all sorts of problems when I use the trange() progress tracker from the tqdm package. The script takes a very long time to run, and in the last block, which pulls fundamental and technical data from the API, the script repeats the request x times for each stock, where x is the total number of stocks in the Symbols list.
Can someone help me understand what's going wrong with the tqdm functionality I'm trying to incorporate?:
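For reference, here is a minimal, self-contained sketch of the progress-bar pattern I'm aiming for (the sample tickers are placeholders, not my real Symbols list). Wrapping the list itself with tqdm() keeps the loop variable as the symbol, instead of indexing with trange():

```python
import sys
from tqdm import tqdm

Symbols = ["AAPL", "MSFT", "GOOG"]  # placeholder tickers for illustration

processed = []
# tqdm() wraps the iterable directly, so each symbol is visited exactly once
# and the bar advances per symbol -- no separate index variable needed.
for symbol in tqdm(Symbols, file=sys.stdout, desc="2. Grabbing latest stock prices"):
    processed.append(symbol)
```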
import datetime
import pandas as pd
import time
import requests
import yfinance as yf
from tqdm import trange
import sys
StartTime = time.time()
#####################################################
### ###
### Grab Stocks with Earnings in Next 30 Days ###
### ###
#####################################################
CalendarDays = 30 #<-- specify the number of calendar days you want to grab earnings release info for
tables = [] #<-- initialize an empty list to store your tables
print('1. Grabbing companies with earnings releases in the next ' + str(CalendarDays) + ' days.')
# for i in trange(CalendarDays, file = sys.stdout, desc = '1. Grabbing companies with earnings releases in the next ' + str(CalendarDays) + ' days'):
for i in range(CalendarDays): #<-- Grabs earnings release info for the next x days on the calendar
try:
        date = (datetime.date.today() + datetime.timedelta(days = i)).isoformat() #<-- get the date i days out in ISO format
pd.set_option('display.max_column',None)
url = pd.read_html("https://finance.yahoo.com/calendar/earnings?day="+date, header=0)
table = url[0]
table['Earnings Release Date'] = date
tables.append(table) #<-- append each table into your list of tables
except ValueError:
continue
df = pd.concat(tables, ignore_index = True) #<-- take your list of tables into 1 final dataframe
df_unique = df.drop_duplicates(subset=['Symbol'], keep='first', ignore_index = True)
DataSet = df_unique.drop(['Reported EPS','Surprise(%)'], axis = 1)
Symbols = df_unique['Symbol'].to_list()
###################################
### ###
### Grab Latest Stock Price ###
### ###
###################################
print('2. Grabbing latest share prices for ' + str(len(Symbols)) + ' stocks.')
df_temp = pd.DataFrame()
# for i in trange(len(Symbols), file = sys.stdout, desc = '2. Grabbing latest stock prices'):
for symbol in Symbols:
try:
params = {'symbols': symbol,
'range': '1d',
'interval': '1d',
'indicators': 'close',
'includeTimestamps': 'false',
'includePrePost': 'false',
'corsDomain': 'finance.yahoo.com',
'.tsrc': 'finance'
}
url = 'https://query1.finance.yahoo.com/v7/finance/spark'
r = requests.get(url, params=params)
data = r.json()
Price = data['spark']['result'][0]['response'][0]['indicators']['quote'][0]['close'][0]
df_stock = pd.DataFrame({'Symbol' : [symbol],
'Current Price' : [Price]
})
        df_temp = pd.concat([df_temp, df_stock], ignore_index = True) #<-- DataFrame.append was removed in pandas 2.0
except KeyError:
continue
DataSet = pd.merge(DataSet, df_temp[['Symbol', 'Current Price']], on = 'Symbol', how = "left")
###########################################
### ###
### Grab Other Important Stock Info ###
### ###
###########################################
print('3. Grabbing stock fundamental and technical metrics.')
df_temp2 = pd.DataFrame()
# for i in trange(len(Symbols), file = sys.stdout, desc = 'Grabbing stock fundamental and technical metrics'):
for symbol in Symbols:
try:
Ticker = yf.Ticker(symbol).info
Sector = Ticker.get('sector')
Industry = Ticker.get('industry')
P2B = Ticker.get('priceToBook')
P2E = Ticker.get('trailingPE')
# print(symbol, Sector, Industry, P2B, P2E)
df_stock = pd.DataFrame({'Symbol' : [symbol],
'Sector' : [Sector],
'Industry' : [Industry],
'PriceToBook' : [P2B],
'PriceToEarnings' : [P2E],
})
        df_temp2 = pd.concat([df_temp2, df_stock], ignore_index = True) #<-- DataFrame.append was removed in pandas 2.0
    except KeyError:
        pass
DataSet = pd.merge(DataSet, df_temp2, on = 'Symbol', how = "left")
##############################################################################
##############################################################################
##############################################################################
ExecutionTime = (time.time() - StartTime)
print('Script is complete! This script took ' + str(round(ExecutionTime, 1)) + ' seconds to run.')
TodaysDate = datetime.date.today().isoformat()
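Since my goal is to see how long each action takes, one approach I've been considering (the helper name and structure are my own, not from tqdm or any library) is a small wrapper that times any callable and prints the elapsed time per section:

```python
import time

def timed(label, func, *args, **kwargs):
    # Hypothetical helper: run func with the given arguments, print how long
    # it took, and return its result unchanged.
    start = time.time()
    result = func(*args, **kwargs)
    print(label + ' took ' + str(round(time.time() - start, 1)) + ' seconds.')
    return result

# Example usage with a stand-in workload instead of a real API call:
total = timed('Summing', sum, range(1_000_000))
```

Each of the three blocks in the script could then be pulled into its own function and passed through this helper, giving a per-section breakdown independent of the tqdm progress bars.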