Recently, I started working on stock price analysis in order to optimise my portfolio. I began with an Excel file and several VBA macros. This works quite well but is very slow, so I'm now trying to step up and set up a proper "stock prices" database on my server (based on this post).
The "stock_prices" database contains a "daily_price" table that stores the daily stock prices for some tickers. To update the "daily_price" table, a Python script runs every day; it includes the following Python / SQL statements:
df = pdr.get_data_yahoo(ticker, start_date)
for row in df.itertuples():
    values = [YAHOO_VENDOR_ID, ticker_index[ticker]] + list(row)
    cursor.execute("INSERT INTO daily_price (data_vendor_id, ticker_id, price_date, open_price, high_price, low_price, close_price, adj_close_price, volume) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)", tuple(values))
Unfortunately, the "cursor.execute(...)" line raises the following error: "AttributeError: 'Timestamp' object has no attribute 'translate'"
Printing the "values" list gives: [1, 2, Timestamp('2004-08-19 00:00:00'), 49.81328582763672, 51.83570861816406, 47.80083084106445, 49.9826545715332, 49.9826545715332, 44871300]
Based on what I read in another, similar post, I checked the dtype of the date index to make sure that it is not "object":
print(df.index.dtype)
This returns "datetime64[ns]" which seems good.
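A small sketch of why this check may not be enough: the index dtype describes how the dates are stored, but each value handed over by itertuples() is a pandas Timestamp object. The DataFrame below is a synthetic stand-in for the Yahoo data (its values are copied from the printout above), and the reading of the error is an assumption: the database driver presumably does not recognise the Timestamp type and tries to escape it as a string, which would explain the missing 'translate' attribute.

```python
import pandas as pd

# Synthetic one-row stand-in for the downloaded data (not fetched from Yahoo).
df = pd.DataFrame(
    {"Open": [49.81328582763672], "Volume": [44871300]},
    index=pd.DatetimeIndex(["2004-08-19"], name="Date"),
)

# The dtype check passes: the index holds datetimes, not generic objects.
print(df.index.dtype)

# But the first element of each row tuple is a pandas Timestamp,
# not a str or a plain datetime.date.
first_row = next(iter(df.itertuples()))
print(type(first_row[0]))
```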
Finally, in the database, I tried changing the column's data type from "Date" to "Datetime", but this doesn't solve the error.
Could anybody share some hints on how to resolve this error?
Best Regards,
Edit on 25/04/2020 : Final solution
df = pdr.get_data_yahoo(ticker, start_date)
# Turn the date index into a regular column and rename the columns to match the table
df = df.reset_index()
df.columns = ['price_date', 'open_price', 'high_price', 'low_price', 'close_price', 'adj_close_price', 'volume']
# Add the two key columns, then reorder to match the INSERT statement
df['data_vendor_id'] = YAHOO_VENDOR_ID
df['ticker_id'] = ticker_index[ticker]
df = df[['data_vendor_id','ticker_id','price_date', 'open_price', 'high_price', 'low_price', 'close_price', 'adj_close_price', 'volume']]
# Format the dates as strings so that no pandas Timestamp objects reach the cursor
df['price_date'] = df['price_date'].dt.strftime('%Y-%m-%d %H:%M:%S')
print(df)
# Insert all rows in one call instead of one execute() per row
cursor.executemany("INSERT INTO daily_price (data_vendor_id, ticker_id, price_date, open_price, high_price, low_price, close_price, adj_close_price, volume) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)", df.to_numpy().tolist())
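For comparison, here is a minimal sketch of an alternative that keeps the original row-by-row loop and converts only the date. It assumes the database driver accepts a standard datetime.datetime as a parameter, which I have not verified for every driver. The helper name to_db_row, the ticker id 2 and the one-row DataFrame are hypothetical and only there so the snippet runs without Yahoo or a database.

```python
import datetime as dt

import pandas as pd

YAHOO_VENDOR_ID = 1  # placeholder value, matching the printout in the question


def to_db_row(vendor_id, ticker_id, row):
    """Build one parameter tuple, swapping the pandas Timestamp for a datetime."""
    price_date = row[0].to_pydatetime()  # pandas Timestamp -> datetime.datetime
    return (vendor_id, ticker_id, price_date) + tuple(row[1:])


# Synthetic one-row stand-in for the downloaded prices.
df = pd.DataFrame(
    {
        "Open": [49.81328582763672],
        "High": [51.83570861816406],
        "Low": [47.80083084106445],
        "Close": [49.9826545715332],
        "Adj Close": [49.9826545715332],
        "Volume": [44871300],
    },
    index=pd.DatetimeIndex(["2004-08-19"], name="Date"),
)

# Each tuple could then be passed to cursor.execute(), or the whole list to executemany().
params = [to_db_row(YAHOO_VENDOR_ID, 2, row) for row in df.itertuples()]
print(params[0])
```

Compared with the strftime approach, this leaves the date as a date object, so any formatting is left to the driver.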