
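When I download data from yfinance, each ticker has 8 columns (Open, High, Low, etc.). Since I am downloading 15 tickers, I end up with 120 columns plus 1 index column (Date). The tickers are laid out side by side. See Figure 1.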

I want only 8 unique columns instead of that many columns spread over 2 levels, plus a new column that identifies the ticker. See Figure 2.

Figure 1: current table

Figure 1 as raw text:

    Adj Close   ... Volume
DANHOS13.MX FCFE18.MX   FHIPO14.MX  FIBRAHD15.MX    FIBRAMQ12.MX    FIBRAPL14.MX    FIHO12.MX   FINN13.MX   FMTY14.MX   FNOVA17.MX  ... FIBRAPL14.MX    FIHO12.MX   FINN13.MX   FMTY14.MX   FNOVA17.MX  FPLUS16.MX  FSHOP13.MX  FUNO11.MX   FVIA16.MX   TERRA13.MX
Date                                                                                    
2015-01-02  26.065336   NaN 18.526043   NaN 16.337654   18.520781   14.683501   11.301384   9.247743    NaN ... 338697  189552  148064  57  NaN NaN 212451  2649823 NaN 1111343
2015-01-05  24.670488   NaN 18.436762   NaN 15.857328   17.859756   13.795850   11.071105   9.209846    NaN ... 449555  364819  244594  19330   NaN NaN 491587  3317923 NaN 1255128

Figure 2: desired result

The code I used is:

import datetime as dt
import yfinance as yf

start = dt.datetime(2015,1,1)
end = dt.datetime.now()

df = yf.download("FUNO11.MX FIBRAMQ12.MX FIHO12.MX DANHOS13.MX FINN13.MX FSHOP13.MX TERRA13.MX FMTY14.MX FIBRAPL14.MX FHIPO14.MX FIBRAHD15.MX FPLUS16.MX FVIA16.MX FNOVA17.MX FCFE18.MX", 
                start = start,
                end = end,
                group_by = 'Ticker',
                actions = True)

2 Answers


I would download the data slightly differently:

import pandas as pd
import yfinance as yf
from datetime import datetime as dt

start = dt(2015,1,1)
end = dt.now()
symbols = ["FUNO11.MX", "FIBRAMQ12.MX", "FIHO12.MX", "DANHOS13.MX", "FINN13.MX", "FSHOP13.MX", "TERRA13.MX", "FMTY14.MX",
           "FIBRAPL14.MX", "FHIPO14.MX", "FIBRAHD15.MX", "FPLUS16.MX", "FVIA16.MX", "FNOVA17.MX", "FCFE18.MX"]

data = yf.download(symbols, start=start, end=end, actions=True)

Then, option 1:

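def reshaper(symb, dframe):
    # Flatten the two column levels and the date index into one long table.
    df = dframe.unstack().reset_index()
    df.columns = ['variable','symbol','Date','Value']
    # Keep one ticker and spread its variables back out into columns.
    df = df.loc[df.symbol==symb,['Date','variable','Value']].pivot_table(index='Date', columns='variable', values='Value').reset_index()
    df.columns.name = ''
    df['Ticker'] = symb
    return df


# DataFrame.append was removed in pandas 2.0, so concatenate the pieces instead.
h = pd.concat([reshaper(s, data) for s in symbols], ignore_index=True)

h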
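
As a possible alternative to the per-ticker helper above, the sketch below selects each ticker with a column cross-section and concatenates the pieces. It is not part of the original answer. The frame, the tickers "AAA.MX" and "BBB.MX", the values, and the helper name one_ticker are all made up for illustration, and the code assumes the ticker sits in the second column level, as in the question's raw output.

```python
import pandas as pd

# Hypothetical stand-in for the downloaded frame: first column level is the
# field, second is the ticker. All values are invented.
dates = pd.Index(pd.to_datetime(["2015-01-02", "2015-01-05"]), name="Date")
columns = pd.MultiIndex.from_product([["Close", "Volume"], ["AAA.MX", "BBB.MX"]])
data = pd.DataFrame(
    [[10.0, 20.0, 100.0, 200.0],
     [11.0, 21.0, 110.0, 210.0]],
    index=dates,
    columns=columns,
)


def one_ticker(symb, dframe):
    # Cross-section on the second column level keeps only this ticker's
    # fields and drops the ticker level from the columns.
    out = dframe.xs(symb, axis=1, level=1).reset_index()
    out["Ticker"] = symb
    return out


# Stack the per-ticker tables vertically into one long table.
h = pd.concat([one_ticker(s, data) for s in ["AAA.MX", "BBB.MX"]], ignore_index=True)
print(h)
```

Each piece has one column per field, so the concatenated table has a row per date and ticker.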


Option 2: as a one-liner, you can do this:

data.stack().reset_index().rename(columns={'level_1':'Ticker'})
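
To see what the one-liner does without a network call, here is a minimal sketch on a small hand-made frame. The tickers "AAA.MX" and "BBB.MX" and all values are invented. The column levels are assumed to be unnamed, which is why the stacked level appears as 'level_1' before the rename; if the levels are already named in your yfinance version, the rename simply has no effect.

```python
import pandas as pd

# Hypothetical stand-in for the downloaded frame: first column level is the
# field, second is the ticker. All values are invented.
dates = pd.Index(pd.to_datetime(["2015-01-02", "2015-01-05"]), name="Date")
columns = pd.MultiIndex.from_product([["Close", "Volume"], ["AAA.MX", "BBB.MX"]])
data = pd.DataFrame(
    [[10.0, 20.0, 100.0, 200.0],
     [11.0, 21.0, 110.0, 210.0]],
    index=dates,
    columns=columns,
)

# stack() moves the last column level (the ticker) into the row index;
# reset_index() then turns Date and the ticker level into ordinary columns.
long_df = data.stack().reset_index().rename(columns={"level_1": "Ticker"})
print(long_df)
```

The result has one row per date and ticker and one column per field.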


answered 2020-11-30T03:56:04.263

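A slightly simpler version first stacks both column index levels (metric and ticker) to get tidy long-format data, then unstacks the metric level, keeping date and ticker as the index: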

import yfinance as yf

symbols = ["FUNO11.MX", "FIBRAMQ12.MX", "FIHO12.MX", "DANHOS13.MX", 
           "FINN13.MX", "FSHOP13.MX", "TERRA13.MX", "FMTY14.MX",
           "FIBRAPL14.MX", "FHIPO14.MX", "FIBRAHD15.MX", "FPLUS16.MX", 
           "FVIA16.MX", "FNOVA17.MX", "FCFE18.MX"]

data = yf.download(symbols, start='2015-01-01', end='2020-11-15', actions=True)

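# Stack both column levels into the rows, then move the metric level back to the columns.
data_reshape = data.stack(level=[0, 1]).unstack(1)
# Name the ticker level of the row index.
data_reshape.index = data_reshape.index.set_names(['ticker'], level=[1])
data_reshape.head()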


                         Adj Close      Close  Dividends       High  \
Date       ticker                                                     
2015-01-02 DANHOS13.MX   26.065336  37.000000        0.0  37.400002   
           FHIPO14.MX    18.526043  24.900000        0.0  24.900000   
           FIBRAMQ12.MX  16.337654  24.490000        0.0  25.110001   
           FIBRAPL14.MX  18.520781  26.740801        0.0  27.118500   
           FIHO12.MX     14.683501  21.670000        0.0  22.190001   

                               Low       Open  Stock Splits     Volume  
Date       ticker                                                       
2015-01-02 DANHOS13.MX   36.330002  36.330002           0.0    82849.0  
           FHIPO14.MX    24.900000  24.900000           0.0    94007.0  
           FIBRAMQ12.MX  24.350000  24.990000           0.0  1172917.0  
           FIBRAPL14.MX  26.343100  26.750700           0.0   338697.0  
           FIHO12.MX     21.209999  22.120001           0.0   189552.0  
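
The same stack-then-unstack steps can be tried offline on a small hand-made frame. This is a minimal sketch: the tickers "AAA.MX" and "BBB.MX" and all values are invented, and the metric is assumed to be the first column level with the ticker as the second.

```python
import pandas as pd

# Hypothetical stand-in for the downloaded frame: first column level is the
# metric, second is the ticker. All values are invented.
dates = pd.Index(pd.to_datetime(["2015-01-02", "2015-01-05"]), name="Date")
columns = pd.MultiIndex.from_product([["Close", "Volume"], ["AAA.MX", "BBB.MX"]])
data = pd.DataFrame(
    [[10.0, 20.0, 100.0, 200.0],
     [11.0, 21.0, 110.0, 210.0]],
    index=dates,
    columns=columns,
)

# Stacking both column levels gives a long Series indexed by
# (Date, metric, ticker).
long_series = data.stack(level=[0, 1])

# Unstacking index level 1 (the metric) moves the metrics back into the
# columns, leaving (Date, ticker) as the row index.
tidy = long_series.unstack(1)
tidy.index = tidy.index.set_names("ticker", level=1)
print(tidy)
```

If the ticker should be an ordinary column rather than an index level, calling reset_index() on the result produces that layout.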
answered 2020-11-30T04:44:45.800