我的熊猫看起来像这样
Date Ticker Open High Low Adj Close Adj_Close Volume
2016-04-18 vws.co 445.0 449.2 441.7 447.3 447.3 945300
2016-04-19 vws.co 449.0 455.8 448.3 450.9 450.9 907700
2016-04-20 vws.co 451.0 452.5 435.4 436.6 436.6 1268100
2016-04-21 vws.co 440.1 442.9 428.4 435.5 435.5 1308300
2016-04-22 vws.co 435.5 435.5 435.5 435.5 435.5 0
2016-04-25 vws.co 431.0 436.7 424.4 430.0 430.0 1311700
2016-04-18 nflx 109.9 110.7 106.02 108.4 108.4 27001500
2016-04-19 nflx 99.49 101.37 94.2 94.34 94.34 55623900
2016-04-20 nflx 94.34 96.98 93.14 96.77 96.77 25633600
2016-04-21 nflx 97.31 97.38 94.78 94.98 94.98 19859400
2016-04-22 nflx 94.85 96.69 94.21 95.9 95.9 15786000
2016-04-25 nflx 95.7 95.75 92.8 93.56 93.56 14965500
我有一个程序可以在其中一个具有嵌入式功能的功能上成功运行 groupby。
这条线看起来像这样
df['MA3'] = df.groupby('Ticker').Adj_Close.transform(lambda group: pd.rolling_mean(group, window=3))
在此处查看我的初始问题和数据格式:
仅在同一 df 中的 df col 行中选择一个值来计算来自不同 val 的结果,并且一次仅在一个代码上计算 df
现在我意识到,与其在我有 5 个的每个嵌入式函数中执行 groupby,我宁愿让 groupby 在调用顶部函数的主程序中运行,这样所有嵌入式函数都可以在过滤后的 groupby pandas 上运行数据框只做一次 groupby ......
如何使用 groupby 应用我的主要功能,以过滤我的熊猫,所以我一次只处理一个代码(col 'Ticker' 中的值)?
“Ticker”列包含“aapl”、“msft”、“nflx”公司标识符等,以及时间窗口的时间序列数据。
非常感谢卡拉辛斯基。这接近我想要的。但我得到一个错误。
当我运行时:
def Screener(df_all, group):
# Copy df_all to df for single ticker operations
df = df_all.copy()
def diff_calc(df,ticker):
df['Difference'] = df['Adj_Close'].diff()
return df
df = diff_calc(df, ticker)
return df_all
for ticker in stocklist:
df_all[['Difference']] = df_all.groupby('Ticker').Adj_Close.apply(Screener, ticker)
我收到此错误:
Traceback (most recent call last):
File "<ipython-input-2-d7c1835f6b2a>", line 1, in <module>
runfile('C:/Users/Morten/Documents/Design/Python/CrystalBall - Local - Git/Git - CrystalBall/sandbox/screener_test simple for StockOverflowNestedFct_Getstock.py', wdir='C:/Users/Morten/Documents/Design/Python/CrystalBall - Local - Git/Git - CrystalBall/sandbox')
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
execfile(filename, namespace)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)
File "C:/Users/Morten/Documents/Design/Python/CrystalBall - Local - Git/Git - CrystalBall/sandbox/screener_test simple for StockOverflowNestedFct_Getstock.py", line 144, in <module>
df_all[['Difference']] = df_all.groupby('Ticker').Adj_Close.apply(Screener, ticker)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\groupby.py", line 663, in apply
return self._python_apply_general(f)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\groupby.py", line 667, in _python_apply_general
self.axis)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\groupby.py", line 1286, in apply
res = f(group)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\groupby.py", line 659, in f
return func(g, *args, **kwargs)
File "C:/Users/Morten/Documents/Design/Python/CrystalBall - Local - Git/Git - CrystalBall/sandbox/screener_test simple for StockOverflowNestedFct_Getstock.py", line 112, in Screener
df = diff_calc(df, ticker)
File "C:/Users/Morten/Documents/Design/Python/CrystalBall - Local - Git/Git - CrystalBall/sandbox/screener_test simple for StockOverflowNestedFct_Getstock.py", line 70, in diff_calc
df['Difference'] = df['Adj_Close'].diff()
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\series.py", line 514, in __getitem__
result = self.index.get_value(self, key)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\tseries\index.py", line 1221, in get_value
raise KeyError(key)
KeyError: 'Adj_Close'
当我像这样使用 functools
df_all = functools.partial(df_all.groupby('Ticker').Adj_Close.apply(Screener, ticker))
我得到与上述相同的错误...
Traceback (most recent call last):
File "<ipython-input-5-d7c1835f6b2a>", line 1, in <module>
runfile('C:/Users/Morten/Documents/Design/Python/CrystalBall - Local - Git/Git - CrystalBall/sandbox/screener_test simple for StockOverflowNestedFct_Getstock.py', wdir='C:/Users/Morten/Documents/Design/Python/CrystalBall - Local - Git/Git - CrystalBall/sandbox')
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
execfile(filename, namespace)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)
File "C:/Users/Morten/Documents/Design/Python/CrystalBall - Local - Git/Git - CrystalBall/sandbox/screener_test simple for StockOverflowNestedFct_Getstock.py", line 148, in <module>
df_all = functools.partial(df_all.groupby('Ticker').Adj_Close.apply(Screener, [ticker]))
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\groupby.py", line 663, in apply
return self._python_apply_general(f)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\groupby.py", line 667, in _python_apply_general
self.axis)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\groupby.py", line 1286, in apply
res = f(group)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\groupby.py", line 659, in f
return func(g, *args, **kwargs)
File "C:/Users/Morten/Documents/Design/Python/CrystalBall - Local - Git/Git - CrystalBall/sandbox/screener_test simple for StockOverflowNestedFct_Getstock.py", line 114, in Screener
df = diff_calc(df, ticker)
File "C:/Users/Morten/Documents/Design/Python/CrystalBall - Local - Git/Git - CrystalBall/sandbox/screener_test simple for StockOverflowNestedFct_Getstock.py", line 72, in diff_calc
df['Difference'] = df['Adj_Close'].diff()
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\series.py", line 514, in __getitem__
result = self.index.get_value(self, key)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-
3.3.5.amd64\lib\site-packages\pandas\tseries\index.py", line 1221, in get_value
raise KeyError(key)
KeyError: 'Adj_Close'
编辑自卡拉辛斯基 31/5 的编辑。
当我运行 Karasinski 的最后一个建议时,我收到了这个错误。
mmm
mmm
nflx
vws.co
Traceback (most recent call last):
File "<ipython-input-4-d7c1835f6b2a>", line 1, in <module>
runfile('C:/Users/Morten/Documents/Design/Python/CrystalBall - Local - Git/Git - CrystalBall/sandbox/screener_test simple for StockOverflowNestedFct_Getstock.py', wdir='C:/Users/Morten/Documents/Design/Python/CrystalBall - Local - Git/Git - CrystalBall/sandbox')
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
execfile(filename, namespace)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)
File "C:/Users/Morten/Documents/Design/Python/CrystalBall - Local - Git/Git - CrystalBall/sandbox/screener_test simple for StockOverflowNestedFct_Getstock.py", line 173, in <module>
df_all[['mean', 'max', 'median', 'min']] = df_all.groupby('Ticker').apply(group_func)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\groupby.py", line 663, in apply
return self._python_apply_general(f)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\groupby.py", line 670, in _python_apply_general
not_indexed_same=mutated)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\groupby.py", line 2785, in _wrap_applied_output
not_indexed_same=not_indexed_same)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\groupby.py", line 1142, in _concat_objects
result = result.reindex_axis(ax, axis=self.axis)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\frame.py", line 2508, in reindex_axis
fill_value=fill_value)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\generic.py", line 1841, in reindex_axis
{axis: [new_index, indexer]}, fill_value=fill_value, copy=copy)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\generic.py", line 1865, in _reindex_with_indexers
copy=copy)
File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\pandas\core\internals.py", line 3144, in reindex_indexer
raise ValueError("cannot reindex from a duplicate axis")
ValueError: cannot reindex from a duplicate axis