python - Migrating corkscrew calculations from an Excel financial model to Python Pandas

Question

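I am working on replacing an Excel financial model with Python Pandas. By financial model I mean forecasting a business's cash flow, profit and loss statement and balance sheet over time, rather than pricing swaps or options or working with stock price data, which are also referred to as financial models. Quite possibly the same concepts and issues apply to the latter types; I just don't know them well enough to comment.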

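So far I like a lot of what I see. The models I work with in Excel have a common time series across the top of the page, defining the time periods we are interested in forecasting. Calculations then run down the page as a series of rows. Each row is therefore a TimeSeries object, or alternatively a collection of rows becomes a DataFrame. Obviously you need to transpose to read between these two constructs, but that is a trivial transformation.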

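Better yet, each Excel row should have a single, common formula and depend only on rows above it on the page. This lends itself to vector operations, which are computationally fast and simple to write with Pandas.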
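As a small illustration of that row-wise style, here is a sketch in which each row is a Series on a shared timeline and one formula is applied to every period at once. The row names `volume`, `price` and `revenue` and all the numbers are made up for this example and are not taken from any real model.

```python
import pandas as pd

# Common timeline across the top of the "page"
dates = pd.date_range('2013-04-01', periods=4, freq='MS')

# Hypothetical input rows; each row is a Series on the shared timeline
volume = pd.Series([100, 110, 120, 130], index=dates)
price = pd.Series([2.0, 2.0, 2.5, 2.5], index=dates)

# One formula per row, applied to every period at once
revenue = volume * price

# Collect the rows into a DataFrame (dates as the index), then transpose
# to get the Excel-style layout with time running across the top
model = pd.DataFrame({'volume': volume, 'price': price, 'revenue': revenue})
sheet = model.T
```

Here `model` has one column per row of the spreadsheet, and `sheet` is the transposed view that matches the Excel layout.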

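The problem I run into is when I try to model a corkscrew-type calculation. These are typically used to model accounting balances, where the opening balance for one period is the closing balance of the previous period. You can't use a .shift() operation, because the closing balance in a given period depends, amongst other things, on the opening balance in the same period. This is probably best illustrated with an example: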

Time              2013-04-01   2013-05-01   2013-06-01   2013-07-01   ...

Opening Balance            0           +3           -2          -10     
[...]
Some Operations           +3           -5           -8          +20
[...]
Closing Balance           +3           -2          -10          +10

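In pseudo-code, my solution for calculating these sorts of things is as follows. It is not a vectorised solution and it looks like it is pretty slow: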

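 import pandas as pd

 # Set up date range
 dates = pd.date_range('2012-04-01', periods=500, freq='MS')

 # Placeholder inputs (assumed values; the original post does not define them)
 inp = {'ob': 0}      # initial opening balance
 obDate = dates[0]    # date on which the initial opening balance applies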

 # Initialise empty lists
 lOB = []
 lSomeOp1 = []
 lSomeOp2 = []
 lCB = []

 # Set the closing balance for the initial loop's OB
 sCB = 0

 # As this is a corkscrew calculation will need to loop through all dates
 for d in dates:

     # Opening balance is either the initial opening balance if at the
     # initial date or else the last closing balance from the prior
     # period (d is already a Timestamp, so it can be compared directly)
     sOB = inp['ob'] if (d == obDate) else sCB

     # Calculate some additions, write-offs, amortisation, depreciation, whatever!
     sSomeOp1 = 10
     sSomeOp2 = -sOB / 2

     # Calculate the closing balance
     sCB = sOB + sSomeOp1 + sSomeOp2

     # Build up list of outputs
     lOB.append(sOB)
     lSomeOp1.append(sSomeOp1)
     lSomeOp2.append(sSomeOp2)
     lCB.append(sCB)

 # Convert lists to timeseries objects
 ob = pd.Series(lOB, index=dates)
 someOp1 = pd.Series(lSomeOp1, index=dates)
 someOp2 = pd.Series(lSomeOp2, index=dates)
 cb = pd.Series(lCB, index=dates)

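I can see that, where you only have one or two rows of operations, there might be some clever hacks to vectorise the computation, and I'd be grateful for any tips people have on these sorts of tricks.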
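For the specific recurrence in the code above, one such trick might be the closed form of a linear recurrence. There the closing balance is CB = OB + 10 - OB/2, so the next opening balance is OB[t+1] = a*OB[t] + b with a = 0.5 and b = 10, which gives OB[t] = a**t * OB[0] + b * (1 - a**t) / (1 - a) for a != 1. The sketch below assumes exactly that shape (one constant operation, one operation proportional to the opening balance) and an initial opening balance of 0; it does not generalise to arbitrary operations.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('2012-04-01', periods=12, freq='MS')
ob0 = 0.0          # initial opening balance (assumed for this sketch)
a, b = 0.5, 10.0   # CB = OB + 10 - OB/2  is the same as  CB = a*OB + b

# Period counter 0, 1, 2, ... along the timeline
t = np.arange(len(dates))

# Closed form of OB[t+1] = a*OB[t] + b, valid only for a != 1
ob = pd.Series(a**t * ob0 + b * (1 - a**t) / (1 - a), index=dates)

# The remaining rows follow from the opening balance with plain vector operations
someOp1 = pd.Series(b, index=dates)
someOp2 = -ob / 2
cb = ob + someOp1 + someOp2
```

The resulting `ob`, `someOp1`, `someOp2` and `cb` are meant to match what the loop produces for the same inputs, without iterating over the dates.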

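However, some of the corkscrews I have to build have 100 intermediate operations. In these cases, what's my best way forward? Is it to accept the slow performance of Python? Should I migrate to Cython? I've not really looked into it (so I could be way off base), but the issue with the latter approach is that if I'm moving 100 lines into C, why am I bothering with Python in the first place? It doesn't feel like a simple lift and shift.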
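Before reaching for Cython, one option that may be worth timing is to keep the loop in pure Python, run it over plain floats, and build the DataFrame once at the end instead of maintaining one list per row. The helper `corkscrew` and its `step` callback are hypothetical names invented for this sketch, and whether this is fast enough for a model with many intermediate operations is something that would need to be measured.

```python
import pandas as pd

def corkscrew(dates, opening, step):
    """Run a corkscrew over `dates`.

    `step(ob)` is a hypothetical callback that returns a dict mapping
    operation names to their values for one period, given that period's
    opening balance.
    """
    rows = []
    ob = opening
    for _ in dates:
        ops = step(ob)
        # Closing balance = opening balance + all operations in the period
        cb = ob + sum(ops.values())
        rows.append({'ob': ob, **ops, 'cb': cb})
        # This period's closing balance is next period's opening balance
        ob = cb
    # Build the DataFrame once, after the loop has finished
    return pd.DataFrame(rows, index=dates)

dates = pd.date_range('2012-04-01', periods=4, freq='MS')
result = corkscrew(dates, 0.0, lambda ob: {'someOp1': 10.0, 'someOp2': -ob / 2})
```

With this layout, adding another intermediate operation means adding one more key to the dict returned by `step`, and the loop itself stays unchanged.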

score 0 · Accepted Answer

The following makes the updates in place, which should improve performance:

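import pandas as pd
import numpy as np
# One row per period: opening balance (ob), some operations (so), closing balance (cb)
book = pd.DataFrame([[0, 3, np.nan], [np.nan, -5, np.nan], [np.nan, -8, np.nan], [np.nan, +20, np.nan]],
                    columns=['ob', 'so', 'cb'], index=['2013-04-01', '2013-05-01', '2013-06-01', '2013-07-01'])
for i, row in enumerate(book.index[:-1]):
    # Closing balance = opening balance + operations for the same period
    book.loc[row, 'cb'] = book.loc[row, ['ob', 'so']].sum()
    # Carry the closing balance forward as the next period's opening balance
    book.loc[book.index[i + 1], 'ob'] = book.loc[row, 'cb']
# The last period has no following row, so only its closing balance is set
book.loc[book.index[-1], 'cb'] = book.loc[book.index[-1], ['ob', 'so']].sum()
book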
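In this particular example the operations row does not depend on the opening balance, so the loop can arguably be avoided altogether: the closing balance is the initial opening balance plus a running total of the operations, and the opening balance is the closing balance shifted by one period. A minimal sketch under that assumption follows. It does not cover the case raised in the question, where an operation depends on the same period's opening balance, and the name `initial_ob` is introduced here for illustration.

```python
import pandas as pd

index = ['2013-04-01', '2013-05-01', '2013-06-01', '2013-07-01']
initial_ob = 0   # opening balance of the first period
so = pd.Series([3, -5, -8, 20], index=index)

# Closing balance: initial opening balance plus the running total of operations
cb = initial_ob + so.cumsum()

# Opening balance: previous period's closing balance, seeded with initial_ob
ob = cb.shift(1, fill_value=initial_ob)

book = pd.DataFrame({'ob': ob, 'so': so, 'cb': cb})
```

This gives the same `ob`, `so` and `cb` columns as the loop, using the figures from the table in the question.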
