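Edit: I realized that the way I originally described the problem was not very helpful and did not really model what I am doing, so I have rewritten this post and included my working code. The example uses the Lending Club loan data (CSV format), which can be downloaded here: https://www.kaggle.com/wendykan/lending-club-loan-data
I am using PuLP and pandas in Python to solve an optimization problem. I am new to linear programming and have only recently used it for cost-minimization problems.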
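For this Lending Club example, suppose I have $100,000. Given that Lending Club loans are securitized and can be invested in by individuals, I want to allocate this $100,000 so that:
1. The weighted average interest rate of the $100,000 invested is no lower than 8%.
2. The weighted average debt-to-income ratio of the $100,000 invested does not exceed 10.
Ignore for now: (3. To minimize exposure to any single loan, the average loan size of the optimized portfolio must not exceed $10,000.)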
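I am having trouble implementing constraint 3 and the $100,000 investment cap. For constraints 1 and 2, I weighted the loans by loan size and multiplied the interest rates and debt-to-income ratios by those weights, so that the constraints can be modeled linearly:
Sum of weighted interest rates >= 0.08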
Sum of weighted debt-income ratios <= 10
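To make the weighting step concrete, here is a minimal sketch on made-up numbers (the three loans are hypothetical, not taken from the Lending Club file, and plain Python lists stand in for the DataFrame columns). It shows what the weighted values contain: each loan's rate multiplied by its share of the total funded amount. It also compares the sum of those weighted values over a subset of loans with that subset's own weighted average, since the weights are computed against the full pool.

```python
# Hypothetical toy loans: (funded amount in dollars, interest rate as a ratio)
loans = [(5000, 0.10), (10000, 0.07), (25000, 0.12)]

total_funded = sum(amount for amount, _ in loans)

# Weight of each loan = its share of the total funded amount of the pool
weights = [amount / total_funded for amount, _ in loans]

# Weighted rate of each loan = rate * weight
weighted_rates = [rate * w for (_, rate), w in zip(loans, weights)]

# Summed over ALL loans, the weighted rates give the pool's weighted average rate
pool_average_rate = sum(weighted_rates)

# Summed over a SUBSET, the same values are still scaled by the full-pool total,
# so they are compared here with the subset's own weighted average
subset = [0, 1]
subset_sum_of_weighted = sum(weighted_rates[i] for i in subset)
subset_true_average = (
    sum(loans[i][0] * loans[i][1] for i in subset)
    / sum(loans[i][0] for i in subset)
)

print("pool weighted average rate:      ", round(pool_average_rate, 4))
print("subset sum of weighted rates:    ", round(subset_sum_of_weighted, 4))
print("subset's own weighted average:   ", round(subset_true_average, 4))
```

On these toy numbers the last two values differ, which may be worth keeping in mind when comparing a sum of pre-weighted values against a fixed threshold such as 0.08.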
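The final portfolio of purchased loans does not need to total exactly $100,000. The objective is to get as close to $100,000 as possible within the constraints (i.e. LpMaximize).
I chose to model loan selection as binary variables because I only want to know whether each loan is in or out. Also, for speed and memory, I work with a slice of 50 rows from the data.
Here is my code: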
import pandas as pd
from pulp import *
import numpy as np
df = pd.read_csv('~/Desktop/loan.csv')
df.columns = [c.replace(' ','') for c in df.columns]
#cleaning up the data to get rid of spaces and to standardise percentage data
df.loc[:,'id'] = df.loc[:,'id'].astype(int)
df['int_rate'] = df['int_rate']/100 #convert interest rate to ratios.
#slicing the data to get a small sample of 50 loans
df = df.iloc[0:50,:]
#setting up the weighted averages for linear equations
sumloans = df.loc[:,'funded_amnt'].sum()
df['weights'] = df['funded_amnt'].div(sumloans,axis='index')
#Converting dataframe to weighted values!
df2 = df[["id","funded_amnt","dti","int_rate"]]
df2[["funded_amntwtd","dtiwtd","int_ratewtd"]] = df[["funded_amnt","dti","int_rate"]].multiply(df["weights"],axis="index")
df3 = pd.merge(df,df2.iloc[:,[4,5,6]],on=df["id"],how='left')
#Free up memory
df = None
df2 = None
#Variable construction
loanid = df3['id'].tolist()
dtiwtd = df3.set_index('id').to_dict()['dtiwtd']
loanmix = df3.set_index('id').to_dict()['funded_amnt']
wtdloanmix = df3.set_index('id').to_dict()['funded_amntwtd']
wa_int = df3.set_index('id').to_dict()['int_ratewtd']
id_vars = LpVariable.dicts("ID",indexs=loanid, cat='Integer',lowBound=0, upBound=1)
#Objective function added first. Summing all the loan values but examining constraints
prob = LpProblem("Funding",pulp.LpMaximize)
prob += lpSum([loanmix[i]*id_vars[i] for i in id_vars])
prob += lpSum([loanmix[i]*id_vars[i] for i in id_vars]) <= 100000 #"Sum of loans purchased must be equal to or less than $100,000"
prob += lpSum([dtiwtd[i]*id_vars[i] for i in id_vars]) <= 10 #"Sum of weighted dtis cannot be greater than 10"
prob += lpSum([wa_int[i]*id_vars[i] for i in id_vars]) >= 0.08 #"Sum of weighted interest rates cannot be less than 8%"
#Placeholder for inserting constraint on avg. loan size
prob.solve()
print("Status:", pulp.LpStatus[prob.status])
for v in prob.variables():
    print(v.name, "=", v.varValue)
print("Total amount invested = ", value(prob.objective))
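The solution status comes back as "Infeasible", and the output contains some non-binary integers.
I would appreciate any help with this. I am new to linear algebra and higher mathematics, but I have read this page (http://lpsolve.sourceforge.net/5.1/ratio.htm), which helped me set up the first two constraints. I am just stuck on how to write an equation or code that ensures the average loan value of the optimized portfolio stays below $10,000.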
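For reference, the ratio technique described on the lp_solve page can in principle be applied to the average loan size as well. The sketch below is an illustration under stated assumptions, not something run against the real data: the loan amounts are made up, `MAX_AVG` is a hypothetical name, and the check uses brute-force enumeration in plain Python rather than PuLP. It checks that, for every non-empty selection, "average size <= 10,000" agrees with the linear condition "sum of (amount - 10,000) over the selected loans <= 0".

```python
from itertools import product

# Hypothetical loan amounts in dollars (not from the Lending Club file)
amounts = [4000, 8000, 12000, 20000, 9500]
MAX_AVG = 10000  # assumed cap on the average loan size


def ratio_ok(selection):
    """Ratio form: average size of the selected loans is at most MAX_AVG."""
    chosen = [a for a, x in zip(amounts, selection) if x == 1]
    return sum(chosen) / len(chosen) <= MAX_AVG


def linear_ok(selection):
    """Linear form: sum((a_i - MAX_AVG) * x_i) <= 0."""
    return sum((a - MAX_AVG) * x for a, x in zip(amounts, selection)) <= 0


# Enumerate every 0/1 selection except the empty one (its average is undefined)
selections = [s for s in product([0, 1], repeat=len(amounts)) if any(s)]
agree = all(ratio_ok(s) == linear_ok(s) for s in selections)
print("Forms agree on all", len(selections), "non-empty selections:", agree)

# With the names used in the code above (loanmix, id_vars), the linear form
# could be added to the model along these lines:
# prob += lpSum([(loanmix[i] - 10000) * id_vars[i] for i in id_vars]) <= 0
```

The empty selection satisfies the linear form trivially, so it is worth confirming that the solver actually selects at least one loan.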