I am using R to compute the tangency portfolio for all the stocks in the S&P 500.

The list of equities is loaded with a Python script:

import urllib2
import pytz
import pandas as pd
import numpy as np

from bs4 import BeautifulSoup
from datetime import datetime
from pandas.io.data import DataReader


SITE = "http://en.wikipedia.org/wiki/List_of_S%26P_500_companies"
START = datetime(1900, 1, 1, 0, 0, 0, 0, pytz.utc)
END = datetime.today().utcnow()


def scrape_list(site):
    hdr = {'User-Agent': 'Mozilla/5.0'}
    req = urllib2.Request(site, headers=hdr)
    page = urllib2.urlopen(req)
    soup = BeautifulSoup(page)

    table = soup.find('table', {'class': 'wikitable sortable'})
    sector_tickers = dict()
    for row in table.findAll('tr'):
        col = row.findAll('td')
        if len(col) > 0:
            sector = str(col[3].string.strip()).lower().replace(' ', '_')
            ticker = str(col[0].string.strip())
            if sector not in sector_tickers:
                sector_tickers[sector] = list()
            sector_tickers[sector].append(ticker)
    return sector_tickers



# export sp500 in a list    
def get_sp500_all():
    sector_tickers = scrape_list(SITE)
    a = sector_tickers.values()
    b = a[0]
    for i in range(1, len(a)):
        b.extend(a[i])
    return b

if __name__ == '__main__':
    all_symbols = get_sp500_all()
    all_symbols.sort()
    print len(all_symbols)


np.savetxt('all_symbols.csv',all_symbols, delimiter=',',fmt="%s")
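
As a side note on `get_sp500_all` above: it indexes `dict.values()` and extends the first sector's list in place, which works on Python 2 only and modifies the scraped dictionary. A minimal sketch of a non-mutating flatten that also runs on Python 3, assuming the same dict-of-lists shape; the helper name `flatten_tickers` and the sample tickers are illustrative and not part of the original script:

```python
from itertools import chain


def flatten_tickers(sector_tickers):
    """Return one sorted list of all tickers without modifying the input dict."""
    return sorted(chain.from_iterable(sector_tickers.values()))


# Hypothetical sample data standing in for the scraped Wikipedia table.
sample = {
    "information_technology": ["MSFT", "AAPL"],
    "energy": ["XOM"],
}
all_symbols = flatten_tickers(sample)
print(len(all_symbols))
```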

The CSV containing all the tickers is then loaded into R, where the tangency portfolio is computed:

library(PerformanceAnalytics)
library(zoo)
library(tseries)

# load all the equities in the S&P 500
# (file generated by the Python script);
# the result needs to be transposed with t()
equity_list = t(read.csv("all_symbols.csv", header = FALSE, sep=","))

# start_date & end_date
start_date = "2012-10-30"
end_date = "2014-10-30"

# create a blank zoo type var
all_prices = zoo()

# download equity data from yahoo
for (i in 1:length(equity_list)){
  a = get.hist.quote(instrument = equity_list[i], start = start_date, end = end_date, 
                      quote = "AdjClose", provider = "yahoo", origin = "1970-01-01", 
                      compression = "m", retclass = "zoo")
  index(a) = as.yearmon(index(a))
  if (i == 1) {
    all_prices = a
  } else {
    all_prices = merge(all_prices, a)
  }
}

colnames(all_prices) = equity_list

# Calculate cc returns as difference in log prices
returns_df = diff(log(all_prices))

# remove the columns whose first return is NA
for (i in equity_list){
  if(is.na(returns_df[1, i])){returns_df = returns_df[, colnames(returns_df)!=i]}
}

# forward-fill the NAs first, then backward-fill the remaining ones;
# symbols with long runs of NA should be dropped instead (to be fixed later)
returns_df = na.locf(returns_df)
returns_df = na.locf(returns_df, fromLast = TRUE)

####################################################################################
# Parameters CER model
mu_hat_month = apply(returns_df, 2, mean)
sigma2_month = apply(returns_df, 2, var)
sigma_month = apply(returns_df, 2, sd)
cov_mat_month = var(returns_df)
cor_mat_month = cor(returns_df)

####################################################################################
#tangency portfolio
rf = 0.00001
mu2 = mu_hat_month - rf
tangency_portfolio = solve(cov_mat_month, mu2)

But it always fails with an error:

Error in solve.default(cov(returns_df), mu2) : 
  system is computationally singular: reciprocal condition number = 1.59968e-21

I can also use the function from tangency.portfolio.r:

##Tangency portfolio

# download tangency.portfolio.r
# https://r-forge.r-project.org/scm/viewvc.php/pkg/IntroCompFinR/R/tangency.portfolio.R?view=markup&root=introcompfinr
source("D:\\MOOC\\compfinance\\excise\\tangency.portfolio.r")

#The tangency portfolio

# risk free rate
t_bill_rate = 0.00001
# Tangency portfolio short sales allowed
tangency_portfolio_short = tangency.portfolio(mu_hat_month, cov_mat_month, risk.free=t_bill_rate, shorts=TRUE)

That fails with an error as well:

 Error in chol.default(cov.mat) : 
  the leading minor of order 25 is not positive definite 

It seems that something is wrong with the data in returns_df, but I can't tell what. Can anyone help?

2 Answers

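Error: system is computationally singular => this means the matrix you are passing to solve() (here, the covariance matrix) is not invertible.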
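
One likely cause, though it cannot be confirmed without the actual data: two years of monthly prices give only about 24 return observations for several hundred stocks, and a sample covariance matrix estimated from T observations has rank at most T - 1. With far more columns than rows it is therefore singular by construction. A small sketch with made-up numbers (4 observations of 6 assets), using exact `Fraction` arithmetic so the rank is not blurred by rounding; the helper names `sample_cov` and `rank` are illustrative:

```python
from fractions import Fraction


def sample_cov(rows):
    """Sample covariance matrix (divisor T - 1) of a T x N list of observations."""
    t, n = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / t for j in range(n)]
    centered = [[r[j] - means[j] for j in range(n)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (t - 1) for j in range(n)]
            for i in range(n)]


def rank(matrix):
    """Rank via Gaussian elimination in exact arithmetic."""
    m = [row[:] for row in matrix]
    rk = 0
    for col in range(len(m[0])):
        # Find a row at or below position rk with a non-zero entry in this column.
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


# Made-up returns in percent: T = 4 observations (rows), N = 6 assets (columns).
returns = [[Fraction(x) for x in row] for row in [
    [1, 2, 0, -1, 3, 1],
    [0, 1, 2, 1, -2, 0],
    [2, -1, 1, 0, 1, 4],
    [-1, 0, 3, 2, 0, -2],
]]
cov = sample_cov(returns)
print(len(cov), rank(cov))  # prints: 6 3
```

The 6 x 6 covariance matrix has rank 3 = T - 1, so it cannot be inverted. The usual remedies are more observations than assets (a longer history or a higher sampling frequency), fewer assets, or a regularised covariance estimate.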

answered 2014-10-31T11:59:23.567
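Use ginv() (from the MASS package) instead of solve(); it computes a generalized inverse, so it also works when the matrix is not invertible.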
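
A pseudo-inverse makes the computation run, but it only returns the minimum-norm least-squares solution; it does not make a rank-deficient covariance estimate any more reliable. A rough sketch of the idea in Python, using `numpy.linalg.pinv` as a stand-in for `ginv()` on synthetic data. The return series, the dimensions, the `rcond` cutoff and the final normalisation to weights that sum to 1 are all assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic monthly returns: 24 observations of 50 assets (more assets than rows).
returns = rng.normal(loc=0.01, scale=0.05, size=(24, 50))
rf = 0.00001  # risk-free rate, same value as in the question

mu = returns.mean(axis=0)
cov = np.cov(returns, rowvar=False)  # 50 x 50, rank at most 23
excess = mu - rf

# np.linalg.solve(cov, excess) would fail or be numerically meaningless here;
# the pseudo-inverse returns the minimum-norm least-squares solution instead.
raw = np.linalg.pinv(cov, rcond=1e-10) @ excess
weights = raw / raw.sum()  # normalise so the weights sum to 1

print(np.linalg.matrix_rank(cov))
print(weights.sum())
```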

answered 2015-01-21T06:14:30.153