
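The R package "termstrc", designed for term structure estimation, is a very useful tool, but it requires the data to be set up in a particularly awkward format: lists within lists.

Question: what is the best way to prepare and shape the data, either outside or inside R, to create the repeated sublist format required to run the function "dyncouponbonds"?

The "dyncouponbonds" command requires the data to be set up in repeated sublists. A list of bonds with their time-invariant characteristics (call it the "bond list") has some time-t characteristics of those bonds (price and accrued interest) attached to it, and the whole thing is replicated for times t+1 through T.

Below is an example of the list format for one period. The "dyncouponbonds" command requires this format to be replicated for all T periods inside one umbrella list. ISIN, MATURITYDATE, ISSUEDATE and COUPONRATE are the same in every period. PRICE, ACCRUED, CASHFLOWS and TODAY differ from period to period.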

R> str(govbonds$GERMANY)

List of 8
$ ISIN : chr [1:52] "DE0001141414" "DE0001137131" "DE0001141422" ...
$ MATURITYDATE:Class 'Date' num [1:52] 13924 13952 13980 14043 ...
$ ISSUEDATE :Class 'Date' num [1:52] 11913 13215 12153 13298 ...
$ COUPONRATE : num [1:52] 0.0425 0.03 0.03 0.0325 ...
$ PRICE : num [1:52] 100 99.9 99.8 99.8 ...
$ ACCRUED : num [1:52] 4.09 2.66 2.43 2.07 ...
$ CASHFLOWS :List of 3
..$ ISIN: chr [1:384] "DE0001141414" "DE0001137131" "DE0001141422" ...
..$ CF : num [1:384] 104 103 103 103 ...
..$ DATE:Class 'Date' num [1:384] 13924 13952 13980 14043 ...
$ TODAY :Class 'Date' num 13908

2 Answers


This is a fairly advanced data manipulation question. R has many powerful data manipulation tools, and you will not need to leave R to prepare the (admittedly fairly obscure) dyncouponbonds object. In fact you shouldn't: building the structure in another language and then converting it into a dyncouponbonds object would simply be more work.

The first thing I would make sure of is that you are very familiar with the lapply function, because you will use it a lot. You will use it to create a list of couponbonds objects, which is what a dyncouponbonds object actually is. Creating the couponbonds objects is a little tougher, mainly because of the CASHFLOWS sublist, which wants each cash flow associated with the bond's ISIN and with the date of the cash flow. For this you will use lapply and some fairly advanced subscripting. The subset function will also come in handy.
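
As a minimal sketch of that pattern, here is a toy example in base R. The data frame cf, its column names and its values are assumptions made up for illustration, not Bloomberg output. lapply with subset builds one piece per bond, and the pieces are then concatenated into the three parallel vectors that a CASHFLOWS-style sublist holds:

```r
# Toy cash-flow table: one row per cash flow (hypothetical data and column names)
cf <- data.frame(
  isin   = c("AAA", "AAA", "BBB", "BBB", "BBB"),
  date   = as.Date(c("2013-06-30", "2014-06-30",
                     "2013-03-31", "2014-03-31", "2015-03-31")),
  amount = c(4, 104, 3, 3, 103),
  stringsAsFactors = FALSE
)
today <- as.Date("2013-04-15")

# One list element per bond, keeping only cash flows on or after 'today'
perBond <- lapply(unique(cf$isin), function(id) {
  subset(cf, isin == id & date >= today)
})

# Concatenate into the parallel vectors of a CASHFLOWS-style sublist.
# do.call(c, ...) is used for the dates because it keeps the Date class.
CASHFLOWS <- list(
  ISIN = unlist(lapply(perBond, function(x) x$isin)),
  CF   = unlist(lapply(perBond, function(x) x$amount)),
  DATE = do.call(c, lapply(perBond, function(x) x$date))
)
str(CASHFLOWS)
```

The BBB cash flow dated before today is dropped, so the three vectors end up with four entries each.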

This question also depends very much on where you will be getting the data from. Getting it out of Bloomberg is non-trivial, mainly because you need to go back in history, using the BDS function and the "DES_CASH_FLOW" field for each bond to get its cash flows (the code below requests the "DES_CASH_FLOW_ADJ" variant). I say history because, if you're using dyncouponbonds, I assume you want to do historic yield curve analysis. You'll need to override the BDS function's "SETTLE_DT" field with the value returned for the bond by the BDP function and the field "FIRST_SETTLE_DT", so that you get all the cash flows from the beginning of the bond's life (otherwise it only returns cash flows from today onwards, which is no good for historic analysis). But I digress. If you're not using Bloomberg, I don't know where you'll get this data from.

You'll then need the static data for each bond, namely the maturity, the ISIN, the coupon rate and the issue date, plus historic price and accrued interest data. Again, if using Bloomberg, you'll use the BDP function for the static data, with the fields you'll see in the code below, and the historic data function BDH, which I have wrapped as bbdh. Assuming again that you're a Bloomberg user, here is the code:

bbGetCountry <- function(cCode, up = FALSE) {
# this function is going to get all the data out of bloomberg that we need for a
# country, and update it if necessary
    if (up == TRUE) startDate <- as.Date("2012-01-01") else startDate <- histStartDate 
    # first get all the curve members for history
    wdays <- wdaylist(startDate, Sys.Date()) # create the list of working days from startdate
    actives <- lapply(wdays, function(x) { 
        bds(conn, BBcurveIDs[cCode], "CURVE_MEMBERS", override_fields = "CURVE_DATE",
        override_values = format(x, "%Y%m%d"))
    })
    names(actives) <- wdays
    uniqueActives <- unique(unlist(actives)) # there will be puhlenty duplicates. Get rid of them
    # now get the unchanging bond data
    staticData <- bdp(conn, uniqueActives, bbStaticDataFields)
    # now get the cash flow data
    cfData <- lapply(uniqueActives, function(x) {
        bds(conn, x, "DES_CASH_FLOW_ADJ", override_fields = "SETTLE_DT", 
            override_values = format(as.Date(staticData[x, "FIRST_SETTLE_DT"]), "%Y%m%d"))
    })
    names(cfData) <- uniqueActives
    # now for historic data
    historicData <- lapply(bbHistoricDataFields, function(x) bbdh(uniqueActives, flds = x, startDate = startDate))
    names(historicData) <- bbHistoricDataFields   # put the names in otherwise we get a numbered list
    allDates <- as.Date(index(historicData$LAST_PRICE)) # all the dates we will find settlement dates for for all bonds. No posix
    save(actives, file = paste("data/", cCode, "actives.dat", sep = ""))      #save all the files now
    save(staticData, file = paste("data/", cCode, "staticData.dat", sep = ""))
    save(cfData, file = paste("data/", cCode, "cfData.dat", sep = ""))
    save(historicData, file = paste("data/", cCode, "historicData.dat", sep = ""))
    #save(settleDates, file = paste("data/", cCode, "settleDates.dat", sep = ""))
    assign(paste(cCode, "data", sep = ""), list(actives = actives, staticData = staticData, cfData = cfData,    #
        historicData = historicData), pos = 1)

}

The bbdh function I use above is a wrapper around the Rbbg library's bdh function (it also needs dcast from the reshape2 package and xts from the xts package) and looks like this:

bbdh <- function(secs, years = 1, flds = "last_price", startDate = NULL) {
        #this function gets secs over years from bloomberg daily data
            if(is.null(startDate)) startDate <- Sys.Date() - years * 365.25
            if (inherits(startDate, "Date")) startDate <- format(startDate, "%Y%m%d") #convert date classes to bb string
            if(nchar(startDate) > 8) startDate <- format(as.Date(startDate), "%Y%m%d") # if we've been passed wrong format character string 
            rawd <- bdh(conn, secs, flds, startDate, always.display.tickers = TRUE, include.non.trading.days = TRUE,
                option_names = c("nonTradingDayFillOption", "nonTradingDayFillMethod"),
                option_values = c("NON_TRADING_WEEKDAYS", "PREVIOUS_VALUE"))
            rawd <- dcast(rawd, date ~ ticker) #put into columns
            colnames(rawd) <- sub(" .*", "", colnames(rawd)) #remove the govt, currncy bits from bb tickers
            return(xts(rawd[, -1], order.by = as.POSIXct(rawd[, 1])))
        }
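
The date handling at the top of bbdh can be checked on its own, without a Bloomberg connection. The helper below is a hypothetical extraction of those three lines (the name toBBDate is not part of Rbbg); it is only a sketch of the intended behaviour, namely that a Date, an ISO date string and an already formatted string all end up as "YYYYMMDD":

```r
# Hypothetical helper: normalise a start date to a "YYYYMMDD" string
toBBDate <- function(startDate = NULL, years = 1) {
  # default: roughly 'years' years before today
  if (is.null(startDate)) startDate <- Sys.Date() - years * 365.25
  # Date objects are formatted directly
  if (inherits(startDate, "Date")) return(format(startDate, "%Y%m%d"))
  # longer strings such as "2012-01-01" are parsed first, then formatted
  if (nchar(startDate) > 8) return(format(as.Date(startDate), "%Y%m%d"))
  startDate  # assumed to be "YYYYMMDD" already
}

toBBDate(as.Date("2012-01-01"))
toBBDate("2012-01-01")
toBBDate("20120101")
```

All three calls return the same string, "20120101".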

The country code comes from a structure which associates two letter names with bloomberg yield curve descriptions:

BBcurveIDs  <- list(PO = "YCGT0084 Index", #Portugal
                    DE = "YCGT0016 Index", 
                    FR = "YCGT0014 Index", 
                    SP = "YCGT0061 Index",
                    IT = "YCGT0040 Index",
                    AU = "YCGT0001 Index", #Australia
                    AS = "YCGT0063 Index", #Austria
                    JP = "YCGT0018 Index",
                    GB = "YCGT0022 Index",
                    HK = "YCGT0095 Index",
                    CA = "YCGT0007 Index",
                    CH = "YCGT0082 Index",
                    NO = "YCGT0078 Index",
                    SE = "YCGT0021 Index",
                    IR = "YCGT0062 Index",
                    BE = "YCGT0006 Index",
                    NE = "YCGT0020 Index", 
                    ZA = "YCGT0090 Index",
                    PL = "YCGT0177 Index", #Poland
                    MX = "YCGT0251 Index")

So bbGetCountry will create 4 different data structures, called actives, staticData, cfData, and historicData, using the following Bloomberg fields (bbDynamicDataFields is listed for completeness, but bbGetCountry as shown does not request it):

bbStaticDataFields <- c("ID_ISIN",
                      "ISSUER", 
                      "COUPON",
                      "CPN_FREQ",
                      "MATURITY",
                      "CALC_TYP_DES",                    # pricing calculation type 
                      "INFLATION_LINKED_INDICATOR",     # N or Y, in R returned as TRUE or FALSE
                      "ISSUE_DT",
                      "FIRST_SETTLE_DT",
                      "PX_METHOD",                      # PRC or YLD 
                      "PX_DIRTY_CLEAN",                 # market convention dirty or clean
                      "DAYS_TO_SETTLE",
                      "CALLABLE",
                      "MARKET_SECTOR_DES",
                      "INDUSTRY_SECTOR",
                      "INDUSTRY_GROUP",
                      "INDUSTRY_SUBGROUP")

bbDynamicDataFields <- c("IS_STILL_CALLABLE",
                        "RTG_MOODY",
                        "RTG_MOODY_WATCH",
                        "RTG_SP",
                        "RTG_SP_WATCH",
                        "RTG_FITCH",
                        "RTG_FITCH_WATCH")

bbHistoricDataFields <- c("PX_BID",
                          "PX_ASK",
                          #"PX_CLEAN_BID",
                          #"PX_CLEAN_ASK",
                          "PX_DIRTY_BID",
                          "PX_DIRTY_ASK",
                          #"ASSET_SWAP_SPD_BID",
                          #"ASSET_SWAP_SPD_ASK",
                          "LAST_PRICE",
                          #"SETTLE_DT",
                          "YLD_YTM_MID")

Now you're ready to create couponbonds objects, using all these data structures:

createCouponBonds <- function(cCode, dateString) {
    cdata <- get(paste(cCode, "data", sep = "")) # get the data set
    today <- as.Date(dateString)
    settleDate <- today
    daycount <- 0
    while(daycount < 3) {
        settleDate <- settleDate + 1
        if (!(weekdays(settleDate) %in% c("Saturday", "Sunday"))) daycount <- daycount + 1
    }
    goodbonds <- subset(cdata$staticData, COUPON != 0 & INFLATION_LINKED_INDICATOR == FALSE) # drop zero-coupon bonds/bills and inflation-linked bonds
    goodbonds <- goodbonds[rownames(goodbonds) %in% cdata$actives[[dateString]][, 1], ]
    stripnames <- sapply(strsplit(rownames(goodbonds), " "), function(x) x[1])
    pxbid <- cdata$historicData$PX_BID[today, stripnames]
    pxask <- cdata$historicData$PX_ASK[today, stripnames]
    pxdbid <- cdata$historicData$PX_DIRTY_BID[today, stripnames]
    pxdask <- cdata$historicData$PX_DIRTY_ASK[today, stripnames]
    price <- as.numeric((pxbid + pxask) / 2)
    accrued <- as.numeric(pxdbid - pxbid)
    cashflows <- lapply(rownames(goodbonds), function(x) {
        goodflows <- cdata$cfData[[x]][as.Date(cdata$cfData[[x]][, "Date"]) >= today, ]
        #gfstipnames <- sapply(strsplit(rownames(goodflows), " "), function(x) x[1]) dunno if I need this
        isin <- rep(cdata$staticData[x, "ID_ISIN"], nrow(goodflows))
        cf <- apply(goodflows[, 2:3], 1, sum) / 10000
        dt <- as.Date(goodflows[, 1])
        return(list(isin = isin, cf = cf, dt = dt))
    })
    isinvec <- unlist(lapply(cashflows, function(x) x$isin))
    cfvec <- as.numeric(unlist(lapply(cashflows, function(x) x$cf)))
    datevec <- do.call(c, lapply(cashflows, function(x) x$dt)) # c() keeps the Date class, unlist() would drop it
    govbonds <- list(ISIN = goodbonds$ID_ISIN, 
                     MATURITYDATE = as.Date(goodbonds$MATURITY),
                     ISSUEDATE = as.Date(goodbonds$FIRST_SETTLE_DT),
                     COUPONRATE = as.numeric(goodbonds$COUPON) / 100,
                     PRICE = price,
                     ACCRUED = accrued,
                     CASHFLOWS = list(ISIN = isinvec, CF = cfvec, DATE = as.Date(datevec)),
                     TODAY = settleDate)
    govbonds <- list(govbonds)
    names(govbonds) <- cCode
    class(govbonds) <- "couponbonds"
    return(govbonds)
}
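
The settlement-date loop at the top of createCouponBonds compares weekdays() against English day names, which depends on the locale R is running in. The sketch below shows a locale-independent variant of the same idea. The function name addBusinessDays is an assumption for illustration, and, like the original loop, it skips weekends only and knows nothing about public holidays:

```r
# Add n business days (Mon-Fri) to a date; public holidays are ignored
addBusinessDays <- function(from, n = 3) {
  d <- as.Date(from)
  added <- 0
  while (added < n) {
    d <- d + 1
    # POSIXlt wday: 0 = Sunday, 6 = Saturday, independent of locale
    if (!(as.POSIXlt(d)$wday %in% c(0, 6))) added <- added + 1
  }
  d
}

addBusinessDays("2013-02-13")  # a Wednesday, gives Monday 2013-02-18
addBusinessDays("2013-02-15")  # a Friday, gives Wednesday 2013-02-20
```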

Take a close look at the cashflows <- lapply... call, because this is where you create the sublist, and it is the core of the answer to your question. How this is done depends very much on how you have decided to build the intermediate data structures, and I have given you just one possibility. I realise that my answer is complex, but so is the problem. Not all the code you need is in this answer either: a few helper functions are missing, but I am happy to provide them if you contact me. The skeleton of the core functions is all here, and much of the problem is getting the data in the first place and structuring it appropriately.

You correctly surmise that some of the data is static for each bond, some of it is dynamic, and some of it is historical. So the dimensions of the intermediate data structures are different for different pieces of the couponbonds objects. How you represent that is up to you, though I have used separate lists / data frames for each, linked via the bond IDs where necessary.
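
To make the "linked via the bond IDs" idea concrete, here is a small sketch with toy stand-ins for the intermediate structures. Every ticker, ISIN and price in it is made up for illustration, and a plain matrix stands in for the xts objects used in the real code. The point is only that each structure has a different shape and that the bond ID is the key into all of them:

```r
# Toy stand-ins for the intermediate structures (all values hypothetical)
ids <- c("AAA Govt", "BBB Govt")

# Static data: one row per bond
staticData <- data.frame(ID_ISIN = c("XX0000000001", "XX0000000002"),
                         COUPON  = c(4, 3),
                         row.names = ids, stringsAsFactors = FALSE)

# Cash flows: one list element per bond, each with its own number of rows
cfData <- list(
  "AAA Govt" = data.frame(Date = as.Date(c("2013-06-30", "2014-06-30")),
                          CF = c(4, 104)),
  "BBB Govt" = data.frame(Date = as.Date("2014-03-31"), CF = 103)
)

# Historic data: dates in rows, bonds in columns
pxBid <- matrix(c(100.1, 100.3, 99.4, 99.6), nrow = 2,
                dimnames = list(c("2013-02-12", "2013-02-13"), ids))

# The bond ID ties the three structures together
id <- "BBB Govt"
oneBond <- list(isin  = staticData[id, "ID_ISIN"],
                flows = cfData[[id]],
                bid   = pxBid["2013-02-13", id])
str(oneBond)
```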

The function above takes a date string, so you can call it for each of your historic data points using the above-mentioned lapply, and hey presto, dyncouponbonds:

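# dodates: the dates to process, as strings matching names(actives); it is not defined in this answer
spl <- lapply(dodates, function(x) createCouponBonds("SP", x))
names(spl) <- sapply(spl, function(x) as.character(x$SP$TODAY)) # settlement dates as names
class(spl) <- "dyncouponbonds"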
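
The assembly step can be tried without any Bloomberg data. In the sketch below, makeOnePeriod is a hypothetical stub standing in for createCouponBonds: it returns a one-country list that carries only TODAY (set to the input date rather than a settlement date). The dates in dodates are likewise assumed for illustration:

```r
# Stub standing in for createCouponBonds (hypothetical; carries only TODAY)
makeOnePeriod <- function(cCode, dateString) {
  obj <- list(list(TODAY = as.Date(dateString)))
  names(obj) <- cCode
  class(obj) <- "couponbonds"
  obj
}

dodates <- c("2013-02-11", "2013-02-12", "2013-02-13")  # assumed date strings

# One couponbonds object per date, named by date, wrapped in the outer class
dyn <- lapply(dodates, function(x) makeOnePeriod("SP", x))
names(dyn) <- sapply(dyn, function(x) as.character(x$SP$TODAY))
class(dyn) <- "dyncouponbonds"

names(dyn)
class(dyn[[1]])
```

Using sapply with as.character gives a plain character vector of names, one per period, and each element of the outer list keeps its own couponbonds class.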

There you go. You asked for it....

If you're not using Bloomberg, your input data structures will be very different but, as I said at the start, get very familiar with lapply and sapply. Obviously there are many other ways this problem could be solved, but the above works for Bloomberg. If you understand this code, you'll know what to do for other data sources.

Finally, please note that the Rbbg package from findata.org is used to interface to Bloomberg.

answered 2013-02-13T20:44:21.030

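My 2 cents: I have been trying to get this working with the new Rblpapi. I still have some problems with createCouponBonds, but I think the other functions return correctly. This won't solve the whole problem, but it solves at least part of it. BBcurveIDs, bbStaticDataFields, bbDynamicDataFields and bbHistoricDataFields are the same as above. Calendar and bizseq come from the bizdays package.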

bbGetCountry <- function(cCode, up = FALSE) {
  if (up == TRUE) startDate <- as.Date("2016-01-01") else startDate <- histStartDate 
  cal <- Calendar(weekdays=c("saturday", "sunday"))
  wdays <- as.list(bizseq(startDate, Sys.Date(), cal))
  actives <- lapply(wdays, function(x) { 
    bds(BBcurveIDs[cCode][[1]], "CURVE_MEMBERS", override = c(CURVE_DATE=format(x, "%Y%m%d")))
  })
  names(actives) <- wdays
  uniqueActives <- unique(unlist(actives))
  staticData <- bdp(uniqueActives, bbStaticDataFields)
  cfData <- lapply(uniqueActives, function(x) {
    bds(x, "DES_CASH_FLOW_ADJ", override = c(SETTLE_DT = format(as.Date(staticData[x, "FIRST_SETTLE_DT"]), "%Y%m%d")))
  })
  names(cfData) <- uniqueActives

  historicData <- lapply(bbHistoricDataFields, function(x) bbdh(uniqueActives, flds = x, startDate = startDate))
  names(historicData) <- bbHistoricDataFields
  allDates <- as.Date(index(historicData$LAST_PRICE))

  save(actives, file = paste("data_", cCode, "actives.dat", sep = ""))
  save(staticData, file = paste("data_", cCode, "staticData.dat", sep = ""))
  save(cfData, file = paste("data_", cCode, "cfData.dat", sep = ""))
  save(historicData, file = paste("data_", cCode, "historicData.dat", sep = ""))
  #save(settleDates, file = paste("data_", cCode, "settleDates.dat", sep = ""))
  assign(paste(cCode, "data", sep = ""), list(actives = actives, staticData = staticData, cfData = cfData,    #
                                              historicData = historicData), pos = 1)

}

And the bbdh function (ldply is from the plyr package, dcast from reshape2):

bbdh <- function(secs, years = 1, flds = "last_price", startDate = NULL) {
  if(is.null(startDate)) startDate <- Sys.Date() - years * 365.25
  if (inherits(startDate, "Date")) startDate <- format(startDate, "%Y%m%d")
  if(nchar(startDate) > 8) startDate <- format(as.Date(startDate), "%Y%m%d")
  rawd <- bdh(secs, flds, 
              startDate, 
              include.non.trading.days = FALSE,
              options = structure(c("PREVIOUS_VALUE", "NON_TRADING_WEEKDAYS"),
                                  names = c("nonTradingDayFillMethod","nonTradingDayFillOption")))
  rawd <- ldply(rawd, data.frame)
  colnames(rawd) <- c("sec", "date", "fld")
  rawd <- dcast(rawd, date ~ sec, value.var="fld")
  colnames(rawd) <- gsub(" Corp", "", colnames(rawd))
  return(xts(rawd[,-1], order.by=rawd[,1]))
}
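
The reshaping at the end of this bbdh (a list with one data frame per security, turned into one column per security) can be tried on toy data. The sketch below deliberately swaps in base R's reshape for ldply and dcast so that it runs without extra packages. The tickers and prices are made up, and the shape of raw is an assumption about what a per-security history result looks like:

```r
# Toy stand-in for the per-security result: one data frame per ticker
raw <- list(
  "AAA Corp" = data.frame(date = as.Date(c("2016-01-04", "2016-01-05")),
                          PX_BID = c(100.1, 100.2)),
  "BBB Corp" = data.frame(date = as.Date(c("2016-01-04", "2016-01-05")),
                          PX_BID = c(99.5, 99.7))
)

# Stack into long format: one row per (security, date)
long <- do.call(rbind, lapply(names(raw), function(sec) {
  data.frame(sec = sec, date = raw[[sec]][[1]], fld = raw[[sec]][[2]],
             stringsAsFactors = FALSE)
}))

# Spread to wide format: one row per date, one column per security
wide <- reshape(long, idvar = "date", timevar = "sec", direction = "wide")
colnames(wide) <- sub("^fld\\.", "", colnames(wide))  # drop reshape's prefix
colnames(wide) <- sub(" Corp$", "", colnames(wide))   # strip the ticker suffix
wide
```

The resulting data frame has a date column followed by one price column per security, which is the layout the xts call in bbdh expects.
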
answered 2016-05-14T21:17:33.047