
R newbie here. A small sample of my data:

TeamHome <- c("LAL", "HOU", "SAS", "LAL")
TeamAway <- c("IND", "SAS", "LAL", "HOU")
df <- data.frame(cbind(TeamHome, TeamAway))
df

   TeamHome TeamAway
     LAL      IND
     HOU      SAS
     SAS      LAL
     LAL      HOU

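Imagine these are the first four games of a season with thousands of games. For both the home team and the away team, I want to compute the cumulative number of games played at home, on the road, and in total, so there would be 3 new columns each for the home team and the away team. I want to get something like this (here I only computed the new variables for the home team):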

  TeamHome TeamAway HomeTeamGamesPlayedatHome HomeTeamGamesPlayedRoad HomeTeamTotalgames
1      LAL      IND                         1                       0                  1
2      HOU      SAS                         1                       0                  1
3      SAS      LAL                         1                       1                  2
4      LAL      HOU                         2                       1                  3

To compute the first column (HomeTeamGamesPlayedatHome), I managed to do this:

df$HomeTeamGamesPlayedatHome <- ave(df$TeamHome==df$TeamHome, df$TeamHome, FUN=cumsum)

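But it feels too complicated, and I can't compute the other columns this way.

I also thought about using the table function to count occurrences: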

 table(df$TeamHome)

But it only gives the overall totals, and I want the count at any given point in time. Thanks!


2 Answers

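library(dplyr)
# Running count of home games per home team (includes the current game)
df <- df %>% group_by(TeamHome) %>%
  mutate(HomeGames = seq_along(TeamHome)) %>% ungroup()
# For game i, count how often its home team was the away team in games 1..i
# (as.character() guards against factor columns with differing levels)
home <- as.character(df$TeamHome)
away <- as.character(df$TeamAway)
road <- integer(nrow(df))
for (i in seq_len(nrow(df))) road[i] <- sum(away[1:i] == home[i])
df$HomeTeamGamesPlayedRoad <- road
df %>% mutate(HomeTeamTotalgames = HomeGames + HomeTeamGamesPlayedRoad)
  TeamHome TeamAway HomeGames HomeTeamGamesPlayedRoad HomeTeamTotalgames
1      LAL      IND         1                       0                  1
2      HOU      SAS         1                       0                  1
3      SAS      LAL         1                       1                  2
4      LAL      HOU         2                       1                  3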

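HomeGames is created with seq_along, which numbers the rows within each TeamHome group. HomeTeamGamesPlayedRoad is created by a loop that, for each game, counts how many times the home team appears in TeamAway up to and including the current game. The last line adds the two together.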
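
The same counts can also be obtained without a row-by-row loop, and for the away team as well. Below is a minimal base R sketch, assuming the games are listed in chronological order. The long data frame and the AwayTeam* column names are hypothetical names chosen for illustration; they do not appear in the question.

```r
# Example data from the question
TeamHome <- c("LAL", "HOU", "SAS", "LAL")
TeamAway <- c("IND", "SAS", "LAL", "HOU")
games <- data.frame(TeamHome, TeamAway, stringsAsFactors = FALSE)
n <- nrow(games)

# Long format: one row per team appearance, sorted by game
# (within each game the home row comes first, because order() keeps ties in place)
long <- data.frame(
  game    = rep(seq_len(n), times = 2),
  team    = c(games$TeamHome, games$TeamAway),
  at_home = rep(c(TRUE, FALSE), each = n),
  stringsAsFactors = FALSE
)
long <- long[order(long$game), ]

# Running counts per team, including the current game
long$home_so_far  <- ave(as.integer(long$at_home),  long$team, FUN = cumsum)
long$road_so_far  <- ave(as.integer(!long$at_home), long$team, FUN = cumsum)
long$total_so_far <- long$home_so_far + long$road_so_far

# Back to one row per game
home <- long[long$at_home, ]
away <- long[!long$at_home, ]
games$HomeTeamGamesPlayedatHome <- home$home_so_far
games$HomeTeamGamesPlayedRoad   <- home$road_so_far
games$HomeTeamTotalgames        <- home$total_so_far
games$AwayTeamGamesPlayedatHome <- away$home_so_far   # hypothetical column name
games$AwayTeamGamesPlayedRoad   <- away$road_so_far   # hypothetical column name
games$AwayTeamTotalgames        <- away$total_so_far  # hypothetical column name
games
```

Each team's history is accumulated once in the long table, so the work grows with the number of games rather than with its square, which may matter for a season with thousands of rows.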

answered 2015-08-25T00:25:58.273

A loop-based solution:

TeamHome <- c("LAL", "HOU", "SAS", "LAL")
TeamAway <- c("IND", "SAS", "LAL", "HOU")
# Running count of home games per home team
df <- data.frame(TeamHome, TeamAway,
                 HomeTeamGamesPlayedatHome = ave(TeamHome == TeamHome, TeamHome, FUN = cumsum))

# Away games played so far by the home team of game i
df$HomeTeamGamesPlayedRoad <- integer(nrow(df))
for (i in seq_len(nrow(df))) {
  curdf <- df[1:i, ]
  # Count the rows in games 1..i where that team appears as the away team
  df$HomeTeamGamesPlayedRoad[i] <- sum(as.character(curdf$TeamAway) == as.character(curdf$TeamHome[i]))
}
df$HomeTeamTotalgames <- df$HomeTeamGamesPlayedatHome + df$HomeTeamGamesPlayedRoad

  TeamHome TeamAway HomeTeamGamesPlayedatHome HomeTeamGamesPlayedRoad HomeTeamTotalgames
1      LAL      IND                         1                       0                  1
2      HOU      SAS                         1                       0                  1
3      SAS      LAL                         1                       1                  2
4      LAL      HOU                         2                       1                  3
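
The per-row count can also be written with vapply instead of an explicit for loop. This is a minimal sketch on the same data, and road is just an illustrative variable name:

```r
TeamHome <- c("LAL", "HOU", "SAS", "LAL")
TeamAway <- c("IND", "SAS", "LAL", "HOU")

# For game i, count the games 1..i in which the home team of game i was the away team
road <- vapply(
  seq_along(TeamHome),
  function(i) sum(TeamAway[seq_len(i)] == TeamHome[i]),
  integer(1)  # each result must be a single integer
)
road
# [1] 0 0 1 1
```

Like the loop, this rescans all earlier games for every row, so the cost grows quadratically with the number of games.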
answered 2015-08-25T00:19:20.890