r - How do I run cross-sectional regressions for each quarter and industry?

Question

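I need to run multiple regressions, one for each industry-quarter combination: for example, a regression on financial firms for each time period (1999Q3, 1999Q4, 2000Q1, 2000Q2, and so on), and likewise for utilities, retail firms, food companies, etc.

I need to run the regressions and then collect all the coefficients into a list, which I can then attach back to the original data frame so that each row has its corresponding coefficients.

For example, in the dataset below I want to run the regression Y = x1 + x2 + x3. I have tried for loops and nested loops, collecting the coefficients into a matrix, but I can't seem to get it to work (I am new to R!).

I have a panel dataset with company name, industry, calendar quarter, and some variables, as follows: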

              `Company Name`  Industry  Quater           Y                x1               x2               x3
               <chr>          <chr>     <chr>            <dbl>            <dbl>            <dbl>            <dbl>            
              A & M FOOD SE  Food       1985Q1           2.97             16.4             9.23             2.22              
              A & M FOOD SE  Food       1985Q2           5.00             40.2             11.2             3.94              
              A & M FOOD SE  Food       1985Q3           5.71             40.7             12.5             4.66              
              A & M FOOD SE  Food       1985Q4           3.85             39.5             13.0             2.79              
              A & M FOOD SE  Food       1986Q1           3.12             38.9             13.2             1.98              
              A.A. IMPORTIN  Food       1985Q4           12.5             14.0             6.66             0.005             
              A.A. IMPORTIN  Food       1986Q1           13.3             15.0             6.74             0.513              
              A.A. IMPORTIN  Food       1986Q2           13.2             15.0             6.71             0.031             
              A.A. IMPORTIN  Food       1986Q3           13.5             15.2             6.86             0.111             
              C.D. JUMPINGS  Retail     1986Q4           13.1             14.6             7.46             0.241
              C.D. JUMPINGS  Retail     1985Q4           12.5             14.0             6.66             0.005             
              C.D. JUMPINGS  Retail     1986Q1           13.3             15.0             6.74             0.513              
              C.D. JUMPINGS  Retail     1986Q2           13.2             15.0             6.71             0.031             
              Kmart          Retail     1986Q3           13.5             15.2             6.86             0.111
              Kmart          Retail     1986Q4           13.1             14.6             7.46             0.241
              Kmart          Retail     1985Q4           12.5             14.0             6.66             0.005             
              Kmart          Retail     1986Q1           13.3             15.0             6.74             0.513              
              Kmart          Retail     1986Q2           13.2             15.0             6.71             0.031             
              Kmart          Retail     1986Q3           13.5             15.2             6.86             0.111

Thank you all very much. I have tried everything from odd functions in the plm library to lapply.

score 1 · Accepted Answer

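A simple base R approach is to use split. split divides the data.frame in its first argument into a list of data.frames based on the levels of its second argument. So, using your sample data, split(data,data$`Company Name`) produces a list of 4 data.frames.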
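To make the behaviour of split concrete, here is a minimal sketch on a toy data frame. The names toy, firm, and pieces are made up for illustration and are not part of the question's data.

```r
# Toy data frame (hypothetical values, not the asker's data)
toy <- data.frame(
  firm = c("A", "A", "B", "B", "B"),
  y    = c(1, 2, 3, 4, 5)
)

# split() returns a named list with one data.frame per level of the grouping variable
pieces <- split(toy, toy$firm)

names(pieces)          # "A" "B"
sapply(pieces, nrow)   # A = 2, B = 3
```

Each element of pieces is an ordinary data.frame, so any function that accepts a data argument can be applied to it.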

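From there, we can use lapply to apply the lm function to each subset of the data. Because lm takes several arguments, it is easier to define a new anonymous function (a lambda function) of x.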

lapply(split(data,data$`Company Name`),
       function(x) lm( Y ~ x1 + x2 + x3, data = x))

The resulting list is a bit messy to read, so you can use sapply to simplify the result into a matrix of coefficients.

t(sapply(split(data,data$`Company Name`),
         function(x) lm( Y ~ x1 + x2 + x3, data = x)$coefficients
         )
  )
#              (Intercept)         x1            x2        x3
#A & M FOOD SE   0.5773632 0.01586041 -3.662652e-05 0.9607874
#A.A. IMPORTIN  -3.6117236 0.64295788  1.067509e+00 0.1410264
#C.D. JUMPINGS   1.7123480 0.68601589  1.775447e-01 0.1964184
#Kmart           0.2591970 0.78346288  1.880233e-01 0.0525099
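The question also asks how to attach the coefficients back to the original rows. One possible way, sketched here on simulated data, is to turn the coefficient matrix into a data frame keyed by the grouping variable and merge it back. The names d, firm, cf_df, and b0 to b3 are hypothetical and chosen only for this illustration.

```r
set.seed(42)
# Simulated data (hypothetical): 3 firms with 6 observations each
d <- data.frame(firm = rep(c("A", "B", "C"), each = 6),
                x1 = rnorm(18), x2 = rnorm(18), x3 = rnorm(18))
d$Y <- 1 + d$x1 + d$x2 + d$x3 + rnorm(18, sd = 0.1)

# One row of coefficients per firm
cf <- t(sapply(split(d, d$firm),
               function(g) coef(lm(Y ~ x1 + x2 + x3, data = g))))

# Turn the matrix into a data frame with the group key as a column
cf_df <- data.frame(firm = rownames(cf), cf, row.names = NULL)
names(cf_df)[-1] <- c("b0", "b1", "b2", "b3")  # avoid clashing with x1, x2, x3

# Left-join the coefficients onto every original row
merged <- merge(d, cf_df, by = "firm", all.x = TRUE)
head(merged)
```

Renaming the coefficient columns before merging keeps them distinct from the regressors of the same name. Note that merge may reorder the rows by the key.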

If you want to do this by two variables, Company Name and Quarter, just supply a list as the second argument to split.

t(sapply(split(data,list(data$`Company Name`, data$Quater)),
         function(x) lm( Y ~ x1 + x2 + x3, data = x)$coefficients
         )
  )

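I can't show output for this call, because many of the company-quarter combinations are empty. Hopefully your dataset is complete. After filtering out the empty groups, it should look something like this: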

t(sapply(Filter(function(x) nrow(x) > 0, split(data,list(data$`Company Name`, data$Quater))),
          function(x) lm( Y ~ x1 + x2 + x3, data = x)$coefficients
          )
   )
#                     (Intercept) x1 x2 x3
#A & M FOOD SE.1985Q1        2.97 NA NA NA
#A & M FOOD SE.1985Q2        5.00 NA NA NA
#A & M FOOD SE.1985Q3        5.71 NA NA NA
#A & M FOOD SE.1985Q4        3.85 NA NA NA
#A.A. IMPORTIN.1985Q4       12.50 NA NA NA
#C.D. JUMPINGS.1985Q4       12.50 NA NA NA
#Kmart.1985Q4               12.50 NA NA NA
#A & M FOOD SE.1986Q1        3.12 NA NA NA
#A.A. IMPORTIN.1986Q1       13.30 NA NA NA
#C.D. JUMPINGS.1986Q1       13.30 NA NA NA
#Kmart.1986Q1               13.30 NA NA NA
#A.A. IMPORTIN.1986Q2       13.20 NA NA NA
#C.D. JUMPINGS.1986Q2       13.20 NA NA NA
#Kmart.1986Q2               13.20 NA NA NA
#A.A. IMPORTIN.1986Q3       13.50 NA NA NA
#Kmart.1986Q3               13.50 NA NA NA
#C.D. JUMPINGS.1986Q4       13.10 NA NA NA
#Kmart.1986Q4               13.10 NA NA NA
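The question asks for industry-quarter groups rather than company-quarter groups. The same pattern carries over by splitting on Industry and Quater instead. The following is a minimal sketch on simulated data (the data frame sim and its values are assumptions, since the posted sample has too few rows per industry-quarter to estimate four coefficients). It uses the drop argument of split to discard empty combinations and keeps only groups with more rows than coefficients.

```r
set.seed(1)
# Simulated panel (hypothetical): 2 industries x 2 quarters x 8 firms each
sim <- expand.grid(firm = 1:8,
                   Industry = c("Food", "Retail"),
                   Quater = c("1985Q1", "1985Q2"),
                   stringsAsFactors = FALSE)
sim$x1 <- rnorm(nrow(sim))
sim$x2 <- rnorm(nrow(sim))
sim$x3 <- rnorm(nrow(sim))
sim$Y  <- 1 + 2 * sim$x1 - sim$x2 + 0.5 * sim$x3 + rnorm(nrow(sim), sd = 0.1)

# drop = TRUE discards industry-quarter combinations that never occur
groups <- split(sim, list(sim$Industry, sim$Quater), drop = TRUE)

# Keep only groups with more observations than coefficients (4 here)
groups <- Filter(function(g) nrow(g) > 4, groups)

# One row per industry-quarter, one column per coefficient
coefs <- t(sapply(groups, function(g) coef(lm(Y ~ x1 + x2 + x3, data = g))))
coefs
```

The row names of coefs have the form Industry.Quater (for example Food.1985Q1), which can be split apart again if the coefficients need to be joined back to the data.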

Data

data <- structure(list(`Company Name` = structure(c(1L, 1L, 1L, 1L, 1L, 
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("A & M FOOD SE", 
"A.A. IMPORTIN", "C.D. JUMPINGS", "Kmart"), class = "factor"), 
    Industry = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Food", 
    "Retail"), class = "factor"), Quater = structure(c(1L, 2L, 
    3L, 4L, 5L, 4L, 5L, 6L, 7L, 8L, 4L, 5L, 6L, 7L, 8L, 4L, 5L, 
    6L, 7L), .Label = c("1985Q1", "1985Q2", "1985Q3", "1985Q4", 
    "1986Q1", "1986Q2", "1986Q3", "1986Q4"), class = "factor"), 
    Y = c(2.97, 5, 5.71, 3.85, 3.12, 12.5, 13.3, 13.2, 13.5, 
    13.1, 12.5, 13.3, 13.2, 13.5, 13.1, 12.5, 13.3, 13.2, 13.5
    ), x1 = c(16.4, 40.2, 40.7, 39.5, 38.9, 14, 15, 15, 15.2, 
    14.6, 14, 15, 15, 15.2, 14.6, 14, 15, 15, 15.2), x2 = c(9.23, 
    11.2, 12.5, 13, 13.2, 6.66, 6.74, 6.71, 6.86, 7.46, 6.66, 
    6.74, 6.71, 6.86, 7.46, 6.66, 6.74, 6.71, 6.86), x3 = c(2.22, 
    3.94, 4.66, 2.79, 1.98, 0.005, 0.513, 0.031, 0.111, 0.241, 
    0.005, 0.513, 0.031, 0.111, 0.241, 0.005, 0.513, 0.031, 0.111
    )), class = "data.frame", row.names = c(NA, -19L))

score 1 · Accepted Answer

An approach using broom:

library(modelr)
library(tidyverse)
library(broom)

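# df is assumed to be the question's data, read in so that the
# company column is named Company.Name
nested <- df %>% 
  group_by(Company.Name, Quater) %>% 
  nest()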

#specify regression
country_model <- function(df) {
  lm(Y ~ x1 + x2 + x3, data = df)
}

#unnest coefficients (only Intercept and one x1 here because most are NA)
nested %>% 
  mutate(model = map(data, country_model),
         tidy = map(model, broom::tidy)) %>% 
  unnest(tidy)

# A tibble: 19 x 9
# Groups:   Company.Name, Quater [18]
   Company.Name  Quater data             model  term        estimate std.error statistic p.value
   <chr>         <chr>  <list>           <list> <chr>          <dbl>     <dbl>     <dbl>   <dbl>
 1 A & M FOOD SE 1985Q1 <tibble [1 x 5]> <lm>   (Intercept) 2.97e+ 0       NaN       NaN     NaN
 2 A & M FOOD SE 1985Q2 <tibble [1 x 5]> <lm>   (Intercept) 5.00e+ 0       NaN       NaN     NaN
 3 A & M FOOD SE 1985Q3 <tibble [1 x 5]> <lm>   (Intercept) 5.71e+ 0       NaN       NaN     NaN
 4 A & M FOOD SE 1985Q4 <tibble [1 x 5]> <lm>   (Intercept) 3.85e+ 0       NaN       NaN     NaN
 5 A & M FOOD SE 1986Q1 <tibble [1 x 5]> <lm>   (Intercept) 3.12e+ 0       NaN       NaN     NaN
 6 A.A. IMPORTIN 1985Q4 <tibble [1 x 5]> <lm>   (Intercept) 1.25e+ 1       NaN       NaN     NaN
 7 A.A. IMPORTIN 1986Q1 <tibble [1 x 5]> <lm>   (Intercept) 1.33e+ 1       NaN       NaN     NaN
 8 A.A. IMPORTIN 1986Q2 <tibble [1 x 5]> <lm>   (Intercept) 1.32e+ 1       NaN       NaN     NaN
 9 A.A. IMPORTIN 1986Q3 <tibble [1 x 5]> <lm>   (Intercept) 1.35e+ 1       NaN       NaN     NaN
10 C.D. JUMPINGS 1986Q4 <tibble [1 x 5]> <lm>   (Intercept) 1.31e+ 1       NaN       NaN     NaN
11 C.D. JUMPINGS 1985Q4 <tibble [1 x 5]> <lm>   (Intercept) 1.25e+ 1       NaN       NaN     NaN
12 C.D. JUMPINGS 1986Q1 <tibble [1 x 5]> <lm>   (Intercept) 1.33e+ 1       NaN       NaN     NaN
13 C.D. JUMPINGS 1986Q2 <tibble [1 x 5]> <lm>   (Intercept) 1.32e+ 1       NaN       NaN     NaN
14 Kmart         1986Q3 <tibble [2 x 5]> <lm>   (Intercept) 1.35e+ 1       NaN       NaN     NaN
15 Kmart         1986Q3 <tibble [2 x 5]> <lm>   x1          3.81e-16       NaN       NaN     NaN
16 Kmart         1986Q4 <tibble [1 x 5]> <lm>   (Intercept) 1.31e+ 1       NaN       NaN     NaN
17 Kmart         1985Q4 <tibble [1 x 5]> <lm>   (Intercept) 1.25e+ 1       NaN       NaN     NaN
18 Kmart         1986Q1 <tibble [1 x 5]> <lm>   (Intercept) 1.33e+ 1       NaN       NaN     NaN
19 Kmart         1986Q2 <tibble [1 x 5]> <lm>   (Intercept) 1.32e+ 1       NaN       NaN     NaN
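Most groups above report only an intercept because almost every company-quarter cell holds a single observation, and one observation cannot identify four coefficients. A small sketch using the first row of the sample data shows the effect (the name one_row is made up for illustration):

```r
# A single observation, taken from the first row of the sample data
one_row <- data.frame(Y = 2.97, x1 = 16.4, x2 = 9.23, x3 = 2.22)

# lm() can only estimate the intercept; the slopes are reported as NA
cf_one <- coef(lm(Y ~ x1 + x2 + x3, data = one_row))
cf_one
```

Grouping at a coarser level, so that each group has more rows than coefficients, is what makes the slope estimates available.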
