My data is structured as follows:
group_id, months_from_start, perc_total_downloads, experience_ratio
1 1 1.2 4
1 2 1.7 6
…
235 1 6.7 3
235 2 18 8
…
There are about 300 groups, each with roughly 70 consecutive data points.
I ran the following script to fit a second-order polynomial to each group:
s.1<-lm(xts(s[s$group_id == 1,][,-2], order.by=as.Date(s[s$group_id == 1,][,2]))$perc_total_downloads ~ poly(xts(s[s$group_id == 1,][,-2], order.by=as.Date(s[s$group_id == 1,][,2]))$months_from_start, 2, raw=TRUE))
s.235<-lm(xts(s[s$group_id == 235,][,-2], order.by=as.Date(s[s$group_id == 235,][,2]))$perc_total_downloads ~ poly(xts(s[s$group_id == 235,][,-2], order.by=as.Date(s[s$group_id == 235,][,2]))$months_from_start, 2, raw=TRUE))
s.599<-lm(xts(s[s$group_id == 599,][,-2], order.by=as.Date(s[s$group_id == 599,][,2]))$perc_total_downloads ~ poly(xts(s[s$group_id == 599,][,-2], order.by=as.Date(s[s$group_id == 599,][,2]))$months_from_start, 2, raw=TRUE))
s.1111<-lm(xts(s[s$group_id == 1111,][,-2], order.by=as.Date(s[s$group_id == 1111,][,2]))$perc_total_downloads ~ poly(xts(s[s$group_id == 1111,][,-2], order.by=as.Date(s[s$group_id == 1111,][,2]))$months_from_start, 2, raw=TRUE))
s.1537<-lm(xts(s[s$group_id == 1537,][,-2], order.by=as.Date(s[s$group_id == 1537,][,2]))$perc_total_downloads ~ poly(xts(s[s$group_id == 1537,][,-2], order.by=as.Date(s[s$group_id == 1537,][,2]))$months_from_start, 2, raw=TRUE))
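The per-group lines above all repeat one pattern, so the same fit could probably be written once and applied to every group. This is a minimal sketch, not a drop-in replacement: it assumes `s` is a plain data frame with the columns shown earlier, it leaves out the `xts` wrapper (an assumption that the wrapper is not needed for the regression itself), and the toy data and the name `fits` are hypothetical.

```r
# Toy stand-in for `s` (hypothetical values, only so the sketch runs on its own)
set.seed(1)
s <- data.frame(
  group_id          = rep(c(1, 235), each = 10),
  months_from_start = rep(1:10, times = 2)
)
s$perc_total_downloads <- 0.5 + 0.3 * s$months_from_start -
  0.01 * s$months_from_start^2 + rnorm(nrow(s), sd = 0.05)

# split() cuts the data frame into one piece per group_id;
# lapply() fits the same quadratic model to each piece.
fits <- lapply(split(s, s$group_id), function(d) {
  lm(perc_total_downloads ~ poly(months_from_start, 2, raw = TRUE), data = d)
})

# The list is named by group_id, so one model can be looked up by its id
summary(fits[["235"]])
```

Keeping the models in a list rather than in separately named variables (`s.1`, `s.235`, ...) means any later step can loop over them.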
For each of these new variables, I can call summary() to get useful information:
> summary(s.44375)
Call:
lm(formula = xts(s[s$group_id == 44375, ][, -2], order.by = as.Date(s[s$group_id ==
44375, ][, 2]))$perc_total_downloads ~ poly(xts(s[s$group_id ==
44375, ][, -2], order.by = as.Date(s[s$group_id == 44375,
][, 2]))$months_from_start, 2, raw = TRUE))
Residuals:
Min 1Q Median 3Q Max
-0.0064004 -0.0017315 -0.0002022 0.0012087 0.0078436
Coefficients: (3 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.993e-03 1.137e-03 1.753 0.084 .
poly(xts(s[s$group_id == 44375, ][, -2], order.by = as.Date(s[s$group_id == 44375, ][, 2]))$months_from_start, 2, raw = TRUE)1.0 7.769e-04 6.707e-05 11.583 <2e-16 ***
poly(xts(s[s$group_id == 44375, ][, -2], order.by = as.Date(s[s$group_id == 44375, ][, 2]))$months_from_start, 2, raw = TRUE)2.0 -9.258e-06 8.404e-07 -11.017 <2e-16 ***
poly(xts(s[s$group_id == 44375, ][, -2], order.by = as.Date(s[s$group_id == 44375, ][, 2]))$months_from_start, 2, raw = TRUE)0.1 NA NA NA NA
poly(xts(s[s$group_id == 44375, ][, -2], order.by = as.Date(s[s$group_id == 44375, ][, 2]))$months_from_start, 2, raw = TRUE)1.1 NA NA NA NA
poly(xts(s[s$group_id == 44375, ][, -2], order.by = as.Date(s[s$group_id == 44375, ][, 2]))$months_from_start, 2, raw = TRUE)0.2 NA NA NA NA
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.002866 on 69 degrees of freedom
Multiple R-squared: 0.6619,Adjusted R-squared: 0.6521
F-statistic: 67.53 on 2 and 69 DF, p-value: < 2.2e-16
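For my purposes, I need to transcribe this information into a table, and cutting and pasting from this format is very tedious and time-consuming: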
group_id  intercept_est  intercept_std_err  intercept_t_value  …
44375     1.993e-03      1.137e-03          1.753              ...
…
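A table of this shape could likely be built without any copying, because `coef(summary(fit))` returns the coefficient block of the printed summary as a numeric matrix. The following is a sketch under assumptions: `s` is a plain data frame with the columns shown earlier, every group yields all three coefficients (none dropped as `NA`), and the helper `row_for_group`, the column names, and the toy data are hypothetical.

```r
# Toy stand-in for `s` (hypothetical values, only so the sketch runs on its own)
set.seed(1)
s <- data.frame(
  group_id          = rep(c(1, 235), each = 10),
  months_from_start = rep(1:10, times = 2)
)
s$perc_total_downloads <- 0.5 + 0.3 * s$months_from_start -
  0.01 * s$months_from_start^2 + rnorm(nrow(s), sd = 0.05)

# Fit one group and flatten its summary into a single-row data frame
row_for_group <- function(d) {
  fit <- lm(perc_total_downloads ~ poly(months_from_start, 2, raw = TRUE),
            data = d)
  sm <- summary(fit)
  cf <- coef(sm)  # rows = terms; cols = Estimate, Std. Error, t value, Pr(>|t|)
  rownames(cf) <- c("intercept", "linear", "quadratic")

  # Flatten row by row: intercept_est, intercept_stderr, intercept_t, ...
  vals <- as.vector(t(cf))
  names(vals) <- as.vector(t(outer(rownames(cf),
                                   c("est", "stderr", "t", "p"),
                                   paste, sep = "_")))

  data.frame(group_id = d$group_id[1], as.list(vals),
             r_squared = sm$r.squared, adj_r_squared = sm$adj.r.squared)
}

# One row per group, stacked into a single table
tab <- do.call(rbind, lapply(split(s, s$group_id), row_for_group))
rownames(tab) <- NULL
print(tab)

# Written to a temporary directory here; any path would do
write.csv(tab, file.path(tempdir(), "group_fits.csv"), row.names = FALSE)
```

The resulting CSV can be opened directly in a spreadsheet, with one row per `group_id`.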
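It would also be convenient to have the numbers in conventional (fixed) notation rather than scientific notation, but I suppose I can live without that.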
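On the notation point, base R seems to offer a few ways to get fixed notation. A small sketch, using values copied from the summary above; the choice of 8 decimal places is arbitrary.

```r
# Values taken from the coefficient table in the summary output
x <- c(1.993e-03, 1.137e-03, 7.769e-04, -9.258e-06)

# Convert to character strings in fixed notation
fixed_a <- format(x, scientific = FALSE)
fixed_b <- formatC(x, format = "f", digits = 8)  # always 8 decimal places
print(fixed_a)
print(fixed_b)

# Or raise the penalty on scientific notation for the rest of the session,
# which affects how numbers are printed and written out
old <- options(scipen = 999)
print(x)
options(old)  # restore the previous setting
```

`format()` and `formatC()` return character vectors, so they are best applied as a last step, just before the table is written out.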
Is there any way I can do this without cutting and pasting by hand?
Thanks --sw