I've run into a tricky problem: I'm trying to work out the variance accounted for by a trend, several times within a single data set...

My data is structured like this:

x <- read.table(text = "
STA YEAR    VALUE
a   1968    457
a   1970    565
a   1972    489
a   1974    500
a   1976    700
a   1978    650
a   1980    659
b   1968    457
b   1970    565
b   1972    350
b   1974    544
b   1976    678
b   1978    650
b   1980    690
c   1968    457
c   1970    565
c   1972    500
c   1974    600
c   1976    678
c   1978    670
c   1980    750 ", header = TRUE)

I'm trying to return something like this:

STA  R-sq
a    n1
b    n2
c    n3

where n# is the R-squared value for the corresponding station's data in the original set...

I have tried

fit <- lm(VALUE ~ YEAR + STA, data = x) 

to get, for each station, a model of the yearly trend in VALUE over the years for which VALUE is available in the master data set...

Any help would be greatly appreciated. I'm really stumped on this one, and I know it's just a matter of familiarity with R.

3 Answers

To get the R-squared of VALUE ~ YEAR for each STA group, you can take this previous answer, modify it slightly, and plug in your values:

# assuming x is your data frame (make sure you don't have Hmisc loaded, it will interfere)
library(plyr)

# fit one model per station and keep its summary
models_x <- dlply(x, "STA", function(df)
  summary(lm(VALUE ~ YEAR, data = df)))

# extract the r.squared values
rsqds <- ldply(seq_along(models_x), function(i) models_x[[i]]$r.squared)
# give names to rows and col
rownames(rsqds) <- names(models_x)
colnames(rsqds) <- "rsq"
# have a look
rsqds
        rsq
a 0.6286064
b 0.5450413
c 0.8806604

EDIT: Following mnel's suggestion, here are more efficient ways to get the R-squared values into a nice table (no need to add row and column names):

# starting with models_x from above
rsqds <- data.frame(rsq =sapply(models_x, '[[', 'r.squared'))

# starting with just the original data in x, this is great:
rsqds  <- ddply(x, "STA", summarize, rsq = summary(lm(VALUE ~ YEAR))$r.squared)

  STA       rsq
1   a 0.6286064
2   b 0.5450413
3   c 0.8806604
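
As a sanity check on the values above: for a regression with a single predictor and an intercept, R-squared equals the squared Pearson correlation between predictor and response. Here is a minimal base-R sketch under that assumption. The data frame is rebuilt from the question's values so the snippet runs on its own, and `r2_cor` is just a name introduced for illustration.

```r
# Rebuild the example data from the question (same values, built with data.frame()).
x <- data.frame(
  STA   = rep(c("a", "b", "c"), each = 7),
  YEAR  = rep(seq(1968, 1980, by = 2), times = 3),
  VALUE = c(457, 565, 489, 500, 700, 650, 659,
            457, 565, 350, 544, 678, 650, 690,
            457, 565, 500, 600, 678, 670, 750)
)

# Squared correlation of YEAR and VALUE within each station.
# This matches lm()'s R-squared only because there is one predictor plus an intercept.
r2_cor <- sapply(split(x, x$STA), function(df) cor(df$YEAR, df$VALUE)^2)
r2_cor
```

The result is a named vector with one entry per station, which should agree with the `rsq` column printed above.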
answered 2013-02-05T22:56:06.657

# first load the data.table package
library(data.table)
# convert your data frame to a data.table (I'm using your example)
x <- as.data.table(x)
# calculate the metrics needed (R-squared, F-statistic and so on)
x[, list(r2 = summary(lm(VALUE ~ YEAR))$r.squared,
         f  = summary(lm(VALUE ~ YEAR))$fstatistic[1]), by = STA]
   STA        r2         f
1:   a 0.6286064  8.462807
2:   b 0.5450413  5.990009
3:   c 0.8806604 36.897258
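
The call above fits the same model twice per station, once for each metric. A possible variant fits each model once and reuses its summary. This is only a sketch: it assumes the data.table package is installed, rebuilds the question's data so it runs standalone, and uses `dt`, `s` and `res` as illustrative names.

```r
library(data.table)

# Rebuild the example data from the question as a data.table.
dt <- data.table(
  STA   = rep(c("a", "b", "c"), each = 7),
  YEAR  = rep(seq(1968, 1980, by = 2), times = 3),
  VALUE = c(457, 565, 489, 500, 700, 650, 659,
            457, 565, 350, 544, 678, 650, 690,
            457, 565, 500, 600, 678, 670, 750)
)

# Fit one model per station, store its summary in s, and pull both metrics from it.
res <- dt[, {
  s <- summary(lm(VALUE ~ YEAR))
  list(r2 = s$r.squared, f = s$fstatistic[[1]])
}, by = STA]
res
```

The braces let `j` hold several statements, and the final `list()` defines the output columns.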
answered 2015-11-27T14:30:15.237

There is only one R-squared value, not three... please edit your question.

# store the output 
y <- summary( lm( VALUE ~ YEAR + STA , data = x ) )
# access the attributes of `y`
attributes( y )
y$r.squared
y$adj.r.squared
y$coefficients
y$coefficients[,1]

# or are you looking to run three separate
# lm() functions on 'a' 'b' and 'c' ..where this would be the first? 
y <- summary( lm( VALUE ~ YEAR , data = x[ x$STA %in% 'a' , ] ) )
# access the attributes of `y`
attributes( y )
y$r.squared
y$adj.r.squared
y$coefficients
y$coefficients[,1]
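
If separate models per station are what you want, the fit for 'a' above does not have to be repeated by hand for 'b' and 'c'. A minimal base-R sketch, assuming the data from the question (rebuilt here so it runs standalone; `fits` and `rsq_table` are illustrative names):

```r
# Rebuild the example data from the question.
x <- data.frame(
  STA   = rep(c("a", "b", "c"), each = 7),
  YEAR  = rep(seq(1968, 1980, by = 2), times = 3),
  VALUE = c(457, 565, 489, 500, 700, 650, 659,
            457, 565, 350, 544, 678, 650, 690,
            457, 565, 500, 600, 678, 670, 750)
)

# Split by station, fit VALUE ~ YEAR on each piece, and keep the summaries.
fits <- lapply(split(x, x$STA), function(df) summary(lm(VALUE ~ YEAR, data = df)))

# Collect R-squared and adjusted R-squared into one table, one row per station.
rsq_table <- data.frame(
  STA     = names(fits),
  rsq     = sapply(fits, function(s) s$r.squared),
  adj_rsq = sapply(fits, function(s) s$adj.r.squared),
  row.names = NULL
)
rsq_table
```

Each element of `fits` is an ordinary `summary.lm` object, so `fits$a$coefficients` and the other components shown above remain available per station.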
answered 2013-02-05T21:14:42.957