I am trying to get an lm fit for my data. The problem I am having is that I want to fit a linear model (1st order polynomial) when the factor is "true" and a second order polynomial when the factor is "false". How can I get that done using only one lm?

a=c(1,2,3,4,5,6,7,8,9,10)
b=factor(c("true","false","true","false","true","false","true","false","true","false"))
c=c(10,8,20,15,30,21,40,25,50,31)
DumbData<-data.frame(cbind(a,c))
DumbData<-cbind(DumbData,b=b)

I have tried:

Lm2<-lm(c~a + b + b*I(a^2), data=DumbData)
summary(Lm2)

which results in:

summary(Lm2)
Call:
lm(formula = c ~ a + b + b * I(a^2), data = DumbData)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -0.74483    1.12047  -0.665 0.535640    
a             4.44433    0.39619  11.218 9.83e-05 ***
btrue         6.78670    0.78299   8.668 0.000338 ***
I(a^2)       -0.13457    0.03324  -4.049 0.009840 ** 
btrue:I(a^2)  0.18719    0.01620  11.558 8.51e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.7537 on 5 degrees of freedom
Multiple R-squared: 0.9982, Adjusted R-squared: 0.9967 
F-statistic:   688 on 4 and 5 DF,  p-value: 4.896e-07 

Here I(a^2) appears in both fits, but I want one fit with a first order polynomial and the other with a second order polynomial. If I try:

Lm2<-lm(c~a + b + I(b*I(a^2)), data=DumbData)
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels
In addition: Warning message:
In Ops.factor(b, I(a^2)) : * not meaningful for factors

How can I get the proper interaction terms here?

Thanks Andrie, but there are still some things I am missing here. In this example the variable b is logical; if it is a factor with two levels it does not work, so I guess I have to convert the factor variable to a logical one. The other thing I am missing is the "not" in the condition, I(!b*a^2). Without the ! I get:

Call:
lm(formula = c ~ a + I(b * a^2), data = dat)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   7.2692     1.8425   3.945 0.005565 ** 
a             2.3222     0.3258   7.128 0.000189 ***
I(b * a^2)    0.3005     0.0355   8.465 6.34e-05 ***

I cannot relate the fits with and without the !, which seems a bit strange to me.

2 Answers

Hmm...

Lm2<-lm(c~a + b + b*I(a^2), data=DumbData)

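You said: "The problem I am having is that I want to fit a linear model (1st order polynomial) when the factor is "true" and a second order polynomial when the factor is "false". How can I get that done using only one lm?"

From this I infer that you do not want b directly in the model? Also, a^2 should be included only when b is false.

So that would be...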

lm(c~ a + I((!b) * a^2))

If b is TRUE (i.e. !b equals FALSE), then a^2 is multiplied by zero (FALSE) and drops out of the equation.
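To make the effect of that term concrete, here is a minimal sketch. It assumes b is already a logical vector, and the name `quad_term` is only illustrative; it is not part of the question or the answer.

```r
# Data from the question, with b assumed to be logical rather than a factor.
a <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
b <- rep(c(TRUE, FALSE), times = 5)

# (!b) is coerced to 1 where b is FALSE and to 0 where b is TRUE,
# so the quadratic term survives only in the rows where b is FALSE.
quad_term <- (!b) * a^2
print(quad_term)
# values: 0 4 0 16 0 36 0 64 0 100
```

The regressor is zero in every row where b is TRUE, so those rows are described by the intercept and the linear term alone.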

The only problem is that you defined b as a factor rather than a logical. That can be cured.

# b=factor(c("true","false","true","false","true","false","true","false","true","false"))
# could use TRUE and FALSE instead of "true" and "false"
# alternatively, after defining b as above, do
# b <- b=="true" -- that would convert b to logical (i.e. boolean TRUE and FALSE values)
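The conversion described in the comments above can be sketched as follows. The name `b_logical` is an assumption made for illustration, so that the original factor is left untouched.

```r
# The factor as defined in the question.
b <- factor(c("true", "false", "true", "false", "true",
              "false", "true", "false", "true", "false"))

# Comparing a factor against one of its level labels yields a logical vector.
b_logical <- b == "true"
print(is.logical(b_logical))   # TRUE

# Logical values are coerced to 1 and 0 in arithmetic, which is what the
# I((!b) * a^2) term relies on.
print(as.numeric(!b_logical))
```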

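OK, to be precise: you defined b as "character", but it was converted to "factor" when it was added to the data frame ("DumbData").

Another small issue concerns the way you defined the data frame.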

a=c(1,2,3,4,5,6,7,8,9,10)
b=factor(c("true","false","true","false","true","false","true","false","true","false"))
c=c(10,8,20,15,30,21,40,25,50,31)
DumbData<-data.frame(cbind(a,c))
DumbData<-cbind(DumbData,b=b)

Here, cbind is unnecessary. You can put everything on one line:

DumbData<- data.frame(a,b,c)
# shorter and cleaner!!

Also, to convert b to logical, use:

DumbData<- data.frame(a,b=b=="true",c)

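Note: you need to write b=b=="true", which looks redundant, but the LHS (b) gives the name of the variable in the data frame, while the RHS (b=="true") is an expression that evaluates to a logical (boolean) value.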
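Putting the pieces together, a minimal end-to-end sketch might look like the following. The object names `fit_logical`, `FactorData` and `fit_factor` are assumptions for illustration. The second fit, which compares the factor against its label inside the formula, is an alternative that is not taken from the answer itself.

```r
a <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
b <- factor(c("true", "false", "true", "false", "true",
              "false", "true", "false", "true", "false"))
c <- c(10, 8, 20, 15, 30, 21, 40, 25, 50, 31)

# Store b as a logical column while building the data frame.
DumbData <- data.frame(a, b = b == "true", c)

# Linear in a everywhere; the quadratic term is active only where b is FALSE.
fit_logical <- lm(c ~ a + I((!b) * a^2), data = DumbData)
print(coef(fit_logical))

# Alternative (assumption): keep b as a factor and compare inside the formula.
FactorData <- data.frame(a, b, c)
fit_factor <- lm(c ~ a + I((b == "false") * a^2), data = FactorData)

# Both formulations build the same regressor, so the coefficients agree.
print(all.equal(unname(coef(fit_logical)), unname(coef(fit_factor))))
```

Passing the data frame through the data argument keeps the fit independent of whatever a, b and c happen to be in the workspace.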

answered 2013-09-19T22:59:07.090

Try something along the following lines:

dat <- data.frame(
  a=c(1,2,3,4,5,6,7,8,9,10),
  b=c(TRUE,FALSE,TRUE,FALSE,TRUE,FALSE,TRUE,FALSE,TRUE,FALSE),
  c=c(10,8,20,15,30,21,40,25,50,31)
)

fit <- lm(c ~ a + I(!b * a^2), dat)
summary(fit)

This results in:

Call:
lm(formula = c ~ a + I(!b * a^2), data = dat)

Residuals:
   Min     1Q Median     3Q    Max 
 -4.60  -2.65   0.50   2.65   4.40 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)      10.5000     2.6950   3.896 0.005928 ** 
a                 3.9000     0.4209   9.266 3.53e-05 ***
I(!b * a^2)TRUE -13.9000     2.4178  -5.749 0.000699 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 3.764 on 7 degrees of freedom
Multiple R-squared: 0.9367, Adjusted R-squared: 0.9186 
F-statistic: 51.75 on 2 and 7 DF,  p-value: 6.398e-05 

Notes:

  • I made use of the logical values TRUE and FALSE.
  • These are coerced to 1 and 0, respectively.
  • I used the negation !b in the formula.
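One thing to watch with this formula: in R, the unary ! has lower precedence than the arithmetic operators, so !b * a^2 is parsed as !(b * a^2) rather than (!b) * a^2. The coefficient name I(!b * a^2)TRUE in the output above suggests the term was treated as a logical indicator, not as a quadratic term. A minimal sketch of the difference (the names `no_parens` and `with_parens` are illustrative assumptions):

```r
a <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
b <- rep(c(TRUE, FALSE), times = 5)

# Parsed as !(b * a^2): b * a^2 is numeric, and ! turns it into a logical
# vector that is TRUE wherever the product is zero.
no_parens <- !b * a^2

# Explicit parentheses negate b first, then multiply by a^2.
with_parens <- (!b) * a^2

print(class(no_parens))          # "logical"
print(class(with_parens))        # "numeric"

# Because a^2 is never zero here, the unparenthesised form is just !b.
print(identical(no_parens, !b))  # TRUE
```

With the parentheses the term is numeric, which matches the I((!b) * a^2) form and gives the intended "quadratic only when b is FALSE" model. Without them, lm fits a shift in the intercept for the rows where b is FALSE.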
answered 2013-04-26T21:34:12.933