
Using this dataset:

http://pastebin.com/4wiFrsNg

and building on this question:

How to fit predefined offsets to models containing categorical variables in R

in order to test the validity of a model on another test dataset, I want to take the fitted model from:

ModelA<-lm(Response1~Categorical)

and fit it to relationship B:

Response2~Categorical

The response variables are identical in each case.

The above link provides a solution to how to fit an offset for the levels of a categorical variable, which for my data would involve:

# compute the offsets for each level of Categorical from the following model:

m<-lm(Response1~Categorical,data=dat)
summary(m)

#Create vector of offsets for variable

o <- with(dat, ifelse(Categorical == "Y", 0.25773, -0.25773))

#run second model with offsets from first model

m1<-lm(dat$Response2 ~ 1 + offset(o))

However, when I check whether this works by applying these known offsets to the same relationship and comparing the result with the identical model fitted without offsets, thus:

# run model using Response1 to get values for slope offsets


m<-lm(Response1 ~ Categorical,data=dat)
summary(m)


# Specify offsets from this in the model of the same data (i.e. still using Response1)

o <- with(dat, ifelse(Categorical == "Y", 0.25773, -0.25773))
m1<-lm(dat$Response1 ~ 1 + offset(o))


# check the residuals from m and m1 are identical

m$residuals
m1$residuals

The residuals are different, showing that the method does not work.

I am thus wondering:

1) Does anyone have any other ideas on how to specify offsets for the levels of a categorical variable?

2) Can you advise on how to specify an offset for the intercept term for such a variable, in addition to offsets for the levels?

The latter is simple enough for a continuous variable as there is only one intercept:

 # run model using Response1 to get values for intercept and slope offsets 

m<-lm(Response1~log(Continuous),data=dat)
summary(m)
# Specify offsets for the intercept and slope for the model involving the second response variable
m <- lm(Response2 ~ 0 + offset(rep(0.22483, nrow(dat))) + offset(-0.07115 * log(Continuous)), data = dat)

But it is not clear to me how this would transfer to a categorical variable.

Many thanks.


1 Answer


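R estimates treatment contrasts. You apparently come from a world where you were taught to expect c(1,-1) contrasts but did not learn to look at the coding.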

m<-lm(Response1 ~ Categorical,data=dat)
summary(m)

o <- with(dat, ifelse(Categorical == "Y", 0.25773, 0))
m1<-lm(dat$Response1 ~ 1 + offset(o))

abs( m$residuals - m1$residuals) < 0.00001
   1    2    3    4    5    6    7    8    9   10   11   12   13   14 
TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE 
  15   16   17   18   19   20   21   22   23   24   25   26   27   28 
TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE 
  29   30   31   32   33   34   35   36   37   38   39   40   41   42 
TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE 
  43   44 
TRUE TRUE 
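The hard-coded 0.25773 can also be read straight from the fitted model, which avoids rounding mismatches and extends to fixing the intercept as well. A minimal sketch, assuming a simulated stand-in for `dat` (the pastebin data is not reproduced here) and assuming the coefficient is named `CategoricalY`, as it would be with levels N and Y under the default treatment contrasts; `lp` and `m_fixed` are names made up for the illustration:

```r
# Simulated stand-in for the question's data; all values are made up.
set.seed(1)
dat <- data.frame(
  Categorical = factor(rep(c("N", "Y"), each = 22)),
  Response1   = rnorm(44),
  Response2   = rnorm(44)
)

m <- lm(Response1 ~ Categorical, data = dat)

# Under treatment contrasts the coefficient is the Y-minus-N difference,
# so the baseline level (N) gets an offset of 0.
b <- coef(m)[["CategoricalY"]]
o <- ifelse(dat$Categorical == "Y", b, 0)

# Level effect fixed, intercept still estimated.
m1 <- lm(Response1 ~ 1 + offset(o), data = dat)
all.equal(unname(residuals(m)), unname(residuals(m1)))

# Intercept fixed too: the whole linear predictor of m becomes the offset.
lp <- predict(m, newdata = dat)
m_fixed <- lm(Response2 ~ 0 + offset(lp), data = dat)
all.equal(unname(residuals(m_fixed)), unname(dat$Response2 - lp))
```

With nothing left to estimate, the residuals of `m_fixed` are simply `Response2` minus the predictions of `m`, which is probably what is wanted when validating a fitted model against a second response.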

Contrast that with your method:

 o <- with(dat, ifelse(Categorical == "Y", 0.25773, -0.25773))

That gives all FALSE. Look at:

 ?model.matrix
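
To see the coding that the help page describes, one can print the factor's contrasts and the design matrix. A small sketch with a made-up two-level factor (the names `f`, `y`, `fit_sum` and `fit_off` are hypothetical), which also shows the sum-to-zero coding under which a plus/minus offset would be the right one:

```r
# Made-up two-level factor and response, for illustration only.
set.seed(2)
f <- factor(rep(c("N", "Y"), times = c(5, 5)))
y <- rnorm(10) + (f == "Y")

contrasts(f)             # default treatment coding: N = 0, Y = 1
head(model.matrix(~ f))  # columns: (Intercept), fY

# Sum-to-zero coding: N = +1, Y = -1, so the level effects are +b and -b.
fit_sum <- lm(y ~ f, contrasts = list(f = "contr.sum"))
b <- coef(fit_sum)[["f1"]]
o_sum <- ifelse(f == "N", b, -b)

# With that coding, symmetric offsets do reproduce the residuals.
fit_off <- lm(y ~ 1 + offset(o_sum))
all.equal(unname(residuals(fit_sum)), unname(residuals(fit_off)))
```

The offsets have to match whatever coding produced the coefficients: 0 and b under treatment contrasts, +b and -b under `contr.sum`.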
answered 2013-07-02T16:34:13.323