I know that R automatically creates dummy variables from categorical values, but it also automatically chooses the reference value (I think alphabetically?). How do I specify a different value to be the reference without changing the names of the values? I realize I could probably relabel the factors a,b,c... in the order I prefer, but that seems kind of kludgey to me.

Just to be clear, I'll make up an example. Let's say the factor is color and the values are red, blue, green, and yellow.

mod.lm <- lm(preference ~ color, data = flowers)

The intercept in this case would be for the case color = blue, but I want to make it yellow. How would I go about doing that?

1 Answer

Use relevel(), which moves the level you name to the front of the factor so that it becomes the reference:

# In this case, the reference category is setosa
model <- lm(Sepal.Length ~ Species, data=iris)
summary(model)

# Now make virginica the reference category
iris$Species <- relevel(iris$Species, ref='virginica')
model <- lm(Sepal.Length ~ Species, data=iris)
summary(model)
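If you would rather not overwrite the column, one option is to call relevel() inside the formula itself. This is only a sketch of that variant, not something the original example does; the name `model_inline` is made up for illustration.

```r
# Record the level order so we can confirm the data frame is left alone
levels_before <- levels(iris$Species)

# relevel() is applied only while the model is being fitted
model_inline <- lm(Sepal.Length ~ relevel(Species, ref = "virginica"), data = iris)

# The intercept is now the mean Sepal.Length of the virginica group
coef(model_inline)[1]

# The level order stored in iris is unchanged
identical(levels(iris$Species), levels_before)
```

The trade-off is that the coefficient names carry the whole relevel(...) call, so releveling the column up front usually gives a more readable summary.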

In your case, that would be:

flowers$color <- relevel(flowers$color, ref='yellow')
lm(preference ~ color, data = flowers)

This model is estimated with 'yellow' as the reference category, so the intercept corresponds to color = yellow.
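To check that the reference really changed, you can look at the level order and the coefficient names. Below is a minimal sketch with a made-up `flowers` data frame (the values are invented, since the real data was not posted). Note that relevel() only works on unordered factors, so the sketch assumes `color` is a factor; a character column would need factor() first.

```r
# Hypothetical stand-in for the asker's data; the values are made up
flowers <- data.frame(
  color = factor(c("red", "blue", "green", "yellow",
                   "red", "blue", "green", "yellow")),
  preference = c(3, 5, 2, 4, 4, 6, 1, 5)
)

# Levels are sorted alphabetically by default; the first one is the reference
levels(flowers$color)

# Move "yellow" to the front without renaming any level
flowers$color <- relevel(flowers$color, ref = "yellow")
levels(flowers$color)

mod.lm <- lm(preference ~ color, data = flowers)

# There is no "coloryellow" coefficient: yellow is represented by the intercept
names(coef(mod.lm))
```

The other colors then show up as differences from yellow.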

Answered 2012-07-30T18:33:04.370