r - Effects coding of categorical variables with interactions in lavaan?

Question

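I am interested in translating lm syntax to lavaan, specifically for an interaction between an effects-coded factor and a numeric variable when the factor has more than 2 levels. (Reminder: effects coding is an alternative to dummy coding for categorical variables; it uses the codes -1, 1, and 0.)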
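
For readers who have not seen the two schemes side by side, here is a minimal sketch in base R that prints the coding matrices for a three-level factor. The variable names `effects_codes` and `dummy_codes` are only illustrative.

```r
# Coding matrices for a three-level factor (base R only).
effects_codes <- contr.sum(3)        # effects (sum-to-zero) coding
dummy_codes   <- contr.treatment(3)  # dummy (treatment) coding, R's default

print(effects_codes)  # last level is coded -1 in every column
print(dummy_codes)    # first level is the reference, coded 0 in every column

# Each effects-coded column sums to zero across the levels.
print(colSums(effects_codes))
```

With effects coding the last level is not a reference category; its effect is implied as minus the sum of the other effects.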

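Below is a minimal (and substantively meaningless) example. It shows the lm (linear regression) syntax, followed by the corresponding lavaan syntax for the regression part. The translation works for a regression without an interaction, but not for one with an interaction.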

First, consider the regression without an interaction, using an effects-coded factor.

This works:

library(lavaan)
# Use iris data as minimal example
# 
# 1. Linear regression model
# Change contrasts to effects-coding
contrasts(iris$Species) <- contr.sum(3)
# Linear regression
lmmodel <- Sepal.Length ~ Species # the regression model
lmfit <- lm(lmmodel, iris) # fit it

# 2. SEM
# first, re-code the factors
iris$s1 <- contrasts(iris$Species)[iris$Species, 1] # numeric, effects-coded
iris$s2 <- contrasts(iris$Species)[iris$Species, 2] # numeric, effects-coded
semmodel <- 'Sepal.Length ~ s1 + s2' # the SEM model
semfit <- sem(semmodel, iris) # fit it

# 3. Compare the coefficients lm vs. sem, should be equal (and are equal)
cbind(coef(lmfit)[-1], coef(semfit)[-length(coef(semfit))])
#                 [,1]        [,2]
# Species1 -0.83733333 -0.83733330
# Species2  0.09266667  0.09266664
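
As a side check on what these two coefficients mean, the sketch below recomputes them from the group means alone, using only base R and the built-in `iris` data. It assumes the usual reading of `contr.sum`: the intercept is the unweighted mean of the group means, and each coefficient is a level's deviation from it. The names `group_means`, `grand_mean`, and `deviations` are illustrative.

```r
# Group means of Sepal.Length per species (built-in iris data).
group_means <- tapply(iris$Sepal.Length, iris$Species, mean)
grand_mean  <- mean(group_means)  # unweighted mean of the three group means

# Under contr.sum, each coefficient is the deviation of a level's mean
# from the grand mean; the last level (virginica) gets no coefficient.
deviations <- group_means - grand_mean
print(round(deviations[c("setosa", "versicolor")], 7))
```

The two printed deviations should agree with the Species1 and Species2 rows in the comparison above.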

Here is how I tried to do it with the interaction. Where did I go wrong?

# 1. Linear regression w/ interaction
lmmodel <- Sepal.Length ~ Species + Species:Sepal.Width
lmfit <- lm(lmmodel, iris)

# 2. SEM
iris$s3 <- as.numeric(iris$Species=='virginica') # Code third species
iris$s1_w <- iris$s1 * iris$Sepal.Width # effects code 1 x Sepal.Width
iris$s2_w <- iris$s2 * iris$Sepal.Width # effects code 2 x Sepal.Width
iris$s3_w <- iris$s3 * iris$Sepal.Width # virginica indicator x Sepal.Width
semmodel <- 'Sepal.Length ~ s1 + s2 + s1_w + s2_w + s3_w'
semfit <- sem(semmodel, iris)

# 3. Compare the coefficients lm vs. sem
cbind(coef(lmfit)[-1], coef(semfit)[-length(coef(semfit))])
#                                     [,1]       [,2]
# Species1                      -0.7228562 -0.7228566
# Species2                       0.1778772  0.1778772
# Speciessetosa:Sepal.Width      0.6904897  0.6904899
# Speciesversicolor:Sepal.Width  0.8650777  0.8650779  <----- equal
# Speciesvirginica:Sepal.Width   0.9015345  2.4571023  <----- not equal

score 0 · Accepted Answer

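The problem is not with lavaan; you simply did not code the contrast for the virginica species correctly. Because the lm formula contains Species:Sepal.Width without a Sepal.Width main effect, lm estimates a separate slope for each species, and those interaction columns are built from 0/1 indicators, not from the effects codes.

In rows 101 to 150 (the virginica rows), the codes behind s1_w, s2_w, and s3_w should therefore be 0, 0, 1 rather than -1, -1, 1, that is: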

iris[101:150,"s2_w"] <- 0
iris[101:150,"s1_w"] <- 0
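
To see which columns lm actually uses, one option is to inspect the design matrix with `model.matrix()`. The sketch below uses base R only and works on a copy `d` of the data; the column names `setosa_w`, `versicolor_w`, and `virginica_w` are hypothetical and chosen for illustration. It also selects rows by species name rather than by position, which does not depend on the row order of the data.

```r
# Work on a copy of the built-in data so the original iris is untouched.
d <- iris
contrasts(d$Species) <- contr.sum(3)

# Design matrix that lm() builds for the interaction model.
X <- model.matrix(Sepal.Length ~ Species + Species:Sepal.Width, data = d)
print(colnames(X))

# Build the interaction columns from 0/1 indicators, selecting rows by
# species name rather than by row index.
for (lev in levels(d$Species)) {
  d[[paste0(lev, "_w")]] <- as.numeric(d$Species == lev) * d$Sepal.Width
}

# The hand-built columns match the ones in the design matrix.
print(all.equal(unname(X[, "Speciesvirginica:Sepal.Width"]), d$virginica_w))
```

The main-effect columns Species1 and Species2 are still effects-coded; only the interaction columns use indicators.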

Rerun the original code:

semmodel <- 'Sepal.Length ~ s1 + s2 + s1_w + s2_w + s3_w'
semfit <- sem(model = semmodel, data = iris, estimator="ml")

# 3. Compare the coefficients lm vs. sem
cbind(coef(lmfit)[-1], coef(semfit)[-length(coef(semfit))])

And check:

> # 3. Compare the coefficients lm vs. sem
> cbind(coef(lmfit)[-1], coef(semfit)[-length(coef(semfit))])
                                    [,1]       [,2]
Species1                      -0.7228562 -0.7228563
Species2                       0.1778772  0.1778772
Speciessetosa:Sepal.Width      0.6904897  0.6904898
Speciesversicolor:Sepal.Width  0.8650777  0.8650778
Speciesvirginica:Sepal.Width   0.9015345  0.9015345
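
It may also help to see where the 2.4571 in the question came from. With the effects codes, s1_w and s2_w both equal minus Sepal.Width in the virginica rows, so the virginica slope in that model is b3 - b1 - b2, where b1, b2, b3 are the coefficients of s1_w, s2_w, s3_w. The uncorrected model is therefore a reparameterization in which the s3_w coefficient is the sum of the three species slopes. The sketch below checks this with `lm()` instead of `sem()`, so that it runs without lavaan; this swap assumes the point estimates agree, as they do in the comparisons above. The names `d`, `codes`, `slopes`, and `miscoded` are illustrative.

```r
d <- iris
contrasts(d$Species) <- contr.sum(3)
codes <- contrasts(d$Species)[d$Species, ]  # 150 x 2 matrix of effects codes

# Interaction columns as built in the question (before the correction).
d$s1_w <- codes[, 1] * d$Sepal.Width
d$s2_w <- codes[, 2] * d$Sepal.Width
d$s3_w <- as.numeric(d$Species == "virginica") * d$Sepal.Width

# Per-species slopes from the lm formula, and the fit on the miscoded columns.
slopes   <- coef(lm(Sepal.Length ~ Species + Species:Sepal.Width, data = d))[4:6]
miscoded <- coef(lm(Sepal.Length ~ Species + s1_w + s2_w + s3_w, data = d))

print(sum(slopes))         # sum of the three species slopes
print(miscoded[["s3_w"]])  # coefficient of s3_w in the miscoded model
```

Both printed values should be about 2.4571, the value reported for the last row in the question.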
