
In a related question I asked why there are differences between `stats::varimax` and `GPArotation::Varimax`, both of which may be called by `psych::principal` depending on the option set for `rotate =`.

The differences between the two (see the other question) are small, but they seem to get compounded somehow by `psych::principal` (I have a simple theory why, which I would like to have confirmed).

library(GPArotation)
library(psych)
data("Thurstone")

principal.unrotated <- principal(r = Thurstone, nfactors = 4, rotate = "none")  # find unrotated PCs first
loa <- unclass(principal.unrotated$loadings)

# just to compare that the rot.mat is correct
varimax.stats <- stats::varimax(x = loa, normalize = TRUE)
varimax.GPA <- GPArotation::Varimax(L = loa, normalize = TRUE)
# notice we're here NOT interested in the difference between stats and GPA, that's the other question

diff.from.rot.meth <- unclass(varimax.stats$loadings - varimax.GPA$loadings)  # very small differences, see this question: https://stackoverflow.com/questions/32350891/why-are-there-differences-between-gparotationvarimax-and-statsvarimax
mean(abs(diff.from.rot.meth))
#> [1] 8.036863e-05

principal.varimax.stats <- principal(r = Thurstone, nfactors = 4, rotate = "varimax")
principal.Varimax.GPA <- principal(r = Thurstone, nfactors = 4, rotate = "Varimax")

diff.from.princ <- principal.Varimax.GPA$rot.mat - principal.varimax.stats$rot.mat  # quite a substantial change, because Theta is NOT a rotmat, that makes sense
mean(abs(diff.from.princ))
#> [1] 0.021233

mean(abs(diff.from.rot.meth)) - mean(abs(diff.from.princ))  # principal has MUCH bigger differences
#> [1] -0.02115263

This seems too big to be a floating-point artifact or something of that kind.

My hypothesis is that the (additional) differences stem from the fact that `GPArotation::Varimax` defaults to `normalize == FALSE` (no Kaiser normalization), while `stats::varimax` defaults to `normalize == TRUE` (Kaiser normalization), and that this cannot be set differently in `psych::principal`.

The `stats::varimax` manual:

## varimax with normalize = TRUE is the default

The `GPArotation::Varimax`/`GPArotation::GPForth` manual:

The argument normalize gives an indication of if and how any normalization should be done before rotation, and then undone after rotation. If normalize is FALSE (the default) no normalization is done. If normalize is TRUE then Kaiser normalization is done. (So squared row entries of normalized A sum to 1.0. This is sometimes called Horst normalization.)

Additionally, the `psych::kaiser` manual warns:

The GPArotation package does not (by default) normalize, nor does the fa function. Then, to make it more confusing, varimax in stats does, and Varimax in GPArotation does not.
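To make the quoted manual text concrete, Kaiser normalization can be reproduced by hand: scale each row of the loading matrix to unit length, rotate, then undo the scaling. A minimal sketch (the names `h`, `by.hand`, and `builtin` are my own; this assumes `normalize = TRUE` in `GPArotation::Varimax` does exactly this row scaling):

```r
library(GPArotation)
library(psych)
data("Thurstone")

loa <- unclass(principal(r = Thurstone, nfactors = 4, rotate = "none")$loadings)

# Kaiser normalization by hand: divide each row of the loadings by its
# length, rotate the normalized matrix, then rescale the rotated result
h <- sqrt(rowSums(loa^2))
by.hand <- unclass(GPArotation::Varimax(L = loa / h, normalize = FALSE)$loadings) * h
builtin <- unclass(GPArotation::Varimax(L = loa, normalize = TRUE)$loadings)
mean(abs(by.hand - builtin))  # should be essentially zero if the assumption holds
```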

Can anyone confirm that the differences are, in fact, explained by the normalization option?


2 Answers


The problem of the different loadings occurs because of different accuracies in the procedures (and of course because the `normalize` option is not evaluated in `psych::principal`, so it must be switched to TRUE in all the other procedures for comparison). While the accuracy of `stats::varimax` and `GPArotation::Varimax` can be configured (parameter `eps`), that parameter is ignored in `psych::principal`, which seems to behave implicitly like `stats::varimax` with `eps = 1e-5`.

If we increase the accuracy of `stats::varimax` and `GPArotation::Varimax` to `eps = 1e-15`, we get results that agree (to at least 8 digits) with those of a parallel implementation in my MatMate program, which I have also verified to agree very accurately with SPSS computations.

The missing handling of an explicit `eps` option in `psych::principal` looks like a bug, and its poor implicit default is certainly unsatisfactory.

Interestingly, `GPArotation::Varimax` needs very many rotations at `eps = 1e-15` (see the warning in the output at the end); so either another internal procedure is executed, or the `eps` argument is evaluated differently when deciding when to stop iterating. An example at the end of this answer may hint at this effect.
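The growing number of iterations can be inspected directly. This is a sketch under the assumption that the `$Table` component of a `GPForth`/`Varimax` result logs one row per iteration, so its row count serves as an iteration count:

```r
library(GPArotation)
library(psych)
data("Thurstone")

loa <- unclass(principal(r = Thurstone, nfactors = 4, rotate = "none")$loadings)

# compare how many iterations the rotation takes as eps is tightened;
# each row of $Table corresponds to one iteration of the GPForth loop
nrow(GPArotation::Varimax(L = loa, normalize = TRUE, eps = 1e-5)$Table)
nrow(GPArotation::Varimax(L = loa, normalize = TRUE, eps = 1e-10)$Table)
```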

Below is the protocol of the comparisons; only the first two rows of the loadings are shown.

For the first three computations, with low accuracy (eps=1e-5),
all procedures give equal or comparable results,
except GPArotation, which is already near the exact answer
--------------------------------
>  principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings    *1000000
>  stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings            *1000000
>  GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings      *1000000

For the second three computations, with (attempted) high accuracy (eps=1e-15),
all procedures except psych::principal give equal results, and they
also agree with an external crosscheck using MatMate
--------------------------------
>  principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings*1000000
>  stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings*1000000
>  GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings*1000000

The attempts with little/default accuracy (large eps, or none given):

# ===== Numerical documentation (only first two rows are displayed)================
> principal(r = Thurstone, nfactors = 4, rotate = "varimax")$loadings*1000000
                RC1       RC2       RC3       RC4      
Sentences       871655.72 216638.46 198427.07 175202.57
Vocabulary      855609.28 294166.99 153181.45 180525.99


> stats::varimax(x = loa, normalize = TRUE,eps=1e-5)$loadings         *1000000
                PC1       PC2       PC3       PC4      
Sentences       871655.72 216638.46 198427.07 175202.57
Vocabulary      855609.28 294166.99 153181.45 180525.99

> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-5)$loadings   *1000000
                     PC1      PC2      PC3       PC4
Sentences       871717.3 216618.7 198176.3 175204.47
Vocabulary      855663.1 294146.3 152930.7 180517.21
# =============================================================================

Now the attempts with smaller eps, for more accurate results:

> principal(r = Thurstone, nfactors = 4, rotate = "varimax",eps=1e-15)$loadings  *1000000
                RC1       RC2       RC3       RC4      
Sentences       871655.72 216638.46 198427.07 175202.57
Vocabulary      855609.28 294166.99 153181.45 180525.99

> stats::varimax(x = loa, normalize = TRUE,eps=1e-15)$loadings                   *1000000
                PC1       PC2       PC3       PC4      
Sentences       871716.83 216619.69 198174.31 175207.86
Vocabulary      855662.58 294147.47 152928.77 180519.37

> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-10)$loadings             *1000000
                     PC1      PC2       PC3       PC4
Sentences       871716.8 216619.7 198174.31 175207.86
Vocabulary      855662.6 294147.5 152928.77 180519.37

Warnmeldung:
In GPForth(L, Tmat = Tmat, method = "varimax", normalize = normalize,  :
  convergence not obtained in GPForth. 1000 iterations used.

# Result by MatMate: --------------------------------------------------------
 lad = cholesky(Thurstone) 
 pc = rot(lad,"pca")
 pc4 = pc[*,1..4]                           // arrive at the first four pc's
     t = gettrans( normzl(pc4),"varimax")   // get rotation-matrix for row-normalized pc's
 vmx = pc4 * t                              // rotate pc4 by rotation-matrix 
 display = vmx     * 1000000
                     PC1      PC2       PC3       PC4
Sentences       871716.83    216619.68   198174.31   175207.87
Vocabulary      855662.58    294147.46   152928.77   180519.37


# ===============================================================================> 

A better match between the results of `stats::varimax` and `GPArotation::Varimax` can be achieved by setting `eps` to 1e-12 in stats:: and to 1e-6 in GPArotation, so that one value is the square of the other. Then we get:

> GPArotation::Varimax(L = loa, normalize = TRUE,eps=1e-6)$loadings*1000000
                     PC1      PC2       PC3       PC4
Sentences       871716.8 216619.8 198174.49 175207.63
Vocabulary      855662.5 294147.6 152928.94 180519.21

> stats::varimax(x = loa, normalize = TRUE,eps=1e-12)$loadings*1000000
                PC1       PC2       PC3       PC4      
Sentences       871716.80 216619.74 198174.40 175207.85
Vocabulary      855662.55 294147.52 152928.86 180519.36
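The squared-eps relationship can also be checked in one number by comparing the full loading matrices (a sketch; the comparison via the mean absolute difference mirrors the question's own measure):

```r
library(GPArotation)
library(psych)
data("Thurstone")

loa <- unclass(principal(r = Thurstone, nfactors = 4, rotate = "none")$loadings)

# with eps_stats = eps_GPA^2 the two rotations should agree closely
v.stats <- stats::varimax(x = loa, normalize = TRUE, eps = 1e-12)
v.gpa <- GPArotation::Varimax(L = loa, normalize = TRUE, eps = 1e-6)
mean(abs(unclass(v.stats$loadings) - unclass(v.gpa$loadings)))
```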
Answered 2016-05-19T11:46:50.697

Applying `psych::kaiser` (which, I believe, is built for exactly this purpose) seems to confirm this: it reduces the differences back to the original difference between `stats::varimax` and `GPArotation::Varimax`:

principal.Varimax.GPA.kaiser <- kaiser(f = principal.unrotated, rotate = "Varimax")
diff.statsvari.gpavar.bothkaiser <- unclass(principal.Varimax.GPA.kaiser$loadings - principal.varimax.stats$loadings)
mean(abs(diff.statsvari.gpavar.bothkaiser))
#> [1] 8.036863e-05

That is pretty much the same result, so I would consider the hypothesis confirmed: the bigger differences via `psych::principal` stem from the `normalize` option.


Update

For the respective rotation matrices (`Th`, or whatever they are), the differences are also (again) much smaller:

principal.Varimax.GPA.kaiser$Th - principal.varimax.stats$rot.mat  # those differences are very small now, too
#>               [,1]         [,2]          [,3]          [,4]
#> [1,]  1.380279e-04 1.380042e-05 -2.214319e-04 -2.279170e-06
#> [2,]  9.631517e-05 2.391296e-05  1.531723e-04 -3.371868e-05
#> [3,]  1.758299e-04 7.917460e-05  6.788867e-05  1.099072e-04
#> [4,] -9.548010e-05 6.500162e-05 -1.679753e-05 -5.213475e-05
Answered 2015-09-02T11:27:28.420