1

假设我有两个 PCA 加载矩阵loa.origloa.rot,并且我知道这loa.rot是 的旋转(手动或其他方式)loa.orig

loa.orig也可能已经varimax 或其他东西正交旋转,但我认为这并不重要)。

我知道想知道旋转到达的角度loa.origloa.rot

我从对另一个问题的评论中了解到“旋转不通勤”,因此,成对(平面)旋转的顺序也很重要

因此,要从中复制loa.rotloa.orig我需要知道一系列必要的旋转,最好按照下面给出的顺序rots

这是一个MWE:

library(psych) # this allows manual rotation

# suppose I have some ORIGINAL loadings matrix, from a principal components analysis, with three retained components
loa.orig <- cbind(c(0.6101496, 0.7114088, 0.3356003, 0.7318809, 0.5980133, 0.4102817, 0.7059148, 0.6080662, 0.5089014, 0.587025, 0.6166816, 0.6728603, 0.7482675, 0.5409658, 0.6415472, 0.3655053, 0.6313868), c(-0.205317, 0.3273207, 0.7551585, -0.1981179, -0.423377, -0.07281187, -0.04180098, 0.5003459, -0.504371, 0.1942334, -0.3285095, 0.5221494, 0.1850734, -0.2993066, -0.08715662, -0.02191772, -0.2002428), c(-0.4692407, 0.1581682, -0.04574932, -0.1189175, 0.2449018, -0.5283772, 0.02826476, 0.1703277, 0.2305158, 0.2135566, -0.2783354, -0.05187637, -0.104919, 0.5054129, -0.2403471, 0.5380329, -0.07999642))

# I then rotate 1-2 by 90°, and 1-3 by 45°
loa.rot <- factor.rotate(f = loa.orig, angle = 90, col1 = 1, col2 = 2)
loa.rot <- factor.rotate(f = loa.rot, angle = 45, col1 = 1, col2 = 3)

# predictably, loa.rot and loa.orig are now different
any(loa.rot == loa.orig) # are any of them the same?

显然,在这种情况下,我知道角度和顺序,但假设我不知道。另外,让我们假设在实际用例中可能有许多组件被保留和旋转,而不仅仅是三个。

我有点不确定报告组件对(平面)旋转角度顺序的传统方法是什么,但我想可能的组合列表(~~不是排列~~)应该做。

combs <- combn(x = ncol(loa.orig), m = 2, simplify = TRUE)  # find all possible combinations of factors
rots <- data.frame(t(combs), stringsAsFactors = FALSE)  # transpose
rots  # these rows give the *order* in which the rotations should be done

rots给出这些排列。

loa.rot很高兴知道如何从loa.orig中的行给出的旋转组件对到达rots


更新:尝试基于以下答案

根据以下答案,我尝试将一个函数组合在一起并使用varimax旋转和真实数据集对其进行测试。(没有特别的理由varimax——我只是想要一些我们实际上不知道角度的旋转。)。

然后我测试我是否可以使用提取的角度从香草加载中重新创建 varimax 旋转。

library(psych)
data("Harman74.cor")  # notice the correlation matrix is called "cov", but doc says its a cor matrix
vanilla <- principal(r = Harman74.cor$cov, nfactors = 4, rotate = "none", )$loadings  # this is unrotated
class(vanilla) <- NULL  # print methods only causes confusion
varimax <- principal(r = Harman74.cor$cov, nfactors = 4, rotate = "varimax")$loadings  # this is rotated
class(varimax) <- NULL  # print methods only causes confusion

find.rot.instr <- function(original, rotated) {
  # original <- vanilla$loadings  # testing
  # rotated <- varimax$loadings  # testing
  getAngle <- function(A, B) acos(sum(A*B) / (norm(A, "F") * norm(B, "F"))) * 180/pi

  rots <- combn(x = ncol(original), m = 2, simplify = FALSE)  # find all possible combinations of factor pairs

  tmp <- original
  angles <- sapply(rots, function(cols) {
    angle <- getAngle(tmp[, cols], rotated[, cols])
    tmp <<- factor.rotate(tmp, angle = angle, col1 = cols[1], col2 = cols[2])
    return(angle)
  })
  return(angles)
}

vanilla.to.varimax.instr <- find.rot.instr(original = vanilla, rotated = varimax)  # these are the angles we would need to transform in this order

rots <- combn(x = ncol(vanilla), m = 2, simplify = FALSE)  # find all possible combinations of factor pairs
# this is again, because above is in function

# now let's implement the extracted "recipe"
varimax.recreated <- vanilla # start with original loadings
varimax.recreated == vanilla  # confirm that it IS the same
for (i in 1:length(rots)) {  # loop over all combinations, starting from the top
  varimax.recreated <- factor.rotate(f = varimax.recreated, angle = vanilla.to.varimax.instr[i], col1 = rots[[i]][1], col2 = rots[[i]][2])
}
varimax == varimax.recreated  # test whether they are the same
varimax - varimax.recreated  # are the close?

不幸的是,它们不一样,甚至不相似:(

> varimax == varimax.recreated  # test whether they are the same
PC1   PC3   PC2   PC4
VisualPerception       FALSE FALSE FALSE FALSE
Cubes                  FALSE FALSE FALSE FALSE
PaperFormBoard         FALSE FALSE FALSE FALSE
Flags                  FALSE FALSE FALSE FALSE
GeneralInformation     FALSE FALSE FALSE FALSE
PargraphComprehension  FALSE FALSE FALSE FALSE
SentenceCompletion     FALSE FALSE FALSE FALSE
WordClassification     FALSE FALSE FALSE FALSE
WordMeaning            FALSE FALSE FALSE FALSE
Addition               FALSE FALSE FALSE FALSE
Code                   FALSE FALSE FALSE FALSE
CountingDots           FALSE FALSE FALSE FALSE
StraightCurvedCapitals FALSE FALSE FALSE FALSE
WordRecognition        FALSE FALSE FALSE FALSE
NumberRecognition      FALSE FALSE FALSE FALSE
FigureRecognition      FALSE FALSE FALSE FALSE
ObjectNumber           FALSE FALSE FALSE FALSE
NumberFigure           FALSE FALSE FALSE FALSE
FigureWord             FALSE FALSE FALSE FALSE
Deduction              FALSE FALSE FALSE FALSE
NumericalPuzzles       FALSE FALSE FALSE FALSE
ProblemReasoning       FALSE FALSE FALSE FALSE
SeriesCompletion       FALSE FALSE FALSE FALSE
ArithmeticProblems     FALSE FALSE FALSE FALSE
> varimax - varimax.recreated  # are the close?
PC1         PC3          PC2       PC4
VisualPerception        0.2975463  1.06789735  0.467850675 0.7740766
Cubes                   0.2317711  0.91086618  0.361004861 0.4366521
PaperFormBoard          0.1840995  0.98694002  0.369663215 0.5496151
Flags                   0.4158185  0.82820078  0.439876777 0.5312143
GeneralInformation      0.8807097 -0.33385999  0.428455899 0.7537385
PargraphComprehension   0.7604679 -0.30162120  0.389727192 0.8329341
SentenceCompletion      0.9682664 -0.39302764  0.445263121 0.6673116
WordClassification      0.7714312  0.03747430  0.460461099 0.7643221
WordMeaning             0.8010876 -0.35125832  0.396077591 0.8201986
Addition                0.4236932 -0.32573100  0.204307400 0.6380764
Code                    0.1654224 -0.01757153  0.194533996 0.9777764
CountingDots            0.3585004  0.28032822  0.301148474 0.5929926
StraightCurvedCapitals  0.5313385  0.55251701  0.452293566 0.6859854
WordRecognition        -0.3157408 -0.13019630 -0.034647588 1.1235253
NumberRecognition      -0.4221889  0.10729098 -0.035324356 1.0963785
FigureRecognition      -0.3213392  0.76012989  0.158748259 1.1327322
ObjectNumber           -0.3234966 -0.02363732 -0.007830001 1.1804147
NumberFigure           -0.2033601  0.59238705  0.170467459 1.0831672
FigureWord             -0.0788080  0.35303097  0.154132395 0.9097971
Deduction               0.3423495  0.41210812  0.363022937 0.9181519
NumericalPuzzles        0.3573858  0.57718626  0.393958036 0.8206304
ProblemReasoning        0.3430690  0.39082641  0.358095577 0.9133117
SeriesCompletion        0.4933886  0.56821932  0.465602192 0.9062039
ArithmeticProblems      0.4835965 -0.03474482  0.332889805 0.9364874

很明显,我犯了一个错误。

4

2 回答 2

2

编辑:任意维数的方法

我现在有一种方法可以找到旋转矩阵的任意维数的欧拉角的类比(尽管随着维数的增加它的计算量越来越大)。此方法适用于样本数据集和 2 到 6 个因子的 varimax。我没有测试超过 6 个。对于 5 个和 6 个因素,似乎为第 5 列添加了反射 - 我不确定是什么决定了哪些列得到反映,因此目前将其硬编码到示例中。

该方法与以前一样通过使用 生成旋转矩阵来工作lm.fit。然后,它yacas使用符号矩阵乘法来计算复合旋转矩阵,以在适当的维数上进行任意旋转。然后迭代地求解角度。矩阵中总会有一个元素是基于sin一个角度的,然后这可以用来迭代地计算其他角度的值。

输出包括输入中实际不同的列子集、使用线性模型生成的复合旋转/反射矩阵、角度和列方面的各个旋转、计算的复合旋转矩阵、所需的反射/列交换矩阵对于某些示例,以及计算的旋转和输入之间的差异的平方和(对于所有示例,这是 的数量级1e-20)。

与我原来的解决方案不同,这只提供了一种可能的旋转顺序组合。实际可能性的数量(即使只是相当于Tait-Bryan 角度factorial(n * (n-1) / 2)的维度也是n维度,对于 6 个维度来说,大约是1.3e12..

library("psych")
library("magrittr")
library("stringr")
library("Ryacas")
rot_mat_nd <- function(dimensions, composite_var = NULL,
                       rot_order = combn(dimensions, 2, simplify = FALSE)) {
  d <- diag(dimensions)
  storage.mode(d) <- "character"
  mats <- lapply(seq(rot_order), function(i) {
    l <- paste0("a", i)
    cmb <- rot_order[[i]]
    d[cmb[1], cmb[1]] <- sprintf("cos(%s)", l)
    d[cmb[1], cmb[2]] <- sprintf("-sin(%s)", l)
    d[cmb[2], cmb[1]] <- sprintf("sin(%s)", l)
    d[cmb[2], cmb[2]] <- sprintf("cos(%s)", l)
    paste0("{{",
           paste(apply(d, 1, paste, collapse = ","), collapse = "},{"),
           "}}")
  })

  yac_statement <- paste0("Simplify(", paste(mats, collapse = "*"), ")")
  if (!is.null(composite_var)) {
    yac_statement <- paste0(composite_var, " := ", yac_statement)
  }
  output <- yacas(yac_statement)
  list(mats = mats, composite = output, rot_order = rot_order)
}


find_angles_nd <- function(input, rotated, reflect = NULL) {
  matched_cols <- sapply(1:ncol(input), function(i)
    isTRUE(all.equal(input[, i], rotated[, i])))
  dimensions <- sum(!matched_cols)
  theor_rot <- rot_mat_nd(dimensions, "r")
  rv <- yacas("rv := Concat @ r")
  swap_mat <- matrix(0, dimensions, dimensions)
  swap_mat[cbind(1:dimensions, match(colnames(input), colnames(rotated)))] <- 1
  if (!is.null(reflect)) {
    swap_mat[, reflect] <- -swap_mat[, reflect]
  }
  input_changed <- input[, !matched_cols]
  rotated_changed <- rotated[, !matched_cols]
  rotated_swapped <- rotated_changed %*% swap_mat
  rot_mat <- lm.fit(input_changed, rotated_swapped)$coef
  rot_mat_v <- c(t(rot_mat))

  known_angles <- numeric()
  angles_to_find <- nrow(rot_mat) * (nrow(rot_mat) - 1) / 2

  iterations <- 0L
  angles_found <- -1
  while(length(known_angles) < angles_to_find & angles_found != 0) {
    iterations <- iterations + 1L
    message(sprintf("Iteration %d; angles remaining at start %d",
                iterations, angles_to_find - length(known_angles)))
    yacas(sprintf("rvwv := WithValue({%s}, {%s}, rv)",
                  paste(names(known_angles), collapse = ", "),
                  paste(known_angles, collapse = ", ")
    ))
    var_num <- yacas("MapSingle(Length, MapSingle(VarList, rvwv))") %>%
      as.expression %>%
      eval %>%
      unlist
    angles_found <- 0L
    for (i in which(var_num == 1)) {
      var <- as.character(yacas(sprintf("VarList(rvwv[%d])[1]", i)))
      if (!(var %in% names(known_angles))) {
        to_solve <- as.character(yacas(sprintf("rvwv[%d]", i)))
        fun_var <- str_extract(to_solve, sprintf("(sin|cos)\\(%s\\)", var))
        fun_c <- substr(fun_var, 1, 3)
        if (fun_c == "sin") {
          to_solve_mod <- str_replace(to_solve, fixed(fun_var), "x")
          solved <- as.character(yacas(sprintf("Solve(%s == %0.15f, x)[1]", to_solve_mod, rot_mat_v[i])))
          answer <- asin(eval(parse(text = str_replace(solved, "x == ", ""))))
          known_angles <- c(known_angles, setNames(answer, var))
          angles_found <- angles_found + 1L
        }
      }
    }
    message(sprintf("- found %d", angles_found))
  }
  calc_rot_mat <-
    matrix(unlist(simplify2array(eval(
      as.expression(theor_rot$composite),
      as.list(known_angles)
    ))), dimensions, byrow = TRUE)

  ssd <- sum((input_changed %*% calc_rot_mat %*% swap_mat - rotated_changed) ^ 2)

  angles <- known_angles[paste0("a", 1:angles_to_find)] / pi * 180

  list(rot_mat = rot_mat, calc_rot_mat = calc_rot_mat, swap_mat = swap_mat, angles = angles,
       rot_order = theor_rot$rot_order, input_changed = input_changed,
       rotated_changed = rotated_changed, sum_square_diffs = ssd)
}

factor_rotate_multiple <- function(input_changed, angles, rot_order, swap_mat) {
  rotated <- input_changed
  for (i in 1:length(angles)) {
    rotated <- factor.rotate(rotated, angles[i], rot_order[[i]][1], rot_order[[i]][2])
  }
  rotated %*% swap_mat
}

2-6 维示例

data("Harman74.cor")  # notice the correlation matrix is called "cov", but doc says its a cor matrix

example_nd <- function(dimensions, reflect = NULL) {
  find_angles_nd(
    unclass(principal(r = Harman74.cor$cov, nfactors = dimensions, rotate = "none")$loadings),
    unclass(principal(r = Harman74.cor$cov, nfactors = dimensions, rotate = "varimax")$loadings),
    reflect = reflect
  )
}

angles_2d <- example_nd(2)

angles_2d[c("angles", "rot_order", "sum_square_diffs")]
# shows the resultant angle in degrees, rotation order and the
# sum of the squares of the differences between the provided and calculated
# rotations

#$angles
#       a1 
#-39.88672 
#
#$rot_order
#$rot_order[[1]]
#[1] 1 2
#
#
#$sum_square_diffs
#[1] 8.704914e-20

angles_3d <- example_nd(2)

angles_3d[c("angles", "rot_order", "sum_square_diffs")]
#$angles
#       a1        a2        a3 
#-45.19881 -29.77423 -17.07210 
#
#$rot_order
#$rot_order[[1]]
#[1] 1 2
#
#$rot_order[[2]]
#[1] 1 3
#
#$rot_order[[3]]
#[1] 2 3
#
#
#$sum_square_diffs
#[1] 7.498253e-20

angles_4d <- example_nd(2)
angles_5d <- example_nd(2, reflect = 5)
angles_6d <- example_nd(2, reflect = 5)

原来的

这个问题可以分成两个。第一部分是计算将输入与输出相关联的复合旋转矩阵。即在等式A %*% B == C中,BA和中计算出来C。对于方阵,这可以用 来完成solve。然而,在这种情况下,对于行 > 列,最简单的方法是使用线性模型和lm.fit函数。

问题的第二部分是识别为产生复合旋转矩阵而执行的旋转。有无数种可能的组合,但一般来说(对于 3 列)围绕三个轴进行一系列旋转是合理的,即使用欧拉角。即使这样,也有六种可能的轮换顺序。对于两列,这个问题是微不足道的,因为只需要一个旋转,一个asinacos足以识别角度。对于超过 3 列,可以推广欧拉角方法,但会变得更加复杂。

这是 R 中使用线性模型查找复合旋转矩阵和RSpincalc包查找欧拉角的完整方法。它假设旋转影响了 3 列,如给出的示例中所示。

library("RSpincalc")
library("combinat")
find_rotations <- function(input, rotated) {
  matched_cols <- sapply(1:ncol(input), function(i) isTRUE(all.equal(input[, i], rotated[, i])))
  if (sum(!matched_cols) != 3) {
    stop("This method only works for rotations affecting 3 columns.")
  }
  rot_mat <- lm.fit(input[, !matched_cols], rotated[, !matched_cols])$coef
  rot_poss <- as.data.frame(do.call("rbind", permn(c("z", "y", "x"))), stringsAsFactors = FALSE)
  rot_poss$axes_for_EA <- apply(rot_poss, 1, function(x) paste(rev(x), collapse = ""))
  combo_cols <- as.data.frame(matrix(which(!matched_cols)[combn(3, 2)], 2))
  rot_poss[5:10] <- do.call("rbind", permn(combo_cols, unlist))
  names(rot_poss)[c(1:3, 5:10)] <- c(paste0("axis", 1:3),
    paste0("rot", rep(1:3, each = 2), "_c", rep(1:2, 3)))
  rot_poss[paste0("angle", 1:3)] <- t(round(sapply(rot_poss$axes, DCM2EA, DCM = rot_mat), 14)) * 180 / pi
  rot_poss[paste0("angle", 1:3)] <-
    lapply(1:3, function(i) {
      ifelse(rot_poss[, paste0("axis", i)] == "y", 1, -1) *
        rot_poss[, paste0("angle", 4 - i)]
    })
  rot_poss
}

对于 OP 的数据:

rot_poss <- find_rotations(loa.orig, loa.rot)

另一个更全面的演示:

set.seed(123)
input <- matrix(rnorm(65), 13)

library("magrittr")

rotated <- input %>%
  factor.rotate(33.5, 1, 3) %>%
  factor.rotate(63, 2, 3) %>%
  factor.rotate(-3, 1, 2)

rot_poss <- find_rotations(input, rotated)

rot_poss

#  axis1 axis2 axis3 axes_for_EA rot1_c1 rot1_c2 rot2_c1 rot2_c2 rot3_c1 rot3_c2
#1     z     y     x         xyz       1       2       1       3       2       3
#2     z     x     y         yxz       1       2       2       3       1       3
#3     x     z     y         yzx       2       3       1       2       1       3
#4     x     y     z         zyx       2       3       1       3       1       2
#5     y     x     z         zxy       1       3       2       3       1       2
#6     y     z     x         xzy       1       3       1       2       2       3
#     angle1    angle2   angle3
#1 -1.585361 30.816825 63.84410
#2 44.624426 50.431683 53.53631
#3 59.538985 26.581047 16.27152
#4 66.980038 14.511490 27.52974
#5 33.500000 63.000000 -3.00000
#6 30.826477 -1.361477 63.03177

possible_calls <- do.call("sprintf", c(list(fmt = "input %%>%%
  factor.rotate(%1$g, %4$g, %5$g) %%>%%
  factor.rotate(%2$g, %6$g, %7$g) %%>%%
  factor.rotate(%3$g, %8$g, %9$g)"),
  rot_poss[c(paste0("angle", 1:3),
    grep("^rot", names(rot_poss), value = TRUE))]))
cat(possible_calls, sep = "\n\n")

#input %>%
  #factor.rotate(-1.58536, 1, 2) %>%
  #factor.rotate(30.8168, 1, 3) %>%
  #factor.rotate(63.8441, 2, 3)

#input %>%
  #factor.rotate(44.6244, 1, 2) %>%
  #factor.rotate(50.4317, 2, 3) %>%
  #factor.rotate(53.5363, 1, 3)

#input %>%
  #factor.rotate(59.539, 2, 3) %>%
  #factor.rotate(26.581, 1, 2) %>%
  #factor.rotate(16.2715, 1, 3)

#input %>%
  #factor.rotate(66.98, 2, 3) %>%
  #factor.rotate(14.5115, 1, 3) %>%
  #factor.rotate(27.5297, 1, 2)

#input %>%
  #factor.rotate(33.5, 1, 3) %>%
  #factor.rotate(63, 2, 3) %>%
  #factor.rotate(-3, 1, 2)

#input %>%
  #factor.rotate(30.8265, 1, 3) %>%
  #factor.rotate(-1.36148, 1, 2) %>%
  #factor.rotate(63.0318, 2, 3)

lapply(possible_calls, function(cl) all.equal(eval(parse(text = cl)), rotated, tolerance = 1e-6))
# tolerance reduced because above calls have rounding to 6 significant figures

请注意,最后一位将使用magrittr可以重现旋转的管道输出调用。另请注意,有 6 种可能的旋转顺序,每种顺序具有不同的角度。

为了使角度匹配,我必须在 y 旋转时翻转符号。我还不得不翻转轮换顺序。

对于varimax更新,我的方法适用,nfactor = 3但需要调整为 4 或更高。

于 2015-07-18T23:08:47.030 回答
1

我认为如果你做矩阵代数问题会更容易(我这样做是为了正交旋转,而不是倾斜变换,但逻辑是一样的。)

首先,请注意,任何旋转 t 都会产生一个旋转分量矩阵 c,使得 c = Ct,其中 C 是未旋转的 PCA 解。

然后,由于 C'C 是原始的相关矩阵 R,我们可以通过将两边预乘以 C',然后乘以 R 的逆来求解 t。这导致

t = C' R^-1 c

对于你的例子,让

R <- Harman74.cor$cov
P <- principal(Harman74.cor$cov,nfactors=4,rotate="none")
p <- principal(Harman74.cor$cov,nfactors=4)  #the default is to do varimax
rotation <- t(P$loadings) %*% solve(R) %*% p$loadings

然后,看看这是否正确

P$loadings %*% rotation - p$loadings

此外,即将发布的 psych 现在报告旋转矩阵,因此我们可以将旋转与 p$rot.mat 进行比较

round(p$rot.mat,2)
round(rotation,2)

这些在组件的顺序上有所不同,但这是因为 psych 在旋转后重新排序组件以反映旋转组件平方和的大小。

于 2015-08-29T00:03:23.200 回答