
I am using the dataEllipse function from the car package in R to get an elliptical confidence region for my data. For example:

library(car)  # provides dataEllipse
datapoints_x = c(1,3,5,7,8,6,5,4,9)
datapoints_y = c(3,6,8,9,5,8,7,4,8)
ellipse = dataEllipse(cbind(datapoints_x, datapoints_y), levels=0.95)

The output is a matrix with two columns, x and y, holding the points that define the ellipse:

head(ellipse)
#             x        y
# [1,] 12.79906 10.27685
# [2,] 12.74248 10.84304
# [3,] 12.57358 11.34255
# [4,] 12.29492 11.76781
# [5,] 11.91073 12.11238
# [6,] 11.42684 12.37102

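What I am actually interested in, though, is the lengths of the ellipse's axes and its center. Is there a way to get these without doing the PCA myself?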


1 Answer


According to ?dataEllipse, these functions are primarily plotting functions, not functions designed to return the fitted ellipse. Reading the source code of dataEllipse, however, makes it clear that the ellipse is fitted with cov.wt from the stats package. That function gives you the center and covariance matrix that determine the location and shape of the ellipse:

set.seed(144)
x <- rnorm(1000)
y <- 3*x + rnorm(1000)
(ell.info <- cov.wt(cbind(x, y)))
# $cov
#          x         y
# x 1.022985  3.142274
# y 3.142274 10.705215
# 
# $center
#           x           y 
# -0.09479274 -0.23889445 
# 
# $n.obs
# [1] 1000

The center of the ellipse is now readily available from ell.info$center. The directions of the axes are accessible as the eigenvectors of the covariance matrix (columns of eigen.info$vectors below).

(eigen.info <- eigen(ell.info$cov))
# $values
# [1] 11.63560593  0.09259443
# 
# $vectors
#           [,1]       [,2]
# [1,] 0.2839051 -0.9588524
# [2,] 0.9588524  0.2839051

Finally, you need the lengths of the axes. I'll give the length from the center to the ellipse boundary along each axis (the semi-axis length, i.e. the radius on that axis):

(lengths <- sqrt(eigen.info$values * 2 * qf(.95, 2, length(x)-1)))
# [1] 8.3620448 0.7459512

Now we can get the four endpoints of the axes of the ellipse:

ell.info$center + lengths[1] * eigen.info$vectors[,1]
#        x        y 
# 2.279234 7.779072 
ell.info$center - lengths[1] * eigen.info$vectors[,1]
#         x         y 
# -2.468820 -8.256861 
ell.info$center + lengths[2] * eigen.info$vectors[,2]
#           x           y 
# -0.81004983 -0.02711513 
ell.info$center - lengths[2] * eigen.info$vectors[,2]
#          x          y 
#  0.6204643 -0.4506738 
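
As a sanity check on the endpoints, one can verify that all four lie at the same squared Mahalanobis distance from the center, namely 2 * qf(.95, 2, n - 1). This is only a sketch of a self-consistency check on the formulas in this answer; the names `radius2`, `offsets`, `endpoints` and `d2` are mine, and the check does not by itself prove that dataEllipse draws exactly this curve.

```r
# Regenerate the example data so this block is self-contained.
set.seed(144)
x <- rnorm(1000)
y <- 3*x + rnorm(1000)
ell.info <- cov.wt(cbind(x, y))
eigen.info <- eigen(ell.info$cov)

# Squared radius used in the length formula of this answer.
radius2 <- 2 * qf(.95, 2, length(x) - 1)
lengths <- sqrt(eigen.info$values * radius2)

# Row i of offsets is lengths[i] * eigen.info$vectors[, i].
offsets <- t(eigen.info$vectors) * lengths

# Four endpoints as rows: center + offset, then center - offset.
endpoints <- rbind(
  sweep(offsets, 2, ell.info$center, "+"),
  sweep(-offsets, 2, ell.info$center, "+")
)

# mahalanobis() returns squared distances; all four should equal radius2.
d2 <- unname(mahalanobis(endpoints, ell.info$center, ell.info$cov))
all.equal(d2, rep(radius2, 4))
```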

We can confirm that these are accurate by plotting the ellipse with dataEllipse:

library(car)
dataEllipse(x, y, levels=0.95)

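
The steps above can be collected into a small helper and applied to the data from the question. This is a sketch, not a function from car or stats: `ellipse_axes` and the names of its return fields are hypothetical. The last line additionally assumes that the ellipse points are generated as center + radius * (unit circle %*% chol(cov)), starting at angle 0; that is my reading of the car source and should be treated as an assumption.

```r
# Hypothetical helper (not part of car or stats): summarises the data
# ellipse for two variables at a given level, following the steps above.
ellipse_axes <- function(x, y, level = 0.95) {
  fit <- cov.wt(cbind(x, y))
  eig <- eigen(fit$cov)
  radius <- sqrt(2 * qf(level, 2, length(x) - 1))
  list(
    center     = fit$center,
    directions = eig$vectors,               # columns: major, minor axis
    semi_axes  = radius * sqrt(eig$values), # center-to-boundary lengths
    radius     = radius,
    cov        = fit$cov
  )
}

# Data from the question.
datapoints_x <- c(1,3,5,7,8,6,5,4,9)
datapoints_y <- c(3,6,8,9,5,8,7,4,8)

res <- ellipse_axes(datapoints_x, datapoints_y)
res$center     # center of the ellipse
res$semi_axes  # semi-axis lengths, major axis first

# Assumption: the first ellipse point sits at angle 0 of a Cholesky
# parametrisation, i.e. center + radius * first row of chol(cov).
first_point <- res$center + res$radius * chol(res$cov)[1, ]
```

Under that assumption, `first_point` should come out close to the first row of head(ellipse) shown in the question, which ties the helper's center, covariance and radius back to the actual dataEllipse output.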

Answered 2015-06-14T02:39:12.650