math - How to generate a pseudo-random positive definite matrix with constraints on the off-diagonal elements?

Question

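Possible duplicate:
How to generate a pseudo-random positive definite matrix with constraints on the off-diagonal elements?

The user wants to impose a unique, non-trivial upper/lower bound on the correlation between every pair of variables in a var/covar matrix.

For example: I want a variance matrix in which all variables have 0.9 > |rho(x_i,x_j)| > 0.6, where rho(x_i,x_j) is the correlation between variables x_i and x_j.

Thanks.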

score 4 · Accepted Answer

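There are MANY issues here.

First of all, are the pseudo-random deviates assumed to be normally distributed? I'll assume they are, since any discussion of correlation matrices gets nasty if we stray into non-normal distributions.

Next, it is rather simple to generate pseudo-random normal deviates, given a covariance matrix. Generate standard normal (independent) deviates, then transform them by multiplying by the Cholesky factor of the covariance matrix. Add in the mean at the end if the mean is not zero.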

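A covariance matrix is also rather simple to generate, given a correlation matrix. Just pre- and post-multiply the correlation matrix by a diagonal matrix composed of the standard deviations. This scales a correlation matrix into a covariance matrix.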
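As a small illustration of that scaling step, here is a minimal MATLAB sketch. The correlation matrix C and the standard deviations s are made-up values chosen for the example, not taken from the question.

```matlab
% Hypothetical inputs: a valid 3x3 correlation matrix and standard deviations.
C = [1.0  0.7  0.8;
     0.7  1.0  0.75;
     0.8  0.75 1.0];
s = [2; 0.5; 3];

% Pre- and post-multiply by diag(s) to scale correlations into covariances.
S = diag(s) * C * diag(s);

% Converting back recovers C (up to rounding).
d = sqrt(diag(S));
C_back = diag(1 ./ d) * S * diag(1 ./ d);
disp(max(abs(C_back(:) - C(:))))
```

The round trip at the end is the same conversion the answer applies to S further down.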

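I'm still not sure where the problem lies in this question, since it would seem easy enough to generate a "random" correlation matrix, with elements uniformly distributed in the desired range.

So all of the above is rather trivial by any reasonable standard, and there are many tools out there to generate pseudo-random normal deviates given the above information.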

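Perhaps the issue is the user insists that the resulting random matrix of deviates must have correlations in the specified range. You must recognize that a set of random numbers will only have the desired distribution parameters in an asymptotic sense. Thus, as the sample size goes to infinity, you should expect to see the specified distribution parameters. But any small sample set will not necessarily have the desired parameters, in the desired ranges.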
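To make the asymptotic point concrete, here is a rough sketch that compares the sample correlation matrix with its target for growing sample sizes. The target matrix C, the seed, and the sample sizes are assumptions picked for illustration; corrcoef is used because it ships with core MATLAB.

```matlab
% Assumed target: a made-up 3x3 correlation matrix (unit variances).
C = [1.0 0.7 0.8; 0.7 1.0 0.75; 0.8 0.75 1.0];
R = chol(C);                     % upper triangular, R'*R equals C

rng(0);                          % fixed seed, only for reproducibility
for n = [20 2000 200000]
    X = randn(n, 3) * R;         % rows are draws with correlation C
    Chat = corrcoef(X);          % sample correlation matrix
    fprintf('n = %6d   max |Chat - C| = %.4f\n', n, max(abs(Chat(:) - C(:))));
end
```

The printed deviation should shrink as n grows, roughly like 1/sqrt(n), but for small n it can easily push a sample correlation outside a narrow target range.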

For example (in MATLAB), here is a simple positive definite 3x3 matrix. As such, it makes a very nice covariance matrix.

S = randn(3);
S = S'*S
S =
      0.78863      0.01123     -0.27879
      0.01123       4.9316       3.5732
     -0.27879       3.5732       2.7872

I'll convert S into a correlation matrix.

s = sqrt(diag(S));

C = diag(1./s)*S*diag(1./s)
C =
            1    0.0056945     -0.18804
    0.0056945            1      0.96377
     -0.18804      0.96377            1

Now, I can sample from a normal distribution using the Statistics Toolbox (mvnrnd should do the trick). Using a Cholesky factor is just as simple.

L = chol(S)
L =
      0.88805     0.012646     -0.31394
            0       2.2207       1.6108
            0            0      0.30643

Now, generate pseudo-random deviates, then transform them as desired.

X = randn(20,3)*L;

cov(X)
ans =
      0.79069     -0.14297     -0.45032
     -0.14297       6.0607       4.5459
     -0.45032       4.5459       3.6549

corr(X)
ans =
            1     -0.06531      -0.2649
     -0.06531            1      0.96587
      -0.2649      0.96587            1

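If your desire was that the correlations must ALWAYS be greater than -0.188, then this sampling technique has failed, since the numbers are pseudo-random. In fact, that goal will be a difficult one to achieve unless your sample size is large enough.

You might employ a simple rejection scheme, whereby you do the sampling, then redo it repeatedly until the sample has the desired properties, with the correlations in the desired ranges. This may get tiring.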
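A rejection scheme along those lines might look like the sketch below. The target matrix C, the bounds lo and hi, the sample size n, and the cap maxTries are all assumptions chosen for illustration.

```matlab
% Assumed setup: target correlation C and bounds on |rho| (illustrative values).
C  = [1.0 0.7 0.8; 0.7 1.0 0.75; 0.8 0.75 1.0];
lo = 0.6;  hi = 0.9;
n  = 50;                         % sample size
R  = chol(C);
mask = ~eye(3);                  % selects the off-diagonal entries only

rng(1);
maxTries = 10000;                % hypothetical cap so the loop always ends
accepted = false;
for t = 1:maxTries
    X = randn(n, 3) * R;         % draw a fresh sample
    a = abs(corrcoef(X));        % absolute sample correlations
    if all(a(mask) > lo & a(mask) < hi)
        accepted = true;         % every off-diagonal |rho| is inside the range
        break
    end
end
fprintf('accepted = %d after %d tries\n', accepted, t);
```

With only three variables the loop usually ends quickly. The number of off-diagonal entries that must land in range at the same time grows like p^2, so the acceptance rate can drop sharply for larger p.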

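An approach that might work (but one that I've not totally thought out at this point) is to use the standard scheme as above to generate a random sample. Compute the correlations. If they fail to lie in the proper ranges, then identify the perturbation one would need to make to the actual (measured) covariance matrix of your data, so that the correlations would be as desired. Now, find a zero-mean random perturbation to your sampled data that would move the sample covariance matrix in the desired direction.

This might work, but unless I knew that this is actually the question at hand, I won't bother to go any more deeply into it. (Edit: I've thought some more about this problem, and it appears to be a quadratic programming problem, with quadratic constraints, to find the smallest perturbation to a matrix X such that the resulting covariance (or correlation) matrix has the desired properties.)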

score 0 · Accepted Answer

Woodship,

"First of all, are the pseudo-random deviates assumed to be normally distributed?"

Yes.

"Perhaps the issue is the user insists that the resulting random matrix of deviates must have correlations in the specified range."

Yes, that's the whole difficulty.

"You must recognize that a set of random numbers will only have the desired distribution parameters in an asymptotic sense."

True, but this is not the problem here: your strategy works for p=2, but fails for p>2, regardless of sample size.
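One way to see why p>2 is harder (this is only an illustration, not necessarily the exact failure observed above): for p=2 any |rho| < 1 gives a valid correlation matrix, but for p=3 a symmetric matrix with unit diagonal and every |off-diagonal| inside the requested range need not be positive definite. The matrix below is a hand-picked example.

```matlab
% Illustrative 3x3 matrix: unit diagonal, every |off-diagonal| equal to 0.9.
C = [ 1.0  0.9 -0.9;
      0.9  1.0  0.9;
     -0.9  0.9  1.0];

lambda = eig(C);                 % eigenvalues of a symmetric matrix are real
[~, flag] = chol(C);             % flag > 0 means C is not positive definite

fprintf('smallest eigenvalue = %.4f, chol flag = %d\n', min(lambda), flag);
```

Flipping the sign of the second variable turns every off-diagonal entry into -0.9, so the smallest eigenvalue is 1 - 2*0.9 = -0.8, and chol cannot factor the matrix.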

"If your desire was that the correlations must ALWAYS be greater than -0.188, then this sampling technique has failed, since the numbers are pseudo-random. In fact, that goal will be a difficult one to achieve unless your sample size is large enough."

It is not a sample size issue, because with p>2 you do not even observe convergence to the right range for the correlations as the sample size grows: I tried the technique you suggest before posting here, and it is obviously flawed.

"You might employ a simple rejection scheme, whereby you do the sampling, then redo it repeatedly until the sample has the desired properties, with the correlations in the desired ranges. This may get tiring."

Not an option: for large p (say, larger than 10) this is intractable.

"Compute the correlations. If they fail to lie in the proper ranges, then identify the perturbation one would need to make to the actual (measured) covariance matrix of your data, so that the correlations would be as desired."

Ditto

As for the QP, I understand the constraints, but I'm not sure about the way you define the objective function; by using the "smallest perturbation" off some initial matrix, you will always end up getting the same (solution) matrix: all the off-diagonal entries will be exactly equal to one of the two bounds (i.e. not pseudo-random); plus it is kind of overkill, isn't it?

Come on people, there must be something simpler.

score 0 · Accepted Answer

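This is not a complete answer, but a suggestion for a possible constructive method:

Looking at the characterizations of positive definite matrices (http://en.wikipedia.org/wiki/Positive-defined_matrix), I think one of the most affordable approaches could be to use the Sylvester criterion.

You can start with a trivial 1x1 random matrix with a positive determinant and expand it by one row and one column at a time, while ensuring that the new matrix also has a positive determinant (how to achieve that is up to you ^_^).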
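A rough sketch of that incremental idea, specialised to a correlation matrix with unit diagonal, could look like this. The dimension p, the bounds lo and hi, the retry cap maxTries, and the choice to simply redraw a rejected column are all assumptions for illustration; the answer leaves that step open.

```matlab
% Assumed parameters (illustrative, not from the post).
p  = 4;  lo = 0.6;  hi = 0.9;
maxTries = 10000;                % hypothetical cap per added row/column

rng(2);
C = 1;                           % 1x1 start: determinant 1 > 0
for k = 2:p
    ok = false;
    for t = 1:maxTries
        % Candidate column: random signs, magnitudes uniform in (lo, hi).
        c = sign(randn(k-1, 1)) .* (lo + (hi - lo) * rand(k-1, 1));
        B = [C, c; c', 1];
        % The earlier leading minors are already positive, so by Sylvester's
        % criterion only the new determinant has to be checked.
        if det(B) > 0
            C = B;  ok = true;
            break
        end
    end
    if ~ok
        error('No admissible column found for k = %d', k);
    end
end
disp(C)
disp(min(eig(C)))                % positive when C is positive definite
```

Redrawing columns this way does not target any particular distribution over admissible matrices, and the retry loop may hit its cap for larger p, so treat it as a starting point rather than a finished generator.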
