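I want to use Latin Hypercube Sampling (LHS) to obtain space-filling samples.
According to the pyDOE documentation, this can be done by using the inverse cumulative distribution of the data.
How can this be extended to a multi-modal distribution with multiple dimensions? My implementation is below: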
import numpy as np
import matplotlib.pyplot as plt
from pyDOE import lhs
from sklearn.datasets import make_blobs
#some basic two dimensional, multi-modal data
X, _ = make_blobs(n_samples=1000, centers=[(-5, -5), (0,0), (-5,5), (2, -8)], n_features=2, random_state=0)
#get the Latin Hypercube samples from 2 dimensions
#treat these as probabilities
probs_of_points = lhs(2, samples=10)
#for each sampled point, get the inverse cumulative distribution using the data.
vals = []
for point_idx in range(probs_of_points.shape[0]):
    point = probs_of_points[point_idx, :]  # each sampled point
    # get the value from the data for each dimension independently
    vals_ = [np.quantile(X[:, x], point[x]) for x in range(point.ravel().shape[0])]
    vals.append(vals_)
vals = np.array(vals)
#plot data with sampled points
plt.scatter(X[:,0], X[:,1])
plt.scatter(vals[:,0], vals[:,1])
plt.show()
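For reference, the same idea can be written more compactly. The following is only a sketch under stated assumptions: `latin_hypercube` and `map_to_marginals` are hypothetical helper names (not part of pyDOE), the LHS is generated with plain NumPy instead of `lhs`, and the data is a NumPy stand-in for the `make_blobs` output. Like the loop above, it maps each dimension through its own empirical quantile function, so it treats the marginals independently and does not use the joint (multi-modal) structure, which is exactly the part I am unsure about.

```python
import numpy as np

def latin_hypercube(n_samples, n_dims, rng):
    """Plain NumPy LHS on the unit cube: one point per stratum in every dimension."""
    # stratum index for each sample, shuffled independently per dimension
    strata = np.column_stack([rng.permutation(n_samples) for _ in range(n_dims)])
    # jitter each point uniformly inside its stratum, then scale to [0, 1)
    return (strata + rng.random((n_samples, n_dims))) / n_samples

def map_to_marginals(probs, data):
    """Map column j of probs through the empirical quantile function of column j of data."""
    return np.column_stack(
        [np.quantile(data[:, j], probs[:, j]) for j in range(data.shape[1])]
    )

rng = np.random.default_rng(0)

# hypothetical stand-in for the make_blobs data: four unit-variance Gaussian blobs
centers = np.array([(-5, -5), (0, 0), (-5, 5), (2, -8)])
data = centers[rng.integers(0, len(centers), size=1000)] + rng.normal(size=(1000, 2))

probs = latin_hypercube(10, 2, rng)     # LHS points, treated as probabilities
samples = map_to_marginals(probs, data)  # per-dimension inverse empirical CDF
```

Because the mapping is done per dimension, a sampled point can combine an x-value from one blob with a y-value from another and land in a region where the data has no mass.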
Questions
- Is this the correct way to implement LHS for this data?
- Is there another way to get space-filling samples in Python?